% Source: ChengsongTanPhdThesis/Chapters/Bitcoded2.tex
% Author: Chengsong; changeset 668 (3831621d7b14), Wed, 23 Aug 2023 03:02:31 +0100
% Chapter Template

% Main chapter title
\chapter{Correctness of Bit-coded Algorithm with Simplification}

\label{Bitcoded2} % Change X to a consecutive number; for referencing this chapter elsewhere, use \ref{ChapterX}
%Then we illustrate how the algorithm without bitcodes falls short for such aggressive 
%simplifications and therefore introduce our version of the bitcoded algorithm and 
%its correctness proof in 
%Chapter 3\ref{Chapter3}. 
%\section{Overview}
\marginpar{\em Added a completely new \\overview section, \\highlighting\\ contributions.}

This chapter
is where the novel contributions of this PhD project are first introduced
in detail. 
The material in the
previous
chapters is necessary for this thesis
because it provides the context for why we need a new framework for
the proof of $\blexersimp$.

We first explain why aggressive simplifications are needed, after which we
present our algorithm, contrasting it with Sulzmann and Lu's simplifications.
We then explain how our simplifications make
reusing $\blexer$'s correctness proof impossible.
%with some minor modifications
We discuss possible fixes such as rectification functions and then introduce our proof, 
which involves a weaker inductive
invariant than that used in the correctness proof of $\blexer$.

\marginpar{Shortened overview.}
%material for setting the scene of the formal proof we
%are about to describe.
\section{Simplifications by Sulzmann and Lu}
\marginpar{moved \\simplification \\section to front \\to make coherent\\ sense.}
The algorithms $\lexer$ and $\blexer$ work beautifully as functional 
programs, but not as practical code. One main reason for the slowness is
the size of the intermediate representations: the derivative regular expressions
tend to grow without bound if the matching involves a large number of possible matches.
Consider the derivatives of the following example $(a^*a^*)^*$:
%and $(a^* + (aa)^*)^*$:
\begin{center}
	\begin{tabular}{lcl}
		$(a^*a^*)^*$ & $ \stackrel{\backslash a}{\longrightarrow}$ & 
		$ (a^*a^* + a^*)\cdot(a^*a^*)^*$\\
			     & 
		$ \stackrel{\backslash a}{\longrightarrow} $ & 
	$((a^*a^* + a^*) + a^*)\cdot(a^*a^*)^* + (a^*a^* + a^*)\cdot(a^*a^*)^*$\\
			     & $\stackrel{\backslash a}{\longrightarrow} $ & $\ldots$\\
	\end{tabular}
\end{center}
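\noindent
As a sketch of where the first step in this chain comes from, the first derivative can be unfolded by hand.
The unfolding assumes the standard derivative rules for sequences and stars from the earlier chapters;
bitcodes are omitted, and the last line silently drops the factors $\ONE$, which is why it is written with $\approx$ rather than $=$.
The sequence case contributes two alternatives because $a^*$ is nullable:
\begin{center}
	\begin{tabular}{lcl}
		$(a^*a^*)^*\backslash a$ & $=$ & $((a^*a^*)\backslash a)\cdot(a^*a^*)^*$\\
		& $=$ & $((a^*\backslash a)\cdot a^* + a^*\backslash a)\cdot(a^*a^*)^*$\\
		& $=$ & $((\ONE\cdot a^*)\cdot a^* + \ONE\cdot a^*)\cdot(a^*a^*)^*$\\
		& $\approx$ & $(a^*a^* + a^*)\cdot(a^*a^*)^*$
	\end{tabular}
\end{center}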
\noindent
Already in the second derivative, several duplicate sub-expressions 
need to be eliminated (possible
bitcodes are omitted to make the presentation more concise
because they are not the key part of the simplifications).
A simple-minded simplification function cannot simplify
the third regular expression in the above chain of derivative
regular expressions, namely
\begin{center}
$((a^*a^* + a^*) + a^*)\cdot(a^*a^*)^* + (a^*a^* + a^*)\cdot(a^*a^*)^*$
\end{center}
because the duplicates are
not next to each other, and therefore the rule
$r+ r \rightarrow r$ from $\textit{simp}$ does not fire.
One would expect a better simplification function to work in the 
following way:
\begin{gather*}
	((a^*a^* + \underbrace{a^*}_\text{A})+\underbrace{a^*}_\text{duplicate of A})\cdot(a^*a^*)^* + 
	\underbrace{(a^*a^* + a^*)\cdot(a^*a^*)^*}_\text{further simp removes this}\\
	\bigg\downarrow (1) \\
	(a^*a^* + a^* 
	\color{gray} + a^* \color{black})\cdot(a^*a^*)^* + 
	\underbrace{(a^*a^* + a^*)\cdot(a^*a^*)^*}_\text{further simp removes this} \\
	\bigg\downarrow (2) \\
	(a^*a^* + a^* 
	)\cdot(a^*a^*)^*  
	\color{gray} + (a^*a^* + a^*) \cdot(a^*a^*)^*\\
	\bigg\downarrow (3) \\
	(a^*a^* + a^* 
	)\cdot(a^*a^*)^*  
\end{gather*}
\noindent
In the first step, the nested alternative regular expression
$(a^*a^* + a^*) + a^*$ is flattened into $a^*a^* + a^* + a^*$.
Now the third term $a^*$ can clearly be identified as a duplicate
and is therefore removed in the second step. 
This causes the two
top-level terms to become the same, and the second $(a^*a^*+a^*)\cdot(a^*a^*)^*$ 
is removed in the final step.
Sulzmann and Lu's simplification function (in our notation) is designed to achieve this
kind of simplification:
\begin{center}
	\begin{tabular}{lcl}
		$\textit{simp}\_{SL} \; _{bs}(_{bs'}\ONE \cdot r)$ & $\dn$ & 
		$\textit{if} \; (\textit{zeroable} \; r)\; \textit{then} \;\; \ZERO$\\
						   & &$\textit{else}\;\; \fuse \; (bs@ bs') \; r$\\
		$\textit{simp}\_{SL} \;(_{bs}r_1\cdot r_2)$ & $\dn$ & $\textit{if} 
		\; (\textit{zeroable} \; r_1 \; \textit{or} \; \textit{zeroable}\; r_2)\;
		\textit{then} \;\; \ZERO$\\
							    & & $\textit{else}\;\;_{bs}((\textit{simp}\_{SL} \;r_1)\cdot
							    (\textit{simp}\_{SL} \; r_2))$\\
		$\textit{simp}\_{SL}  \; _{bs}\sum []$ & $\dn$ & $\ZERO$\\
		$\textit{simp}\_{SL}  \; _{bs}\sum ((_{bs'}\sum rs_1) :: rs_2)$ & $\dn$ &
		$_{bs}\sum ((\map \; (\fuse \; bs')\; rs_1) @ rs_2)$\\
		$\textit{simp}\_{SL}  \; _{bs}\sum[r]$ & $\dn$ & $\fuse \; bs \; (\textit{simp}\_{SL}  \; r)$\\
		$\textit{simp}\_{SL}  \; _{bs}\sum(r::rs)$ & $\dn$ & $_{bs}\sum 
		(\nub \; (\filter \; (\neg\zeroable)\;((\textit{simp}\_{SL}  \; r) :: \map \; \textit{simp}\_{SL}  \; rs)))$\\ 
	\end{tabular}
\end{center}
\noindent
The $\textit{zeroable}$ predicate 
tests whether the regular expression
is equivalent to $\ZERO$, and
can be defined as:
\begin{center}
	\begin{tabular}{lcl}
		$\zeroable \; _{bs}\sum (r::rs)$ & $\dn$ & $\zeroable \; r\;\; \land \;\;
		\zeroable \;_{[]}\sum\;rs $\\
		$\zeroable\;_{bs}(r_1 \cdot r_2)$ & $\dn$ & $\zeroable\; r_1 \;\; \lor \;\; \zeroable \; r_2$\\
		$\zeroable\;_{bs}r^*$ & $\dn$ & $\textit{false}$ \\
		$\zeroable\;_{bs}c$ & $\dn$ & $\textit{false}$\\
		$\zeroable\;_{bs}\ONE$ & $\dn$ & $\textit{false}$\\
		$\zeroable\;_{bs}\ZERO$ & $\dn$ & $\textit{true}$
	\end{tabular}
\end{center}
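\noindent
As a small worked example, the predicate can be unfolded on an alternative whose branches are all equivalent to $\ZERO$.
Bitcodes on the leaves are omitted. The unfolding also assumes that the empty alternative $_{[]}\sum []$ counts as zeroable;
this case is not listed among the clauses above and is an assumption made here only to complete the recursion:
\begin{center}
	\begin{tabular}{lcl}
		$\zeroable\;_{[]}\sum[\ZERO\cdot a,\;\ZERO]$ & $=$ & $\zeroable\;(\ZERO\cdot a)\;\land\;\zeroable\;_{[]}\sum[\ZERO]$\\
		& $=$ & $(\zeroable\;\ZERO\;\lor\;\zeroable\;a)\;\land\;(\zeroable\;\ZERO\;\land\;\zeroable\;_{[]}\sum[])$\\
		& $=$ & $(\textit{true}\;\lor\;\textit{false})\;\land\;(\textit{true}\;\land\;\textit{true})$\\
		& $=$ & $\textit{true}$
	\end{tabular}
\end{center}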
\noindent
The 
\begin{center}
	\begin{tabular}{lcl}
		$\textit{simp}\_{SL}  \; _{bs}\sum ((_{bs'}\sum rs_1) :: rs_2)$ & $\dn$ &
		$_{bs}\sum ((\map \; (\fuse \; bs')\; rs_1) @ rs_2)$\\
	\end{tabular}
\end{center}
\noindent
clause does flatten the alternative as required in step (1),
but $\textit{simp}\_{SL}$ is insufficient if we want to do steps (2) and (3),
as these ``identical'' terms have different bit-annotations.
Sulzmann and Lu also suggested that the $\textit{simp}\_{SL} $ function should be
applied repeatedly until a fixed point is reached.
We call this construction $\textit{SLSimp}$:
\begin{center}
	\begin{tabular}{lcl}
		$\textit{SLSimp} \; r$ & $\dn$ & 
		$\textit{while}((\textit{simp}\_{SL}  \; r)\; \cancel{=} \; r)$ \\
					 & & $\quad r := \textit{simp}\_{SL}  \; r$\\
		& & $\textit{return} \; r$
	\end{tabular}
\end{center}
We call the operation of alternately 
applying derivatives and simplifications
(until the string is exhausted) the Sulz-simp-derivative,
written $\backslash_{SLSimp}$:
\begin{center}
\begin{tabular}{lcl}
	$r \backslash_{SLSimp} (c\!::\!s) $ & $\dn$ & $(\textit{SLSimp} \; (r \backslash c)) \backslash_{SLSimp}\, s$ \\
$r \backslash_{SLSimp} [\,] $ & $\dn$ & $r$
\end{tabular}
\end{center}
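\noindent
For instance, unfolding the two clauses above for the two-character string $aa$ gives the following chain.
This is only a worked instance of the definition; nothing beyond the two clauses is assumed:
\begin{center}
\begin{tabular}{lcl}
	$r \backslash_{SLSimp} (a\!::\!a\!::\![\,])$ & $=$ & $(\textit{SLSimp} \; (r \backslash a)) \backslash_{SLSimp}\, (a\!::\![\,])$ \\
	& $=$ & $(\textit{SLSimp} \; ((\textit{SLSimp} \; (r \backslash a)) \backslash a)) \backslash_{SLSimp}\, [\,]$ \\
	& $=$ & $\textit{SLSimp} \; ((\textit{SLSimp} \; (r \backslash a)) \backslash a)$
\end{tabular}
\end{center}
\noindent
The fixed-point iteration $\textit{SLSimp}$ therefore runs once after every single-character derivative.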
\noindent
After the derivatives have been taken, the bitcodes
are extracted and decoded in the same manner
as in $\blexer$:
\begin{center}
\begin{tabular}{lcl}
  $\textit{blexer\_SLSimp}\;r\,s$ & $\dn$ &
      $\textit{let}\;a = (r^\uparrow)\backslash_{SLSimp}\, s\;\textit{in}$\\                
  & & $\;\;\textit{if}\; \textit{bnullable}(a)$\\
  & & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\
  & & $\;\;\textit{else}\;\textit{None}$
\end{tabular}
\end{center}
\noindent
We implemented this lexing algorithm in Scala, 
and found that the size of the final derivative regular expression
still grows exponentially (note the logarithmic scale):
\begin{figure}[H]
	\centering
\begin{tikzpicture}
\begin{axis}[
    xlabel={$n$},
    ylabel={size},
    ymode = log,
    legend entries={Final Derivative Size},  
    legend pos=north west,
    legend cell align=left]
\addplot[red,mark=*, mark options={fill=white}] table {SulzmannLuLexer.data};
\end{axis}
\end{tikzpicture} 
\caption{Size of the final derivative when lexing the regular expression $(a^*a^*)^*$ against strings of the form
$\protect\underbrace{aa\ldots a}_\text{$n$ \textit{a}s}
$ using Sulzmann and Lu's lexer}\label{SulzmannLuLexer}
\end{figure}
\noindent
At $n = 20$ we already get an out-of-memory error with the default 
JVM heap size settings.
In fact, their simplification does not improve much over
the simple-minded simplifications we have shown in Figure~\ref{fig:BetterWaterloo}.
The time required also grows exponentially:
\begin{figure}[H]
	\centering
\begin{tikzpicture}
\begin{axis}[
    xlabel={$n$},
    ylabel={time},
    %ymode = log,
    legend entries={time in secs},  
    legend pos=north west,
    legend cell align=left]
\addplot[red,mark=*, mark options={fill=white}] table {SulzmannLuLexerTime.data};
\end{axis}
\end{tikzpicture} 
\caption{Time needed for lexing the regular expression $(a^*a^*)^*$ against strings of the form
$\protect\underbrace{aa\ldots a}_\text{$n$ \textit{a}s}
$ using Sulzmann and Lu's lexer}\label{SulzmannLuLexerTime}
\end{figure}
\noindent
This seems to be a counterexample to 
Sulzmann and Lu's linear complexity claim
in their paper \cite{Sulzmann2014}:
\begin{quote}\it
``Linear-Time Complexity Claim \\It is easy to see that each call of one of the functions/operations:
simp, fuse, mkEpsBC and isPhi leads to subcalls whose number is bound by the size of the regular expression involved. We claim that thanks to aggressively applying simp this size remains finite. Hence, we can argue that the above mentioned functions/operations have constant time complexity which implies that we can incrementally compute bit-coded parse trees in linear time in the size of the input.'' 
\end{quote}
\noindent
The assumption that the size of the regular expressions
in the algorithm
would stay below a finite constant is not true, at least not in the
examples we considered.
The main reasons for this are that (i) Haskell's $\textit{nub}$
function requires identical annotations for two 
annotated regular expressions to qualify as duplicates,
and therefore cannot simplify cases like $_{SZZ}a^*+_{SZS}a^*$
even though both $a^*$ denote the same language, and
(ii) the ``flattening'' only applies to the head of the list
in the 
\begin{center}
	\begin{tabular}{lcl}
		$\textit{simp}\_{SL}  \; _{bs}\sum ((_{bs'}\sum rs_1) :: rs_2)$ & $\dn$ &
		$_{bs}\sum ((\map \; (\fuse \; bs')\; rs_1) @ rs_2)$\\
	\end{tabular}
\end{center}
\noindent
clause, and is therefore not strong enough to simplify all
needed parts of the regular expression. Moreover,
the $\textit{simp}\_{SL}$ function is applied repeatedly
in each derivative step until a fixed point is reached, 
which makes the algorithm even more
unpredictable and inefficient.
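To illustrate points (i) and (ii), consider the following two sketches.
They assume that the clauses of $\textit{simp}\_{SL}$ are tried from top to bottom and that
$\textit{simp}\_{SL}$ leaves characters and stars unchanged (no clause for them is shown above);
both are assumptions made for this illustration, and bitcodes on characters are omitted.
For (i), the two stars differ only in their annotations, so $\nub$ keeps both:
\begin{center}
	\begin{tabular}{lcl}
		$\textit{simp}\_{SL}\;(_{[]}\sum[_{SZZ}a^*,\;_{SZS}a^*])$ & $=$ &
		$_{[]}\sum(\nub\;(\filter\;(\neg\zeroable)\;[_{SZZ}a^*,\;_{SZS}a^*]))$\\
		& $=$ & $_{[]}\sum(\nub\;[_{SZZ}a^*,\;_{SZS}a^*])$\\
		& $=$ & $_{[]}\sum[_{SZZ}a^*,\;_{SZS}a^*]$
	\end{tabular}
\end{center}
\noindent
For (ii), the nested alternative is the second element of the list, so the flattening clause does not match
and the last clause is used instead:
\begin{center}
	\begin{tabular}{lcl}
		$\textit{simp}\_{SL}\;(_{bs}\sum[a,\;_{bs'}\sum[b,\,c]])$ & $=$ &
		$_{bs}\sum(\nub\;(\filter\;(\neg\zeroable)\;[a,\;\textit{simp}\_{SL}\;(_{bs'}\sum[b,\,c])]))$\\
		& $=$ & $_{bs}\sum[a,\;_{bs'}\sum[b,\,c]]$
	\end{tabular}
\end{center}
\noindent
In both sketches the result equals the input, so iterating $\textit{simp}\_{SL}$ to a fixed point does not help either.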
%To not get ``caught off guard'' by
%these counterexamples,
%one needs to be more careful when designing the
%simplification function and making claims about them.



\section{Our $\textit{Simp}$ Function}
639
80cc6dc4c98b until chap 7
Chengsong
parents: 624
diff changeset
   262
We will now introduce our own simplification function.
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 601
diff changeset
   263
%by making a contrast with $\textit{simp}\_{SL}$.
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 601
diff changeset
   264
We also describe
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 601
diff changeset
   265
the ideas behind Sulzmann and Lu's $\textit{simp}\_{SL}$
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 601
diff changeset
   266
algorithm 
639
80cc6dc4c98b until chap 7
Chengsong
parents: 624
diff changeset
   267
and why it fails to achieve the desired effect of keeping the sizes of derivatives finitely bounded. 
80cc6dc4c98b until chap 7
Chengsong
parents: 624
diff changeset
   268
In addition, our simplification function will come with a formal
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 601
diff changeset
   269
correctness proof.
585
4969ef817d92 chap4 more
Chengsong
parents: 584
diff changeset
   270
\subsection{Flattening Nested Alternatives}
The idea behind the clause
\begin{center}
	$\textit{simp}\_{SL}  \; _{bs}\sum ((_{bs'}\sum rs_1) :: rs_2) \quad \dn \quad
	       _{bs}\sum ((\map \; (\fuse \; bs')\; rs_1) @ rs_2)$
\end{center}
is that it allows
duplicate removal of regular expressions at different
``levels'' of alternatives.
For example, this would help with the
following simplification:

\begin{center}
$(a+r)+r \longrightarrow a+r$
\end{center}
The problem is that only the head element
is ``spilled out''.
It is more desirable
to flatten
an entire list to open up possibilities for further simplifications
with later regular expressions.
Not flattening the rest of the elements also means that
the later de-duplication process
does not fully remove further duplicates.
For example,
using $\textit{simp}\_{SL}$ we cannot
simplify
\begin{center}
	$((a^* a^*)+\underline{(a^* + a^*)})\cdot (a^*a^*)^*+
((a^*a^*)+a^*)\cdot (a^*a^*)^*$
\end{center}
due to the underlined part not being the head
of the alternative.

We define our flatten operation so that it flattens
the entire list:
 \begin{center}
  \begin{tabular}{@{}lcl@{}}
  $\textit{flts} \; [\,]$ & $\dn$ & $[\,]$ \\
  $\textit{flts} \; (_{bs}\sum \textit{as}) :: \textit{as'}$ & $\dn$ & $(\textit{map} \;
     (\textit{fuse}\;bs)\; \textit{as}) \; @ \; \textit{flts} \; \textit{as'} $ \\
  $\textit{flts} \; \ZERO :: \textit{as'}$ & $\dn$ & $ \textit{flts} \;  \textit{as'} $ \\
    $\textit{flts} \; a :: \textit{as'}$ & $\dn$ & $a :: \textit{flts} \; \textit{as'}$ \quad(otherwise)
\end{tabular}
\end{center}
\noindent
Our $\flts$ operation
also throws away $\ZERO$s
as they do not contribute to a lexing result.
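
As a small illustration, the following sketch unfolds $\textit{flts}$ on a hypothetical four-element list.
The names $a_1, \ldots, a_4$, $bs_1$ and $bs_2$ are placeholders chosen only for this example;
we assume that $a_3$ is neither $\ZERO$ nor an alternative, and that $\textit{flts}$ returns the empty list on the empty list:
\begin{center}
\begin{tabular}{@{}lcl@{}}
$\textit{flts} \; [_{bs_1}\sum [a_1, a_2],\; \ZERO,\; a_3,\; _{bs_2}\sum [a_4]]$
  & $=$ & $[\textit{fuse}\;bs_1\;a_1,\; \textit{fuse}\;bs_1\;a_2] \; @ \; \textit{flts} \; [\ZERO,\; a_3,\; _{bs_2}\sum [a_4]]$ \\
  & $=$ & $[\textit{fuse}\;bs_1\;a_1,\; \textit{fuse}\;bs_1\;a_2] \; @ \; \textit{flts} \; [a_3,\; _{bs_2}\sum [a_4]]$ \\
  & $=$ & $[\textit{fuse}\;bs_1\;a_1,\; \textit{fuse}\;bs_1\;a_2] \; @ \; (a_3 :: \textit{flts} \; [_{bs_2}\sum [a_4]])$ \\
  & $=$ & $[\textit{fuse}\;bs_1\;a_1,\; \textit{fuse}\;bs_1\;a_2,\; a_3,\; \textit{fuse}\;bs_2\;a_4]$
\end{tabular}
\end{center}
\noindent
Unlike the head-only clause of $\textit{simp}\_{SL}$, the nested alternative in the last position is opened up as well.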
\subsection{Duplicate Removal}
After flattening is done, we can deduplicate.
The de-duplication function is called $\distinctBy$,
and that is where we make our second improvement over
Sulzmann and Lu's simplification method.
The process goes as follows:
\begin{center}
$rs \stackrel{\textit{flts}}{\longrightarrow}
rs_{flat}
\xrightarrow{\distinctBy \;
rs_{flat} \; \rerases\; \varnothing} rs_{distinct}$
%\stackrel{\distinctBy \;
%rs_{flat} \; \erase\; \varnothing}{\longrightarrow} \; rs_{distinct}$
\end{center}
where the $\distinctBy$ function is defined as:
\begin{center}
	\begin{tabular}{@{}lcl@{}}
		$\distinctBy \; [] \; f\; acc $ & $ =$ & $ []$\\
		$\distinctBy \; (x :: xs) \; f \; acc$ & $=$ & $\quad \textit{if} \; (f \; x \in acc)\;\; \textit{then} \;\; \distinctBy \; xs \; f \; acc$\\
						       & & $\quad \textit{else}\;\; x :: (\distinctBy \; xs \; f \; (\{f \; x\} \cup acc))$
	\end{tabular}
\end{center}
\noindent
We define the distinct function under a mapping $f$ because
we want to eliminate regular expressions that are syntactically the same,
but have different bitcodes.
For example, we can remove the second $a^*a^*$ from
$_{ZSZ}a^*a^* + _{SZZ}a^*a^*$, because it
represents a match with a shorter initial sub-match
(and therefore is definitely not POSIX),
and would otherwise be discarded by $\bmkeps$ later.
\begin{center}
	$_{ZSZ}\underbrace{a^*}_{ZS:\; match \; 1\; times\quad}\underbrace{a^*}_{Z: \;match\; 1 \;times} +
	_{SZZ}\underbrace{a^*}_{S: \; match \; 0 \; times\quad}\underbrace{a^*}_{ZZ: \; match \; 2 \; times}
	$
\end{center}
%$_{bs1} r_1 + _{bs2} r_2 \text{where} (r_1)_{\downarrow} = (r_2)_{\downarrow}$
Due to the way our algorithm works,
the matches that conform to the POSIX standard
will always be placed further to the left. When we
traverse the list from left to right,
a later copy of a regular expression we have already seen
will definitely not contribute to a POSIX value,
even if it carries a different bitcode.
These duplicates therefore need to be removed.
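
To illustrate how $\distinctBy$ keeps only the leftmost occurrence, consider the following sketch.
The list $[x_1, x_2, x_3]$ and the values $r$, $r'$ are hypothetical and chosen only for this example:
we assume $f \; x_1 = f \; x_3 = r$ and $f \; x_2 = r'$ with $r \neq r'$.
Unfolding the definition gives
\begin{center}
\begin{tabular}{@{}lcl@{}}
$\distinctBy \; [x_1, x_2, x_3] \; f \; \varnothing$
  & $=$ & $x_1 :: (\distinctBy \; [x_2, x_3] \; f \; \{r\})$ \\
  & $=$ & $x_1 :: x_2 :: (\distinctBy \; [x_3] \; f \; \{r', r\})$ \\
  & $=$ & $x_1 :: x_2 :: (\distinctBy \; [] \; f \; \{r', r\})$ \\
  & $=$ & $[x_1, x_2]$
\end{tabular}
\end{center}
\noindent
The element $x_3$ is dropped in the third step because $f \; x_3 = r$ is already in the accumulator,
whereas $x_1$, which is further to the left, is kept.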

To achieve this, we use $\rerases$ as the function $f$ during the de-duplication
operation. The function
$\rerases$ is very similar to $\erase$, except that it preserves the structure
when erasing an alternative regular expression.
We use $\rerases$ instead of $\erase$ because
it keeps the list structure of alternative
annotated regular expressions,
whereas $\erase$ would turn them back into a binary tree structure.
Not having to change the structure
greatly simplifies the finiteness proof in chapter
\ref{Finite}.
We give the definitions of $\rerases$ here together with
the new datatype used by $\rerases$ (as our plain
regular expression datatype does not allow non-binary alternatives).
For now we can think of
$\rerases$ as the function erase ($(\_)_\downarrow$) defined in chapter \ref{Bitcoded1}
and $\rrexp$ as plain regular expressions, but having a general list constructor
for alternatives:
\begin{figure}[H]
\begin{center}
	$\rrexp ::=   \RZERO \mid  \RONE
			 \mid  \RCHAR{c}
			 \mid  \RSEQ{r_1}{r_2}
			 \mid  \RALTS{rs}
			 \mid \RSTAR{r}        $
\end{center}
\caption{$\rrexp$: plain regular expressions, but with $\sum$ alternative
constructor}\label{rrexpDef}
\end{figure}
The function $\rerases$ is defined as follows:
\begin{center}
\begin{tabular}{lcl}
$\rerase{\ZERO}$ & $\dn$ & $\RZERO$\\
$\rerase{_{bs}\ONE}$ & $\dn$ & $\RONE$\\
	$\rerase{_{bs}\mathbf{c}}$ & $\dn$ & $\RCHAR{c}$\\
$\rerase{_{bs}a_1\cdot a_2}$ & $\dn$ & $\RSEQ{\rerase{a_1}}{\rerase{a_2}}$\\
$\rerase{_{bs}\sum as}$ & $\dn$ & $\RALTS{\map \; \rerase{\_} \; as}$\\
$\rerase{_{bs} a ^*}$ & $\dn$ & $\RSTAR{\rerase{a}}$
\end{tabular}
\end{center}
We will provide more details in \ref{whyRerase} on why a new
erase function and a new datatype are needed. Briefly,
they are needed for backward compatibility with $\blexer$'s correctness proof and
because of the path we (naturally) took during our proof engineering of the finiteness property.
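
As a sketch of how $\rerases$ behaves on alternatives, take two annotated alternatives that differ only in their bitcodes.
The bitcodes $bs$, $bs'$ and $bs_1, \ldots, bs_6$ below are arbitrary placeholders introduced only for this example:
\begin{center}
\begin{tabular}{lcl}
$\rerase{_{bs}\sum [_{bs_1}\mathbf{a},\; _{bs_2}\mathbf{b},\; _{bs_3}\mathbf{c}]}$
  & $=$ & $\RALTS{[\RCHAR{a},\; \RCHAR{b},\; \RCHAR{c}]}$ \\
$\rerase{_{bs'}\sum [_{bs_4}\mathbf{a},\; _{bs_5}\mathbf{b},\; _{bs_6}\mathbf{c}]}$
  & $=$ & $\RALTS{[\RCHAR{a},\; \RCHAR{b},\; \RCHAR{c}]}$
\end{tabular}
\end{center}
\noindent
Both results are the same three-element list alternative, so $\distinctBy$ with $\rerases$ as the mapping
treats the two annotated regular expressions as duplicates and keeps only the leftmost one.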

\subsection{Putting Things Together}
We can now give the definition of our simplification function:
%that looks somewhat similar to our Scala code is
\begin{center}
  \begin{tabular}{@{}lcl@{}}
	  $\textit{bsimp} \; (_{bs}a_1\cdot a_2)$ & $\dn$ & $ \textit{bsimp}_{ASEQ} \; bs \;(\textit{bsimp} \; a_1) \; (\textit{bsimp}  \; a_2)  $ \\
	  $\textit{bsimp} \; (_{bs}\sum \textit{as})$ & $\dn$ & $\textit{bsimp}_{ALTS} \; \textit{bs} \; (\distinctBy \; ( \flts \; ( \textit{map} \; \textit{bsimp} \; \textit{as})) \; \rerases \; \varnothing) $ \\
   $\textit{bsimp} \; a$ & $\dn$ & $\textit{a} \qquad \textit{otherwise}$
\end{tabular}
\end{center}

\noindent
The simplification function (named $\textit{bsimp}$ for \emph{b}it-coded)
pattern-matches on the regular expression.
When it detects that the regular expression is an alternative or
sequence, it simplifies the child regular expressions
recursively and then checks whether one of the children has turned into $\ZERO$ or
$\ONE$, which might trigger further simplification at the current level.
For sequences, the current-level simplifications are handled by the function $\textit{bsimp}_{ASEQ}$,
using rules such as  $\ZERO \cdot r \rightarrow \ZERO$ and $\ONE \cdot r \rightarrow r$.
\begin{center}
	\begin{tabular}{@{}lcl@{}}
		$\textit{bsimp}_{ASEQ} \; bs\; a \; b$ & $\dn$ & $ (a,\; b) \; \textit{match}$\\
   &&$\quad\textit{case} \; (\ZERO, \_) \Rightarrow  \ZERO$ \\
   &&$\quad\textit{case} \; (\_, \ZERO) \Rightarrow  \ZERO$ \\
   &&$\quad\textit{case} \;  (_{bs_1}\ONE, a_2') \Rightarrow  \textit{fuse} \; (bs@bs_1) \;  a_2'$ \\
   &&$\quad\textit{case} \; (a_1', a_2') \Rightarrow   _{bs}a_1' \cdot a_2'$
	\end{tabular}
\end{center}
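
For instance, the following sketch traces the $\ONE$ case on a hypothetical input, where the bitcodes $bs$, $bs_1$, $bs_2$ are arbitrary placeholders.
It relies on the fact that $\textit{fuse} \; bs \; a$ prepends $bs$ to the top-level bitcode of $a$:
\begin{center}
\begin{tabular}{@{}lcl@{}}
$\textit{bsimp} \; (_{bs}(_{bs_1}\ONE \cdot {}_{bs_2}\mathbf{c}))$
  & $=$ & $\textit{bsimp}_{ASEQ} \; bs \; (\textit{bsimp} \; (_{bs_1}\ONE)) \; (\textit{bsimp} \; (_{bs_2}\mathbf{c}))$ \\
  & $=$ & $\textit{bsimp}_{ASEQ} \; bs \; (_{bs_1}\ONE) \; (_{bs_2}\mathbf{c})$ \\
  & $=$ & $\textit{fuse} \; (bs@bs_1) \; (_{bs_2}\mathbf{c})$ \\
  & $=$ & $_{bs@bs_1@bs_2}\mathbf{c}$
\end{tabular}
\end{center}
\noindent
The sequence constructor disappears, but no bits are lost: the bitcodes of the removed constructors are fused onto the remaining character.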
\noindent
The most involved part is the $\sum$ clause, where we first call $\flts$ on
the list of simplified child regular expressions $\textit{map}\; \textit{bsimp}\; \textit{as}$,
and then call $\distinctBy$ on the resulting list. The criterion used by $\distinctBy$ for determining whether two
elements are the same is $\rerases \; r_1 = \rerases\; r_2$.
Finally, depending on whether the regular expression list $as'$ has turned into a
singleton or empty list after $\flts$ and $\distinctBy$, $\textit{bsimp}_{ALTS}$
decides whether to keep the current-level constructor $\sum$ as it is, or
to remove it when there are fewer than two elements:
\begin{center}
	\begin{tabular}{lcl}
		$\textit{bsimp}_{ALTS} \; bs \; as'$ & $ \dn$ & $ as' \; \textit{match}$\\
  &&$\quad\textit{case} \; [] \Rightarrow  \ZERO$ \\
   &&$\quad\textit{case} \; a :: [] \Rightarrow  \textit{fuse} \; bs \; a$ \\
   &&$\quad\textit{case} \;  as' \Rightarrow _{bs}\sum \textit{as'}$\\
	\end{tabular}
\end{center}
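
The following sketch runs the whole $\sum$ clause on a small hypothetical input; the bitcodes $bs$, $bs_1$, $bs_2$ are arbitrary placeholders chosen for illustration.
The children are already simplified, so $\textit{map} \; \textit{bsimp}$ leaves the list unchanged:
\begin{center}
\begin{tabular}{@{}lcl@{}}
$\textit{map} \; \textit{bsimp} \; [_{bs_1}\mathbf{a},\; \ZERO,\; _{bs_2}\mathbf{a}]$
  & $=$ & $[_{bs_1}\mathbf{a},\; \ZERO,\; _{bs_2}\mathbf{a}]$ \\
$\textit{flts} \; [_{bs_1}\mathbf{a},\; \ZERO,\; _{bs_2}\mathbf{a}]$
  & $=$ & $[_{bs_1}\mathbf{a},\; _{bs_2}\mathbf{a}]$ \\
$\distinctBy \; [_{bs_1}\mathbf{a},\; _{bs_2}\mathbf{a}] \; \rerases \; \varnothing$
  & $=$ & $[_{bs_1}\mathbf{a}]$ \\
$\textit{bsimp}_{ALTS} \; bs \; [_{bs_1}\mathbf{a}]$
  & $=$ & $\textit{fuse} \; bs \; (_{bs_1}\mathbf{a}) \; = \; _{bs@bs_1}\mathbf{a}$
\end{tabular}
\end{center}
\noindent
Hence $\textit{bsimp} \; (_{bs}\sum [_{bs_1}\mathbf{a},\; \ZERO,\; _{bs_2}\mathbf{a}]) = {}_{bs@bs_1}\mathbf{a}$.
The second copy of $\mathbf{a}$ is removed because $\rerase{_{bs_1}\mathbf{a}} = \rerase{_{bs_2}\mathbf{a}} = \RCHAR{a}$,
and the remaining singleton list causes the $\sum$ constructor to be dropped.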
Having defined the $\textit{bsimp}$ function,
we add it as a phase after a derivative is taken.
\begin{center}
	\begin{tabular}{lcl}
		$a \backslash_{bsimp} c$ & $\dn$ & $\textit{bsimp}(a \backslash c)$
	\end{tabular}
\end{center}
%Following previous notations
%when extending from derivatives w.r.t.~character to derivative
%w.r.t.~string, we define the derivative that nests simplifications
%with derivatives:%\comment{simp in  the [] case?}
We extend this from characters to strings:
\begin{center}
\begin{tabular}{lcl}
$a \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(a \backslash_{bsimp}\, c) \backslash_{bsimps}\, s$ \\
$a \backslash_{bsimps} [\,] $ & $\dn$ & $a$
\end{tabular}
\end{center}
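
For a two-character string $[c_1, c_2]$ (the names $c_1$ and $c_2$ are placeholders for this example), the two clauses unfold as follows:
\begin{center}
\begin{tabular}{lcl}
$a \backslash_{bsimps} (c_1\!::\!c_2\!::\![\,])$
  & $=$ & $(a \backslash_{bsimp}\, c_1) \backslash_{bsimps}\, (c_2\!::\![\,])$ \\
  & $=$ & $((a \backslash_{bsimp}\, c_1) \backslash_{bsimp}\, c_2) \backslash_{bsimps}\, [\,]$ \\
  & $=$ & $(a \backslash_{bsimp}\, c_1) \backslash_{bsimp}\, c_2$ \\
  & $=$ & $\textit{bsimp}(\textit{bsimp}(a \backslash c_1) \backslash c_2)$
\end{tabular}
\end{center}
\noindent
Simplification is therefore interleaved with the derivative steps: each character's derivative is taken of an already simplified regular expression.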

\noindent
The lexer that extracts bitcodes from the
derivatives with simplifications from our $\simp$ function
is called $\blexersimp$:
\begin{center}
\begin{tabular}{lcl}
  $\textit{blexer\_simp}\;r\,s$ & $\dn$ &
      $\textit{let}\;a = (r^\uparrow)\backslash_{bsimps}\, s\;\textit{in}$\\
  & & $\;\;\textit{if}\; \textit{bnullable}(a)$\\
  & & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\
  & & $\;\;\textit{else}\;\textit{None}$
\end{tabular}
\end{center}
\noindent
This algorithm keeps the regular expression size small,
as we shall demonstrate with some examples in the next subsection.

\subsection{Examples $(a+aa)^*$ and $(a^*\cdot a^*)^*$
After Simplification}
Recall the
previous $(a^*a^*)^*$ example,
where $\textit{simp}\_{SL}$ could not
prevent the fast growth of the derivatives (over
3 million nodes for inputs of length just below $20$).
With $\textit{bsimp}$, the size is reduced to just 15 nodes and stays constant no matter how long the
input string is.
This is shown in the graphs below.
585
4969ef817d92 chap4 more
Chengsong
parents: 584
diff changeset
   505
\begin{figure}[H]
543
b2bea5968b89 thesis_thys
Chengsong
parents: 539
diff changeset
   506
\begin{center}
b2bea5968b89 thesis_thys
Chengsong
parents: 539
diff changeset
   507
\begin{tabular}{ll}
b2bea5968b89 thesis_thys
Chengsong
parents: 539
diff changeset
   508
\begin{tikzpicture}
b2bea5968b89 thesis_thys
Chengsong
parents: 539
diff changeset
   509
\begin{axis}[
b2bea5968b89 thesis_thys
Chengsong
parents: 539
diff changeset
   510
    xlabel={$n$},
b2bea5968b89 thesis_thys
Chengsong
parents: 539
diff changeset
   511
    ylabel={derivative size},
b2bea5968b89 thesis_thys
Chengsong
parents: 539
diff changeset
   512
        width=7cm,
b2bea5968b89 thesis_thys
Chengsong
parents: 539
diff changeset
   513
    height=4cm, 
b2bea5968b89 thesis_thys
Chengsong
parents: 539
diff changeset
   514
    legend entries={Lexer with $\textit{bsimp}$},  
539
Chengsong
parents: 538
diff changeset
   515
    legend pos=  south east,
Chengsong
parents: 538
diff changeset
   516
    legend cell align=left]
Chengsong
parents: 538
diff changeset
   517
\addplot[red,mark=*, mark options={fill=white}] table {BitcodedLexer.data};
Chengsong
parents: 538
diff changeset
   518
\end{axis}
Chengsong
parents: 538
diff changeset
   519
\end{tikzpicture} %\label{fig:BitcodedLexer}
Chengsong
parents: 538
diff changeset
   520
&
Chengsong
parents: 538
diff changeset
   521
\begin{tikzpicture}
Chengsong
parents: 538
diff changeset
   522
\begin{axis}[
Chengsong
parents: 538
diff changeset
   523
    xlabel={$n$},
Chengsong
parents: 538
diff changeset
   524
    ylabel={derivative size},
Chengsong
parents: 538
diff changeset
   525
    width = 7cm,
Chengsong
parents: 538
diff changeset
   526
    height = 4cm,
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 601
diff changeset
   527
    legend entries={Lexer with $\textit{simp}\_{SL}$},  
539
Chengsong
parents: 538
diff changeset
   528
    legend pos=  north west,
Chengsong
parents: 538
diff changeset
   529
    legend cell align=left]
\addplot[red,mark=*, mark options={fill=white}] table {BetterWaterloo.data};
\end{axis}
\end{tikzpicture} 
\end{tabular}
\end{center}
\caption{Our improvement over Sulzmann and Lu's lexer in terms of the size of the derivatives.}
\end{figure}
\noindent
Given the size difference, it is not
surprising that our $\blexersimp$ significantly outperforms
$\textit{blexer\_SLSimp}$ by Sulzmann and Lu.
Indeed $\blexersimp$ seems to be a correct algorithm that effectively
bounds the size of intermediate representations.
\marginpar{\em more connecting material to make narration more coherent.}
As promised, we will use formal proofs to show that our speculation
based on these experimental results indeed holds.
%intuitions indeed hold.
In the next section we are going to establish that our
simplification preserves the correctness of the algorithm.
\section{Correctness of $\blexersimp$}
A natural thought would be to directly re-use the formal
proof of $\blexer$'s correctness, with some minor modifications
but keeping the way the induction is done.
However, we were not able to find a simple way to re-factor
the proof of theorem \ref{blexerCorrect} in chapter \ref{Bitcoded1}.
\subsection{Why $\textit{Blexer}$'s Proof Does Not Work}
The fundamental reason is %we cannot extend the correctness proof of theorem 4
that lemma \ref{retrieveStepwise} no longer holds
when simplifications are involved.
\marginpar{\em rephrased things \\so why new \\proof makes sense.}
%The proof details are necessary materials for this thesis
%because it provides necessary context to explain why we need a
%new framework for the proof of $\blexersimp$, which involves
%simplifications that cause structural changes to the regular expression.
%A new formal proof of the correctness of $\blexersimp$, where the 
%proof of $\blexer$
%is not applicatble in the sense that we cannot straightforwardly extend the
%proof of theorem \ref{blexerCorrect} because lemma \ref{retrieveStepwise} does
%not hold anymore.
%This is because the structural induction on the stepwise correctness
%of $\inj$ breaks due to each pair of $r_i$ and $v_i$ described
%in chapter \ref{Inj} and \ref{Bitcoded1} no longer correspond to
%each other.
%In this chapter we introduce simplifications
%for annotated regular expressions that can be applied to 
%each intermediate derivative result. This allows
%us to make $\blexer$ much more efficient.
%Sulzmann and Lu already introduced some simplifications for bitcoded regular expressions,
%but their simplification functions could have been more efficient and in some cases needed fixing.

In particular, the correctness theorem
of the un-optimised bit-coded lexer $\blexer$ in
chapter \ref{Bitcoded1}, formalised by Ausaf et al.,
relies crucially on lemma \ref{retrieveStepwise}, which says that
any value can be retrieved in a stepwise manner, namely:
\begin{equation}\label{eq:stepwise}%eqref: this proposition needs to be referred	
	\vdash v : ((a_\downarrow) \backslash c) \implies \retrieve \; (a \backslash c)  \;  v= \retrieve \; a \; (\inj \; (a_\downarrow) \; c\; v)
\end{equation}
%This no longer holds once we introduce simplifications.
The annotated regular expressions $a$ and $a\backslash c$ correspond to the intermediate
results before and after the derivative with respect to $c$, and similarly
$\inj\; a_\downarrow \; c \; v$ and $v$ correspond to the values before and after the derivative.
They go in lockstep pairs
\[
	(a, \; \inj\; a_\downarrow \; c \; v)\; \text{and} \; (a\backslash c,\; v)
\]
and the structures of the annotated regular expression and
the value within a pair always align with each other.
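\noindent
As a sanity check of \ref{eq:stepwise}, consider its smallest non-trivial instance. This is a sketch of our own: it assumes the usual defining clauses $\retrieve \; ({}_{bs}\ONE) \; \Empty = bs$ and $\retrieve \; ({}_{bs}c) \; (\textit{Char} \; c) = bs$, and it writes $\textit{Char} \; c$ for the character value. Take $a = {}_{bs}c$, so that $a \backslash c = {}_{bs}\ONE$, $(a_\downarrow) \backslash c = \ONE$ and $v = \Empty$:
\begin{center}
\begin{tabular}{lcl}
$\retrieve \; (a \backslash c) \; v$ & $=$ & $\retrieve \; ({}_{bs}\ONE) \; \Empty \;=\; bs$\\
$\retrieve \; a \; (\inj \; (a_\downarrow) \; c \; v)$ & $=$ & $\retrieve \; ({}_{bs}c) \; (\textit{Char} \; c) \;=\; bs$
\end{tabular}
\end{center}
\noindent
Both sides yield the same bitcode $bs$, and in each pair the shape of the value matches the shape of the annotated regular expression.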

As $\blexersimp$ integrates
$\textit{bsimp}$ by applying it after each call to the derivative function,
%Simplifications are necessary to control the size of derivatives,
%but they also destroy the structures of the regular expressions
%such that \ref{eq:stepwise} does not hold.
\begin{center}
\begin{tabular}{lcl}
	$a \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(\textit{bsimp} \; (a \backslash\, c)) \backslash_{bsimps}\, s$ \\
%$r \backslash_{bsimps} [\,] $ & $\dn$ & $r$
\end{tabular}
%\begin{tabular}{lcl}
%  $\textit{blexer\_simp}\;r\,s$ & $\dn$ &
%      $\textit{let}\;a = (r^\uparrow)\backslash_{bsimp}\, s\;\textit{in}$\\                
%  & & $\;\;\textit{if}\; \textit{bnullable}(a)$\\
%  & & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\
%  & & $\;\;\textit{else}\;\textit{None}$
%\end{tabular}
\end{center}
\noindent
it becomes difficult to maintain a property similar to lemma \ref{retrieveStepwise}.
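\noindent
To see where $\textit{bsimp}$ enters, the definition of $\backslash_{bsimps}$ can be unfolded for a two-character string. This is only a sketch: it assumes the base case $a \backslash_{bsimps} [\,] \dn a$, which is not shown in the table above.
\begin{center}
\begin{tabular}{lcl}
$a \backslash_{bsimps} (c_1\!::\!c_2\!::\![\,])$ & $=$ & $(\textit{bsimp} \; (a \backslash\, c_1)) \backslash_{bsimps}\, (c_2\!::\![\,])$\\
 & $=$ & $(\textit{bsimp} \; ((\textit{bsimp} \; (a \backslash\, c_1)) \backslash\, c_2)) \backslash_{bsimps}\, [\,]$\\
 & $=$ & $\textit{bsimp} \; ((\textit{bsimp} \; (a \backslash\, c_1)) \backslash\, c_2)$
\end{tabular}
\end{center}
\noindent
Every derivative step after the first is thus taken on an already simplified expression, not on $a \backslash c_1$ itself.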
Previously, without $\textit{bsimp}$, the exact structure of each intermediate
regular expression was preserved.
%allowing pairs of inhabitation relations 
%in the form $\vdash v : r \backslash c $ and
%$\vdash \inj \; r\; c \; v : r $ to hold in \ref{eq:stepwise}.
We can illustrate this using the diagram in figure \ref{fig:inj} in chapter \ref{Inj},
by zooming in on the middle part involving $r_i$, $r_{i+1}$, $v_i$ and $v_{i+1}$,
and adding a bottom row to show how the bitcodes encoding the lexing information
can be extracted from every pair $(r_i, \; v_i)$:
\begin{center}\label{graph:injZoom}
	\begin{tikzpicture}[->, >=stealth', shorten >= 1pt, auto, thick]
		%\node [rectangle ] (1)  at (-7, 2) {$\ldots$};
		%\node [rectangle, draw] (2) at  (-4, 2) {$r_i = _{bs'}(_Za+_Saa)^*$};
		%\node [rectangle, draw] (3) at  (4, 2) {$r_{i+1} = _{bs'}(_Z(_Z\ONE + _S(\ONE \cdot a)))\cdot(_Za+_Saa)^*$};
		%\node [rectangle] (4) at  (9, 2) {$\ldots$};
		%\node [rectangle] (5) at  (-7, -2) {$\ldots$};
		%\node [rectangle, draw] (6) at  (-4, -2) {$v_i = \Stars \; [\Left (a)]$};
		%\node [rectangle, draw] (7) at  ( 4, -2) {$v_{i+1} = \Seq (\Alt (\Left \; \Empty)) \; \Stars \, []$};
		%\node [rectangle] (8) at  ( 9, -2) {$\ldots$};
		%\node [rectangle] (9) at  (-7, -6) {$\ldots$};
		%\node [rectangle, draw] (10) at (-4, -6) {$\textit{bits}_{i} = bs' @ ZZS$};
		%\node [rectangle, draw] (11) at (4, -6) {$\textit{bits}_{i+1} = bs'@ ZZS$};
		%\node [rectangle] (12) at  (9, -6) {$\ldots$};
		\node [rectangle ] (1)  at (-8, 2) {$\ldots$};
		\node [rectangle, draw] (2) at  (-5, 2) {$r_i = a_\downarrow$};
		\node [rectangle, draw] (3) at  (3, 2) {$r_{i+1} = (a\backslash c)_\downarrow$};
		\node [rectangle] (4) at  (8, 2) {$\ldots$};
		\node [rectangle] (5) at  (-8, -2) {$\ldots$};
		\node [rectangle, draw] (6) at  (-5, -2) {$v_i = \inj\; a_\downarrow \; c \; v$};
		\node [rectangle, draw] (7) at  ( 3, -2) {$v_{i+1} = v$};
		\node [rectangle] (8) at  ( 8, -2) {$\ldots$};
		\node [rectangle] (9) at  (-8, -6) {$\ldots$};
		\node [rectangle, draw] (10) at (-5, -6) {$\textit{bits}_{i} $};
		\node [rectangle, draw] (11) at (3, -6) {$\textit{bits}_{i+1}$};
		\node [rectangle] (12) at  (8, -6) {$\ldots$};
		\path (1) edge [] node {} (2);
		\path (6) edge [] node {} (5);
		\path (9) edge [] node {} (10);
		\path (11) edge [<-] node {} (12);
		\path (8) edge [] node {} (7);
		\path (3) edge [] node {} (4);
		\path (6) edge [dashed,bend right = 30] node {$\retrieve \; a_i \; v_i$} (10);
		\path (2) edge [dashed,bend left = 48] node {} (10);
		\path (7) edge [dashed,bend right = 30] node {$\retrieve \; a_{i+1} \; v_{i+1}$} (11);
		\path (3) edge [dashed,bend left = 45] node {} (11);
		\path (2) edge [] node {$\backslash c$} (3);
		\path (2) edge [dashed, <->] node {$\vdash v_i : r_i$} (6);
		\path (3) edge [dashed, <->] node {$\vdash v_{i+1} : r_{i+1}$} (7);
		%\path (6) edge [] node {$\vdash v_i : r_i$} (10);
		%\path (7) edge [dashed, <->] node {$\vdash v_i : r_i$} (11);
		\path (10) edge [dashed, <->] node {$=$} (11);
		\path (7) edge [] node {$\inj \; r_i \; c \; v_{i+1}$} (6);
%		\node [rectangle, draw] (r) at (-6, -1) {$(aa)^*(b+c)$};
%		\node [rectangle, draw] (a) at (-6, 4)	  {$(aa)^*(_{Z}b + _{S}c)$};
%		\path	(r)
%			edge [] node {$\internalise$} (a);
%		\node [rectangle, draw] (a1) at (-3, 1) {$(_{Z}(\ONE \cdot a) \cdot (aa)^*) (_{Z}b + _Sc)$};
%		\path	(a)
%			edge [] node {$\backslash a$} (a1);
%
%		\node [rectangle, draw, three sided] (a21) at (-2.5, 4) {$(_{Z}\ONE \cdot (aa)^*)$};
%		\node [rectangle, draw, three sided1] (a22) at (-0.8, 4) {$(_{Z}b + _{S}c)$};
%		\path	(a1)
%			edge [] node {$\backslash a$} (a21);
%		\node [rectangle, draw] (a3) at (0.5, 2) {$_{ZS}(_{Z}\ONE + \ZERO)$};
%		\path	(a22)
%			edge [] node {$\backslash b$} (a3);
%		\path	(a21)
%			edge [dashed, bend right] node {} (a3);
%		\node [rectangle, draw] (bs) at (2, 4) {$ZSZ$};
%		\path	(a3)
%			edge [below] node {$\bmkeps$} (bs);
%		\node [rectangle, draw] (v) at (3, 0) {$\Seq \; (\Stars\; [\Seq \; a \; a]) \; (\Left \; b)$};
%		\path 	(bs)
%			edge [] node {$\decode$} (v);
	\end{tikzpicture}
	%\caption{$\blexer$ with the regular expression $(aa)^*(b+c)$ and $aab$}
\end{center}

But $\blexersimp$ applies simplification after each derivative step,
making it difficult to align the structures of values and regular expressions.
If we change the form of property \ref{eq:stepwise} to
adapt to the needs of $\blexersimp$, its precondition becomes
\[
	\vdash v : (\textit{bsimp} \; (r\backslash c))
\]
The inhabitation relation of the other pair,
\[
	\vdash \inj \; r \; c \; v : r
\]
no longer holds,
because $\inj$ does not work on the simplified value $v$
and the unsimplified regular expression $r$.
The $\retrieve$ function will not work either.
753a3b0ee02b reordered sections to make chapter 4 more coherent
Chengsong
parents: 655
diff changeset
   730
It seems unclear what procedures needs to be
753a3b0ee02b reordered sections to make chapter 4 more coherent
Chengsong
parents: 655
diff changeset
   731
used to create a new value $v_?$ such that
753a3b0ee02b reordered sections to make chapter 4 more coherent
Chengsong
parents: 655
diff changeset
   732
\[
753a3b0ee02b reordered sections to make chapter 4 more coherent
Chengsong
parents: 655
diff changeset
   733
	\vdash v_? : r \; \text{and} \; \retrieve \; r \; v_?   = \retrieve \; (\textit{bsimp} \; (r\backslash c)) \; v
753a3b0ee02b reordered sections to make chapter 4 more coherent
Chengsong
parents: 655
diff changeset
   734
\]
753a3b0ee02b reordered sections to make chapter 4 more coherent
Chengsong
parents: 655
diff changeset
   735
hold.
%It is clear that once we made 
%$v$ to align with $\textit{bsimp} \; r_{c}$
%in the inhabitation relation, something different than $v_{r}^{c}$ needs to be plugged
%in for the above statement to hold.
Ausaf et al. \cite{AusafDyckhoffUrban2016}
introduced what they call rectification functions to restore the original value from the simplified value.
The idea is that the simplification function
\[
	\textit{simp}^{rect} : Regex \Rightarrow (Value \Rightarrow Value, Regex)
%\textit{frect} : Value \Rightarrow Value
\]
returns not only a regular expression but also a rectification function,
which is recorded recursively
and then applied to the previous value 
to obtain the correct value for $\inj$ to work on. 
The recursive case of the lexer is defined roughly as
\[
	\textit{slexer} \; r \; (c\!::\!s) \dn \textit{let} \;(\textit{frect}, r_c) = \textit{simp}^{rect} \;(r \backslash c) \;\; \textit{in}\;\;
	\inj \; r \; c \; (\textit{frect} \; (\textit{slexer} \; r_c\; s))
	%\textit{match} \; s \; \textit{case} [
\]
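\noindent
To see how the rectification functions interleave with the injections, this clause can be
unfolded on a two-character string. This is only a sketch: the names $r_1$, $r_2$, $f_1$ and $f_2$
are introduced here purely for illustration, with
$(f_1, r_1) = \textit{simp}^{rect} \; (r \backslash c_1)$ and
$(f_2, r_2) = \textit{simp}^{rect} \; (r_1 \backslash c_2)$:
\[
	\textit{slexer} \; r \; (c_1\!::\!c_2\!::\![]) =
	\inj \; r \; c_1 \; (f_1 \; (\inj \; r_1 \; c_2 \; (f_2 \; (\textit{slexer} \; r_2 \; []))))
\]
Each $f_i$ turns a value for the simplified $r_i$ back into a value for the unsimplified derivative,
which is what $\inj$ expects.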
However, this approach (including the correctness proof of $\textit{slexer}$) only 
works without bitcodes, and it limits the kind of simplifications one can introduce.
%and they have not yet extended their relatively simple simplifications
%to more aggressive ones.
See the thesis by Ausaf \cite{Ausaf}
for details.

%\begin{center}
%	$\vdash v:  (r\backslash c) \implies \retrieve \; (\mathord{?}(\textit{bsimp} \; r_c)) \; v =\retrieve \; r  \;(\mathord{?} v_{r}^{c}) $
%\end{center}
%\noindent

We were not able to use their idea for our very strong simplification rules.
Therefore we take another route that completely
dispenses with lemma \ref{retrieveStepwise},
and prove a weakened inductive invariant instead.

Let us first explain why the requirement of lemma \ref{retrieveStepwise}
is too strong, and suggest a few possible fixes. These lead to
our proof, which we believe is the most natural and effective method.



\subsection{Why Lemma \ref{retrieveStepwise}'s Requirement is too Strong}

%From this chapter we start with the main contribution of this thesis, which

The $\blexer$ proof relies on $r_i$ and $v_i$ matching each other in lockstep
at each derivative step $i$. However, only $v_0$ is needed; the intermediate
$v_i$s are purely proof scaffolding.
Moreover, property \ref{eq:stepwise}
is stronger than needed for POSIX lexing: the precondition
$\vdash v_{i+1}:r_{i+1}$ is too general in the sense that it allows arbitrary 
values inhabiting $r_{i+1}$ to retrieve bitcodes.
%correspondence between the lexical value and the
%regular expression in derivative and injection operations at the same step $i$.
%If we revisit the diagram \ref{graph:injZoom} with an example
Consider a concrete example where $a_i = (_{ZZ}x + _{ZS}y + _{S}x)$ and
$a_{i+1} = (_{ZZ}\ONE + \ZERO + _{S}\ONE)$.
What is required in the proof of $\blexer$ is that for the POSIX value
$v_{i+1} = \Left  \; (\Left \; \Empty)$,
the property
\[
	%\vdash \Left  \; (\Left \; Empty) : (\ONE+\ZERO+\ONE) \implies 
	\retrieve \; (_{ZZ}\ONE + \ZERO + _{S}\ONE) \; (\Left  \; (\Left \; \Empty) ) =
	\retrieve \; (_{ZZ}x + _{ZS}y + _{S}x ) \; (\Left  \; (\Left \; (\Char\; x)) )
\]
holds, and for $\blexersimp$ a property of similar shape to
\[
	\retrieve \; _{ZZ}\ONE \; \; \Empty =
	\retrieve \; _{ZZ}x  \; (\Char\; x)
\]
needs to hold as well.
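\noindent
As a sanity check, both equalities can be evaluated by hand. The following sketch assumes
the definition of $\retrieve$ given earlier and reads the annotated alternatives
as the internalised form of $(x+y)+x$, so that the bits $Z$ and $S$ record a left and a right choice:
\[
\begin{array}{lcl}
	\retrieve \; (_{ZZ}\ONE + \ZERO + _{S}\ONE) \; (\Left \; (\Left \; \Empty)) & = & ZZ\\
	\retrieve \; (_{ZZ}x + _{ZS}y + _{S}x) \; (\Left \; (\Left \; (\Char\; x))) & = & ZZ\\
	\retrieve \; _{ZZ}\ONE \; \Empty & = & ZZ\\
	\retrieve \; _{ZZ}x \; (\Char\; x) & = & ZZ
\end{array}
\]
In each case the value selects the leftmost alternative, so only its annotation $ZZ$ is collected.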
However, for the definitely non-POSIX value $v_{i+1}' = \Right \; \Empty$
the precondition $\vdash \Right \; \Empty : \ONE+\ZERO+\ONE$ holds as well, and therefore
the equality
\[
	\retrieve \; (_{ZZ}\ONE + \ZERO + _{S}\ONE) \;  (\Right \; \Empty) =
	\retrieve \; (_{ZZ}x + _{ZS}y + _{S}x ) \;  (\Right \; (\Char\; x))
\]
holds for $\blexer$ by lemma \ref{retrieveStepwise}.
It can no longer hold, or be proven, for $\blexersimp$, as the corresponding
sub-regular expressions and values have been eliminated during the 
de-duplication procedure of our simplification.
We are stuck with a property that holds in $\blexer$ but does not have a counterpart
in $\blexersimp$.
This property need not hold for the purpose of POSIX lexing, though: we know by the left-priority rule
that a match with the rightmost subexpression $x$ is not POSIX.
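\noindent
The example can be made concrete as follows. This is a sketch that assumes
that simplification removes the $\ZERO$ alternative and, during de-duplication, keeps only the first of several
alternatives that are equal up to their bitcodes:
\[
	\textit{bsimp} \; (_{ZZ}\ONE + \ZERO + _{S}\ONE) = {}_{ZZ}\ONE
\]
Before simplification both sides of the equality for $\Right \; \Empty$ evaluate to the bits $S$.
After simplification the only value inhabiting $_{ZZ}\ONE$ is $\Empty$, from which $\retrieve$ produces $ZZ$,
so no value remains from which the bits $S$ could be retrieved.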
The inductive invariant \ref{eq:stepwise} can be weakened by restricting the precondition
$\vdash v_i:r_i$ to $\exists s_i. \; (s_i, r_i) \rightarrow v_i$. 
We tried this route, but it did not work well, since it requires
a technique similar to the rectification functions of Ausaf et al.,
which become very complicated with our simplifications.
After some trial and error we found a property of the form
\begin{property}
	If a POSIX value can be extracted from $a \backslash s$, then
	it can be extracted from $a \backslash_{bsimps} s$ as well.
\end{property}
\noindent
most natural to work with, and we defined a binary relation to capture
the connection between $a\backslash s$ and $a \backslash_{bsimps} s$.
%and look specifically at
%the pairs $v_i, r_i$ and $v_{i+1},\, r_{i+1}$, we get the diagram demonstrating
%the invariant that the same bitcodes can be extracted from the pairs:
%\tikzset{three sided/.style={
%        draw=none,
%        append after command={
%            [-,shorten <= -0.5\pgflinewidth]
%            ([shift={(-1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east)
%        edge([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west) 
%            ([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west)
%        edge([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)            
%            ([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)
%        edge([shift={(-1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)
%        }
%    }
%}
%
%\tikzset{three sided1/.style={
%        draw=none,
%        append after command={
%            [-,shorten <= -0.5\pgflinewidth]
%            ([shift={(1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west)
%        edge([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east) 
%            ([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east)
%        edge([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)            
%            ([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)
%        edge([shift={(1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)
%        }
%    }
%}
%
%\begin{center}
%	\begin{tikzpicture}[->, >=stealth', shorten >= 1pt, auto, thick]
%		%\node [rectangle ] (1)  at (-7, 2) {$\ldots$};
%		%\node [rectangle, draw] (2) at  (-4, 2) {$r_i = _{bs'}(_Za+_Saa)^*$};
%		%\node [rectangle, draw] (3) at  (4, 2) {$r_{i+1} = _{bs'}(_Z(_Z\ONE + _S(\ONE \cdot a)))\cdot(_Za+_Saa)^*$};
%		%\node [rectangle] (4) at  (9, 2) {$\ldots$};
%		%\node [rectangle] (5) at  (-7, -2) {$\ldots$};
%		%\node [rectangle, draw] (6) at  (-4, -2) {$v_i = \Stars \; [\Left (a)]$};
%		%\node [rectangle, draw] (7) at  ( 4, -2) {$v_{i+1} = \Seq (\Alt (\Left \; \Empty)) \; \Stars \, []$};
%		%\node [rectangle] (8) at  ( 9, -2) {$\ldots$};
%		%\node [rectangle] (9) at  (-7, -6) {$\ldots$};
%		%\node [rectangle, draw] (10) at (-4, -6) {$\textit{bits}_{i} = bs' @ ZZS$};
%		%\node [rectangle, draw] (11) at (4, -6) {$\textit{bits}_{i+1} = bs'@ ZZS$};
%		%\node [rectangle] (12) at  (9, -6) {$\ldots$};
%		
%		\node [rectangle ] (1)  at (-8, 2) {$\ldots$};
%		\node [rectangle, draw] (2) at  (-5, 2) {$r_i = _{bs'}(_Za+_Saa)^*$};
%		\node [rectangle, draw] (3) at  (3, 2) {$r_{i+1} = _{bs'}(_Z(_Z\ONE + _S(\ONE \cdot a)))\cdot(_Za+_Saa)^*$};
%		\node [rectangle] (4) at  (8, 2) {$\ldots$};
%		\node [rectangle] (5) at  (-8, -2) {$\ldots$};
%		\node [rectangle, draw] (6) at  (-5, -2) {$v_i = \Stars \; [\Left (a)]$};
%		\node [rectangle, draw] (7) at  ( 3, -2) {$v_{i+1} = \Seq (\Alt (\Left \; \Empty)) \; \Stars \, []$};
%		\node [rectangle] (8) at  ( 8, -2) {$\ldots$};
%		\node [rectangle] (9) at  (-8, -6) {$\ldots$};
%		\node [rectangle, draw] (10) at (-5, -6) {$\textit{bits}_{i} = bs' @ ZZS$};
%		\node [rectangle, draw] (11) at (3, -6) {$\textit{bits}_{i+1} = bs'@ ZZS$};
%		\node [rectangle] (12) at  (8, -6) {$\ldots$};
%
%
%
%		\path (1) edge [] node {} (2);
%		\path (5) edge [] node {} (6);
%		\path (9) edge [] node {} (10);
%
%		\path (11) edge [] node {} (12);
%		\path (7) edge [] node {} (8);
%		\path (3) edge [] node {} (4);
%
%
%		\path (6) edge [dashed,bend right = 30] node {$\retrieve \; r_i \; v_i$} (10);
%		\path (2) edge [dashed,bend left = 48] node {} (10);
%
%		\path (7) edge [dashed,bend right = 30] node {$\retrieve \; r_{i+1} \; v_{i+1}$} (11);
%		\path (3) edge [dashed,bend left = 45] node {} (11);
%	
%		\path (2) edge [] node {$\backslash a$} (3);
%		\path (2) edge [dashed, <->] node {$\vdash v_i : r_i$} (6);
%		\path (3) edge [dashed, <->] node {$\vdash v_{i+1} : r_{i+1}$} (7);
%		%\path (6) edge [] node {$\vdash v_i : r_i$} (10);
%		%\path (7) edge [dashed, <->] node {$\vdash v_i : r_i$} (11);
%
%		\path (10) edge [dashed, <->] node {$=$} (11);
%
%		\path (7) edge [] node {$\inj \; r_{i+1} \; a \; v_i$} (6);
%
%%		\node [rectangle, draw] (r) at (-6, -1) {$(aa)^*(b+c)$};
%%		\node [rectangle, draw] (a) at (-6, 4)	  {$(aa)^*(_{Z}b + _{S}c)$};
%%		\path	(r)
%%			edge [] node {$\internalise$} (a);
%%		\node [rectangle, draw] (a1) at (-3, 1) {$(_{Z}(\ONE \cdot a) \cdot (aa)^*) (_{Z}b + _Sc)$};
%%		\path	(a)
%%			edge [] node {$\backslash a$} (a1);
%%
%%		\node [rectangle, draw, three sided] (a21) at (-2.5, 4) {$(_{Z}\ONE \cdot (aa)^*)$};
%%		\node [rectangle, draw, three sided1] (a22) at (-0.8, 4) {$(_{Z}b + _{S}c)$};
%%		\path	(a1)
%%			edge [] node {$\backslash a$} (a21);
273c176d9027 finished 4.3.2 section explaining why lemma 11 is too strong
Chengsong
parents: 657
diff changeset
   936
%%		\node [rectangle, draw] (a3) at (0.5, 2) {$_{ZS}(_{Z}\ONE + \ZERO)$};
273c176d9027 finished 4.3.2 section explaining why lemma 11 is too strong
Chengsong
parents: 657
diff changeset
   937
%%		\path	(a22)
273c176d9027 finished 4.3.2 section explaining why lemma 11 is too strong
Chengsong
parents: 657
diff changeset
   938
%%			edge [] node {$\backslash b$} (a3);
273c176d9027 finished 4.3.2 section explaining why lemma 11 is too strong
Chengsong
parents: 657
diff changeset
   939
%%		\path	(a21)
273c176d9027 finished 4.3.2 section explaining why lemma 11 is too strong
Chengsong
parents: 657
diff changeset
   940
%%			edge [dashed, bend right] node {} (a3);
273c176d9027 finished 4.3.2 section explaining why lemma 11 is too strong
Chengsong
parents: 657
diff changeset
   941
%%		\node [rectangle, draw] (bs) at (2, 4) {$ZSZ$};
273c176d9027 finished 4.3.2 section explaining why lemma 11 is too strong
Chengsong
parents: 657
diff changeset
   942
%%		\path	(a3)
273c176d9027 finished 4.3.2 section explaining why lemma 11 is too strong
Chengsong
parents: 657
diff changeset
   943
%%			edge [below] node {$\bmkeps$} (bs);
273c176d9027 finished 4.3.2 section explaining why lemma 11 is too strong
Chengsong
parents: 657
diff changeset
   944
%%		\node [rectangle, draw] (v) at (3, 0) {$\Seq \; (\Stars\; [\Seq \; a \; a]) \; (\Left \; b)$};
273c176d9027 finished 4.3.2 section explaining why lemma 11 is too strong
Chengsong
parents: 657
diff changeset
   945
%%		\path 	(bs)
273c176d9027 finished 4.3.2 section explaining why lemma 11 is too strong
Chengsong
parents: 657
diff changeset
   946
%%			edge [] node {$\decode$} (v);
273c176d9027 finished 4.3.2 section explaining why lemma 11 is too strong
Chengsong
parents: 657
diff changeset
   947
%
273c176d9027 finished 4.3.2 section explaining why lemma 11 is too strong
Chengsong
parents: 657
diff changeset
   948
%
273c176d9027 finished 4.3.2 section explaining why lemma 11 is too strong
Chengsong
parents: 657
diff changeset
   949
%	\end{tikzpicture}
273c176d9027 finished 4.3.2 section explaining why lemma 11 is too strong
Chengsong
parents: 657
diff changeset
   950
%	%\caption{$\blexer$ with the regular expression $(aa)^*(b+c)$ and $aab$}
273c176d9027 finished 4.3.2 section explaining why lemma 11 is too strong
Chengsong
parents: 657
diff changeset
   951
%\end{center}

%When simplifications are added, the inhabitation relation no longer holds,
%causing the above diagram to break.
%
%Ausaf addressed this with an augmented lexer he called $\textit{slexer}$.
%
%
%
%we note that the invariant
%$\vdash v_{i+1}: r_{i+1} \implies \retrieve \; r_{i+1} \; v_{i+1} $ is too strong
%to maintain because the precondition $\vdash v_i : r_i$ is too weak.
%It does not require $v_i$ to be a POSIX value 
%
%
%which is essential for getting an understanding this thesis
%in chapter \ref{Bitcoded1}, which is necessary for understanding why
%the proof 
%
%In this chapter,

%We contrast our simplification function 
%with Sulzmann and Lu's, indicating the simplicity of our algorithm.
%This is another case for the usefulness 
%and reliability of formal proofs on algorithms.
%These ``aggressive'' simplifications would not be possible in the injection-based 
%lexing we introduced in chapter \ref{Inj}.
%We then prove the correctness with the improved version of 
%$\blexer$, called $\blexersimp$, by establishing 
%$\blexer \; r \; s= \blexersimp \; r \; s$ using a term rewriting system.
%
%----------------------------------------------------------------------------------------
%	SECTION rewrite relation
%----------------------------------------------------------------------------------------
In the next section we first introduce the rewriting relation \emph{rrewrite}
($\rrewrite$) between two regular expressions,
which stands for an atomic
simplification.
We then prove properties about
this rewriting relation and its reflexive transitive closure.
Finally we leverage these properties to show
an equivalence between the results generated by
$\blexer$ and $\blexersimp$.

\subsection{The Rewriting Relation $\rrewrite$($\rightsquigarrow$)}
In the correctness proof of $\blexer$, we
did not directly derive the fact that $\blexer$ generates the POSIX value,
but first proved that $\blexer$ generates the same result as $\lexer$.
We then re-used
the correctness of $\lexer$
to obtain
\begin{center}
	$(r, s) \rightarrow v \;\; \textit{iff} \;\; \blexer \; r \;s = \Some \; v$\\
	$\nexists v. \; (r, s) \rightarrow v \;\; \textit{iff} \;\; \blexer\;
	r\;s = \None$.
\end{center}
%\begin{center}
%	$(r, s) \rightarrow v \;\; \textit{iff} \;\; \blexer \; r \;s = v$.
%\end{center}
Here we apply this
modularised technique again
by first proving that
$\blexersimp \; r \; s $ 
produces the same output as $\blexer \; r\; s$,
and then piecing it together with 
$\blexer$'s correctness to achieve our main
theorem:
\begin{center}
	$(r, s) \rightarrow v \; \;   \textit{iff} \;\;  \blexersimp \; r \; s = \Some \;v$
	\\
	$\nexists v. \; (r, s) \rightarrow v \;\; \textit{iff} \;\; \blexersimp\;
	r\;s = \None$
\end{center}
\noindent
The overall idea for the proof
of $\blexer \;r \;s = \blexersimp \; r \;s$ 
is that the transition from $r$ to $\textit{bsimp}\; r$ can be
broken down into smaller rewrite steps of the form:
\begin{center}
	$r \rightsquigarrow^* \textit{bsimp} \; r$
\end{center}
where each rewrite step, written $\rightsquigarrow$,
is an ``atomic'' simplification that
is similar to a small-step reduction in operational semantics
(see figure \ref{rrewriteRules} for the rules):
\begin{figure}[H]
\begin{mathpar}
	\inferrule * [Right = $S\ZERO_l$]{\vspace{0em}}{_{bs} \ZERO \cdot r_2 \rightsquigarrow \ZERO\\}

	\inferrule * [Right = $S\ZERO_r$]{\vspace{0em}}{_{bs} r_1 \cdot \ZERO \rightsquigarrow \ZERO\\}

	\inferrule * [Right = $S_1$]{\vspace{0em}}{_{bs_1} ((_{bs_2} \ONE) \cdot r) \rightsquigarrow \fuse \; (bs_1 @ bs_2) \; r\\}\\

	
	
	\inferrule * [Right = $SL$] {\\ r_1 \rightsquigarrow r_2}{_{bs} r_1 \cdot r_3 \rightsquigarrow _{bs} r_2 \cdot r_3\\}

	\inferrule * [Right = $SR$] {\\ r_3 \rightsquigarrow r_4}{_{bs} r_1 \cdot r_3 \rightsquigarrow _{bs} r_1 \cdot r_4\\}\\

	\inferrule * [Right = $A0$] {\vspace{0em}}{ _{bs}\sum [] \rightsquigarrow \ZERO}

	\inferrule * [Right = $A1$] {\vspace{0em}}{ _{bs}\sum [a] \rightsquigarrow \fuse \; bs \; a}

	\inferrule * [Right = $AL$] {\\ rs_1 \stackrel{s}{\rightsquigarrow} rs_2}{_{bs}\sum rs_1 \rightsquigarrow _{bs}\sum rs_2}

	\inferrule * [Right = $LE$] {\vspace{0em}}{ [] \stackrel{s}{\rightsquigarrow} []}

	\inferrule * [Right = $LT$] {rs_1 \stackrel{s}{\rightsquigarrow} rs_2}{ r :: rs_1 \stackrel{s}{\rightsquigarrow} r :: rs_2 }

	\inferrule * [Right = $LH$] {r_1 \rightsquigarrow r_2}{ r_1 :: rs \stackrel{s}{\rightsquigarrow} r_2 :: rs}

	\inferrule * [Right = $L\ZERO$] {\vspace{0em}}{\ZERO :: rs \stackrel{s}{\rightsquigarrow} rs}

	\inferrule * [Right = $LS$] {\vspace{0em}}{(_{bs_1} \sum rs_1) :: rs_b \stackrel{s}{\rightsquigarrow} ((\map \; (\fuse \; bs_1) \; rs_1) @ rs_b) }

	\inferrule * [Right = $LD$] {\\ \rerase{a_1} = \rerase{a_2}}{rs_a @ [a_1] @ rs_b @ [a_2] @ rs_c \stackrel{s}{\rightsquigarrow} rs_a @ [a_1] @ rs_b @ rs_c}

\end{mathpar}
\caption{
The rewrite rules that generate simplified regular expressions 
in small steps: $r_1 \rightsquigarrow r_2$ is for bitcoded regular expressions 
and $rs_1 \stackrel{s}{\rightsquigarrow} rs_2$ for 
lists of bitcoded regular expressions. 
Of particular interest is the $LD$ rule, which allows a later copy of a regular expression 
to be removed provided a regular expression 
earlier in the list is the same up to bitcodes, and hence matches the same strings.
}\label{rrewriteRules}
\end{figure}
\noindent
The rules $LT$ and $LH$ lift rewriting to lists of regular expressions:
$LH$ rewrites the head of a list in one step,
while $LT$ keeps the head and rewrites within the tail.
Together they allow a rewrite step to be applied
at any position of the left-hand-side list,
yielding the right-hand-side list.
This helps with defining the ``context rule'' $AL$.

The reflexive transitive closures of $\rightsquigarrow$ and $\stackrel{s}{\rightsquigarrow}$
are defined in the usual way:
\begin{figure}[H]
	\centering
\begin{mathpar}
	\inferrule{\vspace{0em}}{ r \rightsquigarrow^* r \\}

	\inferrule{\vspace{0em}}{rs \stackrel{s*}{\rightsquigarrow} rs \\}

	\inferrule{r_1 \rightsquigarrow^*  r_2 \land \; r_2 \rightsquigarrow r_3}{r_1 \rightsquigarrow^* r_3\\}
	
	\inferrule{rs_1 \stackrel{s*}{\rightsquigarrow}  rs_2 \land \; rs_2 \stackrel{s}{\rightsquigarrow} rs_3}{rs_1 \stackrel{s*}{\rightsquigarrow} rs_3}
\end{mathpar}
\caption{The Reflexive Transitive Closures of 
$\rightsquigarrow$ and $\stackrel{s}{\rightsquigarrow}$}\label{transClosure}
\end{figure}
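\noindent
As a small illustration of how the rules combine (the regular expressions
$r_1$ and $r_2$ are arbitrary and chosen only for this example), assume
$\rerase{r_1} = \rerase{r_2}$ and consider the alternative
$_{bs}\sum [r_1, \ZERO, r_2]$.
On the level of lists, $L\ZERO$ gives
$[\ZERO, r_2] \stackrel{s}{\rightsquigarrow} [r_2]$, which $LT$ lifts to
$[r_1, \ZERO, r_2] \stackrel{s}{\rightsquigarrow} [r_1, r_2]$;
the rule $LD$ (with $rs_a = rs_b = rs_c = []$) then gives
$[r_1, r_2] \stackrel{s}{\rightsquigarrow} [r_1]$.
Applying $AL$ to each list step (keeping the bitcode $bs$ on the alternative)
and finishing with $A1$ yields
\begin{center}
	$_{bs}\sum [r_1, \ZERO, r_2] \;\rightsquigarrow\; _{bs}\sum [r_1, r_2]
	\;\rightsquigarrow\; _{bs}\sum [r_1] \;\rightsquigarrow\; \fuse \; bs \; r_1$
\end{center}
and therefore
$_{bs}\sum [r_1, \ZERO, r_2] \rightsquigarrow^* \fuse \; bs \; r_1$.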
%Two rewritable terms will remain rewritable to each other
%even after a derivative is taken:
The main point of our rewriting relation
is that it is preserved under derivatives,
namely
\begin{center}
	$r_1 \rightsquigarrow r_2 \implies (r_1 \backslash c) \rightsquigarrow^* (r_2 \backslash c)$
\end{center}
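\noindent
The closure $\rightsquigarrow^*$ on the right-hand side is needed because
a single step before the derivative can correspond to several steps after it.
The following is only a sketch for the rule $S_1$, under assumptions that are not restated in this section:
that the derivative of a bitcoded sequence with a nullable first component is
$(_{bs}\, a_1 \cdot a_2) \backslash c = \; _{bs}\sum [\,_{[]} (a_1 \backslash c) \cdot a_2, \; \fuse \; (\bmkeps \; a_1) \; (a_2 \backslash c)]$,
that $(_{bs_2} \ONE) \backslash c = \ZERO$ and $\bmkeps \; (_{bs_2} \ONE) = bs_2$,
and that $\fuse$ satisfies
$\fuse \; bs_1 \; (\fuse \; bs_2 \; r) = \fuse \; (bs_1 @ bs_2) \; r$ and
$(\fuse \; bs \; r) \backslash c = \fuse \; bs \; (r \backslash c)$.
\begin{center}
\begin{tabular}{lcl}
$(_{bs_1} ((_{bs_2} \ONE) \cdot r)) \backslash c$
  & $=$ & $_{bs_1}\sum [\,_{[]} \ZERO \cdot r, \; \fuse \; bs_2 \; (r \backslash c)]$\\
  & $\rightsquigarrow$ & $_{bs_1}\sum [\ZERO, \; \fuse \; bs_2 \; (r \backslash c)]$
  \quad (by $S\ZERO_l$, $LH$ and $AL$)\\
  & $\rightsquigarrow$ & $_{bs_1}\sum [\fuse \; bs_2 \; (r \backslash c)]$
  \quad (by $L\ZERO$ and $AL$)\\
  & $\rightsquigarrow$ & $\fuse \; bs_1 \; (\fuse \; bs_2 \; (r \backslash c))$
  \quad (by $A1$)\\
  & $=$ & $(\fuse \; (bs_1 @ bs_2) \; r) \backslash c$
\end{tabular}
\end{center}
Under these assumptions, the one-step rewrite $S_1$ becomes three steps after taking the derivative.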
Moreover, if one term rewrites to another in many steps,
then they produce the same bitcodes:
\begin{center}
	$r \rightsquigarrow^* r' \implies \bmkeps \; r = \bmkeps \; r'$
\end{center}
The decoding phase of $\blexer$ and $\blexersimp$
is the same, which means that if they receive the same
bitcodes before the decoding phase,
they generate the same value after decoding is done.
We will prove the three properties 
we mentioned above in the next sub-section.
\subsection{Important Properties of $\rightsquigarrow$}
First we prove some basic facts 
about $\rightsquigarrow$, $\stackrel{s}{\rightsquigarrow}$, 
$\rightsquigarrow^*$ and $\stackrel{s*}{\rightsquigarrow}$,
which will be needed later.\\
The inference rules (figure \ref{rrewriteRules}) we 
gave in the previous section 
have ``many-step'' versions:

\begin{lemma}\label{squig1}
	\hspace{0em}
	\begin{itemize}
		\item
			$rs_1 \stackrel{s*}{\rightsquigarrow} rs_2 \implies _{bs} \sum rs_1 \rightsquigarrow^* \; _{bs} \sum rs_2$
		\item
			$r \rightsquigarrow^* r' \implies _{bs} \sum (r :: rs)\; \rightsquigarrow^*\;  _{bs} \sum (r' :: rs)$

		\item
			Rewriting in many steps is compatible 
			with the sequence constructor:\\
			$r_1 \rightsquigarrow^* r_2 
			\implies _{bs} r_1 \cdot r_3 \rightsquigarrow^* \;  
			_{bs} r_2 \cdot r_3 \quad $ 
			and 
			$\quad r_3 \rightsquigarrow^* r_4 
			\implies _{bs} r_1 \cdot r_3 \rightsquigarrow^* \; _{bs} r_1 \cdot r_4$
		\item
			The many-step relations 
			$\rightsquigarrow^*$ and $\stackrel{s*}{\rightsquigarrow}$ 
			are preserved under the function $\fuse$:\\
				$r_1 \rightsquigarrow^* r_2 
				\implies \fuse \; bs \; r_1 \rightsquigarrow^* \; 
				\fuse \; bs \; r_2 \quad  $ and 
				$rs_1 \stackrel{s}{\rightsquigarrow} rs_2 
				\implies \map \; (\fuse \; bs) \; rs_1 
				\stackrel{s*}{\rightsquigarrow} \map \; (\fuse \; bs) \; rs_2$
	\end{itemize}
\end{lemma}
\begin{proof}
	By induction on 
	the inductive cases of $\stackrel{s*}{\rightsquigarrow}$ and $\rightsquigarrow^*$ respectively.
	The third and fourth points rely on 
	the properties $r_1 \rightsquigarrow r_2 \implies \fuse \; bs \; r_1 \rightsquigarrow \fuse \; bs \; r_2$ and
	$rs_2 \stackrel{s}{\rightsquigarrow} rs_3 
	\implies \map \; (\fuse \; bs) \; rs_2 \stackrel{s*}{\rightsquigarrow} \map \; (\fuse \; bs)\; rs_3$,
	which can be proven by induction on the inductive cases of $\rightsquigarrow$ and 
	$\stackrel{s}{\rightsquigarrow}$.
\end{proof}
\noindent
The inference rules of $\stackrel{s}{\rightsquigarrow}$
are defined in terms of the list cons operation.
Here we establish that the 
$\stackrel{s}{\rightsquigarrow}$ and $\stackrel{s*}{\rightsquigarrow}$ 
relations are also preserved w.r.t.\ appending and prepending a list.
In addition, we
also prove some relations 
between $\rightsquigarrow^*$ and $\stackrel{s*}{\rightsquigarrow}$.
\begin{lemma}\label{ssgqTossgs}
	\hspace{0em}
	\begin{itemize}
		\item
			$rs_1 \stackrel{s}{\rightsquigarrow} rs_2 \implies rs @ rs_1 \stackrel{s}{\rightsquigarrow} rs @ rs_2$
		\item
			$rs_1 \stackrel{s*}{\rightsquigarrow} rs_2 \implies 
			rs @ rs_1 \stackrel{s*}{\rightsquigarrow} rs @ rs_2 \; \;
			\textit{and} \; \;
			rs_1 @ rs \stackrel{s*}{\rightsquigarrow} rs_2 @ rs$
		\item
			The $\stackrel{s}{\rightsquigarrow} $ relation after appending 
			a list becomes $\stackrel{s*}{\rightsquigarrow}$:\\
			$rs_1 \stackrel{s}{\rightsquigarrow} rs_2 
			\implies rs_1 @ rs \stackrel{s*}{\rightsquigarrow} rs_2 @ rs$
		\item
			$r_1 \rightsquigarrow^* r_2 \implies [r_1] \stackrel{s*}{\rightsquigarrow} [r_2]$
		\item
			$rs_3 \stackrel{s*}{\rightsquigarrow} rs_4 \land r_1 \rightsquigarrow^* r_2 \implies
			r_1 :: rs_3 \stackrel{s*}{\rightsquigarrow} r_2 :: rs_4$
		\item			
			If we can rewrite a regular expression 
			in many steps to $\ZERO$, then 
			we can also rewrite any sequence with it as the first component to $\ZERO$:\\
			$r_1 \rightsquigarrow^* \ZERO 
			\implies _{bs}r_1\cdot r_2 \rightsquigarrow^* \ZERO$
	\end{itemize}
\end{lemma}
\begin{proof}
	The first part is by induction on the list $rs$.
	The second part is by induction on the inductive cases of $\stackrel{s*}{\rightsquigarrow}$.
	The third part is 
	by rule induction of $\stackrel{s}{\rightsquigarrow}$.
	The fourth part is 
	by rule induction of 
	$\rightsquigarrow^*$, using parts one to three. 
	The fifth part is a corollary of part four.
	The last part is proven by rule induction again on $\rightsquigarrow^*$.
\end{proof}
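\noindent
As a sketch of how these parts fit together (our own illustration, assuming that
$\stackrel{s*}{\rightsquigarrow}$ is transitive, which is not restated here),
the cons property of lemma \ref{ssgqTossgs} can be obtained from the parts about
singleton lists and about appending. Given $rs_3 \stackrel{s*}{\rightsquigarrow} rs_4$
and $r_1 \rightsquigarrow^* r_2$, we have
\begin{center}
	\begin{tabular}{lcll}
		$r_1 :: rs_3$ & $=$ & $[r_1] @ rs_3$ & \\
		& $\stackrel{s*}{\rightsquigarrow}$ & $[r_2] @ rs_3$ & (parts four and two)\\
		& $\stackrel{s*}{\rightsquigarrow}$ & $[r_2] @ rs_4$ & (part two)\\
		& $=$ & $r_2 :: rs_4$ &
	\end{tabular}
\end{center}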
\noindent
Now we are ready to give the proofs of the following properties:
\begin{itemize}
	\item
		$r \rightsquigarrow^* r'\land \bnullable \; r 
		\implies \bmkeps \; r = \bmkeps \; r'$.
	\item
		$r \rightsquigarrow^* \textit{bsimp} \;r$.
	\item
		$r \rightsquigarrow^* r' \implies r \backslash c \rightsquigarrow^* r'\backslash c$.
\end{itemize}
\subsubsection{Property 1: $r \rightsquigarrow^* r'\land \bnullable \; r 
		\implies \bmkeps \; r = \bmkeps \; r'$}
Intuitively, this property says we can 
extract the same bitcodes using $\bmkeps$ from the nullable
components of two regular expressions $r$ and $r'$,
if we can rewrite from one to the other in finitely
many steps.

For convenience, 
we define a predicate for a list of regular expressions
having at least one nullable regular expression:
\begin{center}
	$\textit{bnullables} \; rs \quad \dn \quad \exists r \in rs. \;\; \bnullable \; r$
\end{center}
\noindent
The rewriting relation $\rightsquigarrow$ preserves (b)nullability:
\begin{lemma}\label{rewritesBnullable}
	\hspace{0em}
	\begin{itemize}
		\item
			$\text{If} \; r_1 \rightsquigarrow r_2, \; 
			\text{then} \; \bnullable \; r_1 = \bnullable \; r_2$
		\item 	
			$\text{If} \; rs_1 \stackrel{s}{\rightsquigarrow} rs_2, \;
			\text{then} \; \textit{bnullables} \; rs_1 = \textit{bnullables} \; rs_2$
		\item
			$r_1 \rightsquigarrow^* r_2 
			\implies \bnullable \; r_1 = \bnullable \; r_2$
	\end{itemize}
\end{lemma}
\begin{proof}
	By rule induction of $\rightsquigarrow$ and $\stackrel{s}{\rightsquigarrow}$.
	The third point follows from the first by rule induction on $\rightsquigarrow^*$.
\end{proof}
\noindent
For convenience again,
we define $\bmkepss$ on a list $rs$,
which extracts the bit-codes of the first $\bnullable$ element in $rs$:
\begin{center}
	\begin{tabular}{lcl}
		$\bmkepss \; [] $ & $\dn$ & $[]$\\
		$\bmkepss \; (r :: rs)$ & $\dn$ & $\textit{if} \;(\bnullable \; r) \;\; 
		\textit{then} \;\; \bmkeps \; r \; \textit{else} \;\; \bmkepss \; rs$
	\end{tabular}
\end{center}
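\noindent
As a small illustration (a hypothetical list, not taken from the formalisation),
suppose $rs = [r_1, r_2, r_3]$, where $r_1$ is not $\bnullable$ but $r_2$ and $r_3$ are.
Unfolding the definition gives
\begin{center}
	\begin{tabular}{lcl}
		$\bmkepss \; [r_1, r_2, r_3]$ & $=$ & $\bmkepss \; [r_2, r_3]$\\
		& $=$ & $\bmkeps \; r_2$
	\end{tabular}
\end{center}
\noindent
so the bitcodes of $r_3$ are never inspected.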
\noindent
If both regular expressions in a rewriting relation are nullable, then they 
produce the same bitcodes:
\begin{lemma}\label{rewriteBmkepsAux}
	\hspace{0em}
	\begin{itemize}
		\item
			$r_1 \rightsquigarrow r_2 \implies 
			(\bnullable \; r_1 \land \bnullable \; r_2 \implies \bmkeps \; r_1 = 
			\bmkeps \; r_2)$ 
		\item
			$rs_1 \stackrel{s}{\rightsquigarrow} rs_2 
			\implies (\bnullables \; rs_1 \land \bnullables \; rs_2 \implies 
			\bmkepss \; rs_1 = \bmkepss \; rs_2)$
	\end{itemize}
\end{lemma}
\begin{proof}
	By rule induction over the cases that lead to $r_1 \rightsquigarrow r_2$
	and $rs_1 \stackrel{s}{\rightsquigarrow} rs_2$.
\end{proof}
\noindent
With lemma \ref{rewriteBmkepsAux} in place we are ready to prove its
many-step version: 
\begin{lemma}
	$\text{If} \;\; r \stackrel{*}{\rightsquigarrow} r' \;\; \text{and} \;\; \bnullable \; r, \;\;\; \text{then} \;\; \bmkeps \; r = \bmkeps \; r'$
\end{lemma}
\begin{proof}
	By rule induction of $\stackrel{*}{\rightsquigarrow} $. Lemma 
	\ref{rewritesBnullable} gives us that both $r$ and $r'$ are nullable.
	Lemma \ref{rewriteBmkepsAux} then solves the inductive case.
\end{proof}
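\noindent
Spelled out for a rewrite sequence of $n$ steps (the intermediate names $r_1, \ldots, r_{n-1}$ are ours),
if $r \rightsquigarrow r_1 \rightsquigarrow \ldots \rightsquigarrow r_{n-1} \rightsquigarrow r'$ and $\bnullable \; r$,
then lemma \ref{rewritesBnullable} makes every $r_i$ nullable, and applying lemma
\ref{rewriteBmkepsAux} to each single step gives
\begin{center}
	$\bmkeps \; r = \bmkeps \; r_1 = \ldots = \bmkeps \; r_{n-1} = \bmkeps \; r'$.
\end{center}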
\subsubsection{Property 2: $r \stackrel{*}{\rightsquigarrow} \textit{bsimp} \; r$}
Now we get to the key part of the proof, 
which says that our simplification's helper functions 
such as $\distinctBy$ and $\flts$ describe
reducts of $\stackrel{s*}{\rightsquigarrow}$ and 
$\rightsquigarrow^* $.

The first lemma to prove is a more general version of 
$rs_1 \stackrel{s*}{\rightsquigarrow} \distinctBy \; rs_1 \; \phi$:
\begin{lemma}
	$rs_1 @ rs_2 \stackrel{s*}{\rightsquigarrow} 
	(rs_1 @ (\distinctBy \; rs_2 \; \; \rerases \;\; (\map\;\; \rerases \; \; rs_1)))$
\end{lemma}
\noindent
It says that for a list made of two parts $rs_1 @ rs_2$, 
one can throw away the duplicate
elements in $rs_2$, as well as those that have appeared in $rs_1$.
\begin{proof}
	By induction on $rs_2$, where $rs_1$ is allowed to be arbitrary.
\end{proof}
\noindent
Setting $rs_1$ to be empty (and renaming $rs_2$ to $rs_1$),
we get the corollary
\begin{corollary}\label{dBPreserves}
	$rs_1 \stackrel{s*}{\rightsquigarrow} \distinctBy \; rs_1 \; \phi$.
\end{corollary}
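\noindent
To see the instantiation explicitly (a sketch, in which we identify the empty accumulator
$\map \; \rerases \; []$ with $\phi$), take the first list of the general lemma to be $[]$
and write $rs$ for the second:
\begin{center}
	\begin{tabular}{lcl}
		$[] @ rs$ & $\stackrel{s*}{\rightsquigarrow}$ & $[] @ (\distinctBy \; rs \; \rerases \; (\map \; \rerases \; []))$\\
		$rs$ & $\stackrel{s*}{\rightsquigarrow}$ & $\distinctBy \; rs \; \rerases \; \phi$
	\end{tabular}
\end{center}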
\noindent
Similarly, the flatten function $\flts$ describes a reduct of
$\stackrel{s*}{\rightsquigarrow}$:
\begin{lemma}\label{fltsPreserves}
	$rs \stackrel{s*}{\rightsquigarrow} \flts \; rs$
\end{lemma}
\begin{proof}
	By an induction on $rs$.
\end{proof}
\noindent
The function $\bsimpalts$ preserves rewritability:
\begin{lemma}\label{bsimpaltsPreserves}
	$_{bs} \sum rs \stackrel{*}{\rightsquigarrow} \bsimpalts \; bs \; rs$
\end{lemma}
\noindent
The simplification function
$\textit{bsimp}$ only transforms the regular expression using steps specified by 
$\rightsquigarrow^*$ and nothing else:
\begin{lemma}
	$r \stackrel{*}{\rightsquigarrow} \textit{bsimp} \; r$
\end{lemma}
\begin{proof}
	By an induction on $r$.
	The most involved case is the alternative, 
	where we use the induction hypothesis together with lemmas \ref{bsimpaltsPreserves},
	\ref{fltsPreserves} and \ref{dBPreserves} to do a series of rewrites:
	\begin{center}
		\begin{tabular}{lcl}
			$rs$ &  $\stackrel{s*}{\rightsquigarrow}$ & $ \map \; \textit{bsimp} \; rs$\\
			     &  $\stackrel{s*}{\rightsquigarrow}$ & $ \flts \; (\map \; \textit{bsimp} \; rs)$\\
			     &  $\stackrel{s*}{\rightsquigarrow}$ & $ \distinctBy \; 
			(\flts \; (\map \; \textit{bsimp}\; rs)) \; \rerases \; \phi$\\
		\end{tabular}
	\end{center}
	Using this we can derive the following rewrite sequence:
	\begin{center}
		\begin{tabular}{lcl}
			$r$ & $=$ & $_{bs}\sum rs$\\[1.5ex]
			    & $\rightsquigarrow^*$ & $\bsimpalts \; bs \; rs$ \\[1.5ex]
			    & $\rightsquigarrow^*$ & $\ldots$ \\[1.5ex]
			    & $\rightsquigarrow^*$ & $\bsimpalts \; bs \; 
			    (\distinctBy \; (\flts \; (\map \; \textit{bsimp}\; rs)) 
			    \; \rerases \; \phi)$\\[1.5ex]
			    %& $\rightsquigarrow^*$ & $ _{bs} \sum (\distinctBy \; 
				%(\flts \; (\map \; \textit{bsimp}\; rs)) \; \;
				%\rerases \; \;\phi) $\\[1.5ex]
			    & $\rightsquigarrow^*$ & $\textit{bsimp} \; r$\\[1.5ex]
		\end{tabular}
	\end{center}	
\end{proof}
\subsubsection{Property 3: $r_1 \stackrel{*}{\rightsquigarrow}  r_2 \implies r_1 \backslash c \stackrel{*}{\rightsquigarrow} r_2 \backslash c$}
The rewrite relation 
$\rightsquigarrow$ changes into $\stackrel{*}{\rightsquigarrow}$
after derivatives are taken on both sides:
\begin{lemma}\label{rewriteBder}
	\hspace{0em}
	\begin{itemize}
		\item
			If $r_1 \rightsquigarrow r_2$, then $r_1 \backslash c 
			\rightsquigarrow^*  r_2 \backslash c$ 
		\item	
			If $rs_1 \stackrel{s}{\rightsquigarrow} rs_2$, then $ 
			\map \; (\_\backslash c) \; rs_1 
			\stackrel{s*}{\rightsquigarrow} \map \; (\_ \backslash c) \; rs_2$
	\end{itemize}
\end{lemma}
\begin{proof}
	By induction on $\rightsquigarrow$ 
	and $\stackrel{s}{\rightsquigarrow}$, using a number of the previous lemmas.
\end{proof}
\noindent
Now we can prove property 3 as an immediate corollary:
\begin{corollary}\label{rewritesBder}
	$r_1 \rightsquigarrow^* r_2 \implies r_1 \backslash c \rightsquigarrow^*   
	r_2 \backslash c$
\end{corollary}
\begin{proof}
	By rule induction of $\stackrel{*}{\rightsquigarrow}$ and lemma \ref{rewriteBder}.
\end{proof}
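\noindent
The inductive step can be sketched as follows, assuming that $\rightsquigarrow^*$ is
generated by reflexivity and by extending a rewrite sequence with one more
$\rightsquigarrow$ step (the name $r_3$ is ours).
Given $r_1 \rightsquigarrow^* r_2$ and $r_2 \rightsquigarrow r_3$, we have
\begin{center}
	\begin{tabular}{lcll}
		$r_1 \backslash c$ & $\rightsquigarrow^*$ & $r_2 \backslash c$ & (induction hypothesis)\\
		& $\rightsquigarrow^*$ & $r_3 \backslash c$ & (lemma \ref{rewriteBder})
	\end{tabular}
\end{center}
\noindent
and the two sequences compose because $\rightsquigarrow^*$ is transitive.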
\noindent
Corollary \ref{rewritesBder} can be extended to strings and combined with $r \rightsquigarrow^* \textit{bsimp} \; r$
to obtain the correspondence between
$\blexer$ and $\blexersimp$'s intermediate
derivative regular expressions:
\begin{lemma}\label{bderBderssimp}
	$a \backslash s \rightsquigarrow^* \bderssimp{a}{s} $
\end{lemma}
\begin{proof}
	By an induction on $s$.
\end{proof}
\subsection{Main Theorem}
With lemma \ref{bderBderssimp} in place we are ready for the main theorem.
\begin{theorem}
	$\blexer \; r \; s = \blexersimp \; r \; s$
\end{theorem}
\begin{proof}
	Writing $a$ for the annotated version of $r$, we can rewrite in many steps from the 
	derivative regular expression of the original lexer to that of the 
	lexer with simplification applied (by lemma \ref{bderBderssimp}):
	\begin{center}
		$a \backslash s \rightsquigarrow^* \bderssimp{a}{s} $.
	\end{center}
	We know that they generate the same bits, if the lexing result is a match:
	\begin{center}
		$\bnullable \; (a \backslash s) 
		\implies \bmkeps \; (a \backslash s) = \bmkeps \; (\bderssimp{a}{s})$
	\end{center}
	Since they generate the same bits, they also give the same value after decoding:
	\begin{center}
		$\bnullable \; (a \backslash s) 
		\implies \decode \; r \; (\bmkeps \; (a \backslash s)) = 
		\decode \; r \; (\bmkeps \; (\bderssimp{a}{s}))$
	\end{center}
	If the lexing result is not a match, then by lemma \ref{rewritesBnullable} neither 
	derivative is nullable and both lexers return $\None$.
	Together this gives our proof goal:
	\begin{center}
		$\blexer \; r \; s = \blexersimp \; r \; s$.
	\end{center}	
\end{proof}
3e1b699696b6 thesis chap5
Chengsong
parents: 543
diff changeset
  1456
\noindent
3e1b699696b6 thesis chap5
Chengsong
parents: 543
diff changeset
  1457
As a corollary,
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 601
diff changeset
  1458
we can link this result with the lemma we proved earlier that 
576
3e1b699696b6 thesis chap5
Chengsong
parents: 543
diff changeset
  1459
\begin{center}
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 601
diff changeset
  1460
	$(r, s) \rightarrow v \;\; \textit{iff}\;\; \blexer \; r \; s = \Some \;v$\\
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 601
diff changeset
  1461
	$\nexists v. \; (r, s) \rightarrow v \;\; \textit{iff} \;\; \blexer\;
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 601
diff changeset
  1462
	r\;s = \None$.
576
3e1b699696b6 thesis chap5
Chengsong
parents: 543
diff changeset
  1463
\end{center}
639
80cc6dc4c98b until chap 7
Chengsong
parents: 624
diff changeset
  1464
and obtain the property that the bit-coded lexer with simplification is
80cc6dc4c98b until chap 7
Chengsong
parents: 624
diff changeset
  1465
indeed correctly generating a POSIX lexing result, if such a result exists.
576
3e1b699696b6 thesis chap5
Chengsong
parents: 543
diff changeset
  1466
\begin{corollary}
	$(r, s) \rightarrow v \;\; \textit{iff} \;\; \blexersimp \; r\; s = \Some \; v$\\
	$\nexists v. \; (r, s) \rightarrow v \;\; \textit{iff} \;\; \blexersimp\;
	r\;s = \None$.
\end{corollary}
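\noindent
Spelled out, and only as a sketch of the reasoning rather than a replacement for the formal proof,
the first statement is obtained from the chain
\begin{center}
	$(r, s) \rightarrow v \;\; \textit{iff} \;\; \blexer \; r \; s = \Some \; v
	\;\; \textit{iff} \;\; \blexersimp \; r \; s = \Some \; v$,
\end{center}
\noindent
where the first step is the earlier correctness lemma for $\blexer$ and the second step
rewrites with the equation $\blexer \; r \; s = \blexersimp \; r \; s$.
The second statement follows in the same way with $\None$ in place of $\Some \; v$.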
\subsection{Comments on the Proof Techniques Used}
Straightforward as the proof may seem,
the effort we spent on obtaining it was far from trivial.
We initially attempted to re-use the argument
in \cref{flex_retrieve}.
The problem is that both functions $\inj$ and $\retrieve$ require
that the annotated regular expressions stay unsimplified,
so that one can
correctly compare $v_{i+1}$, $r_i$ and $v_i$
in diagram \ref{graph:inj}.

We also tried to prove
\begin{center}
$\textit{bsimp} \;\; (\bderssimp{a}{s}) =
\textit{bsimp} \;\;  (a\backslash s)$,
\end{center}
but this turns out not to hold.
A counterexample is
\[ a = [(_{Z}\ONE+_{S}c)\cdot [bb \cdot (_{Z}\ONE+_{S}c)]] \;\;
	\text{and} \;\; s = bb.
\]
\noindent
Then we would have
\begin{center}
	$\textit{bsimp}\;\; ( a \backslash s )$ =
	$_{[]}(_{ZZ}\ONE +  _{ZS}c ) $
\end{center}
\noindent
whereas
\begin{center}
	$\textit{bsimp} \;\;( \bderssimp{a}{s} )$ =
	$_{Z}(_{Z} \ONE + _{S} c)$.
\end{center}
Unfortunately,
whenever we apply $\textit{bsimp}$ differently,
we end up with a discrepancy of this kind.
This is due to
the $\map \; (\fuse\; bs) \; as$ operation
happening at different locations in the regular expression.

The rewriting relation
$\rightsquigarrow^*$
allows us to ignore this discrepancy
and view the expressions
\begin{center}
	$_{[]}(_{ZZ}\ONE +  _{ZS}c ) $\\
	and\\
	$_{Z}(_{Z} \ONE + _{S} c)$
\end{center}
as equal, because both were rewritten
from the same expression.
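\noindent
To make the difference between the two expressions concrete, recall that
$\fuse$ prepends a bit sequence to the bitcodes of an annotated regular expression
(we assume the definition given earlier in this thesis, and write $[Z]$ for the
one-element bit sequence purely for illustration).
Pushing the outer $Z$ of the second expression into its alternatives then gives
\begin{center}
	$\map \; (\fuse \; [Z]) \; [_{Z}\ONE, \; _{S}c] = [_{ZZ}\ONE, \; _{ZS}c]$,
\end{center}
\noindent
which is exactly the list of alternatives of the first expression.
Both expressions therefore carry the same bits along every path from the root
to a leaf, namely $ZZ$ for $\ONE$ and $ZS$ for $c$, and differ only in where
these bits are stored.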

The simplification rewriting rules
given in \ref{rrewriteRules} are by no means
final:
one could come up with new rules
such as
$\SEQ r_1 \cdot (\SEQ r_2 \cdot r_3) \rightarrow
\SEQs [r_1, r_2, r_3]$.
Such a rule does not fit the proof technique
of our main theorem, although it does not seem to violate the POSIX
property.

Having established the correctness of our
$\blexersimp$, in the next chapter we shall prove that with our $\simp$ function,
for a given $r$, the size of its derivatives is always
bounded by a constant.