ChengsongTanPhdThesis/Chapters/Bitcoded2.tex
author Chengsong
Sat, 08 Jul 2023 01:36:08 +0100
changeset 655 d8f82c690b32
parent 654 2ad20ba5b178
child 656 753a3b0ee02b
permissions -rwxr-xr-x
updated 4.2 diagram
% Chapter Template

% Main chapter title
\chapter{Correctness of Bit-coded Algorithm with Simplification}

\label{Bitcoded2} % Change X to a consecutive number; for referencing this chapter elsewhere, use \ref{ChapterX}
%Then we illustrate how the algorithm without bitcodes falls short for such aggressive 
%simplifications and therefore introduce our version of the bitcoded algorithm and 
%its correctness proof in 
%Chapter 3\ref{Chapter3}. 
\section{Overview}
\marginpar{\em Added a completely new \\overview section, \\highlighting contributions.}

This chapter is the point from which the novel contributions of this PhD
project are introduced in detail.
The material in the previous chapters is necessary for this thesis,
because it provides the context for why we need a new framework for
the proof of $\blexersimp$.
%material for setting the scene of the formal proof we
%are about to describe.
The fundamental reason is that we cannot extend the correctness proof of theorem \ref{blexerCorrect},
because lemma \ref{retrieveStepwise} no longer holds when simplifications are involved.
\marginpar{\em rephrased things \\so why new \\proof makes sense.}
%The proof details are necessary materials for this thesis
%because it provides necessary context to explain why we need a
%new framework for the proof of $\blexersimp$, which involves
%simplifications that cause structural changes to the regular expression.
%A new formal proof of the correctness of $\blexersimp$, where the 
%proof of $\blexer$
%is not applicatble in the sense that we cannot straightforwardly extend the
%proof of theorem \ref{blexerCorrect} because lemma \ref{retrieveStepwise} does
%not hold anymore.
%This is because the structural induction on the stepwise correctness
%of $\inj$ breaks due to each pair of $r_i$ and $v_i$ described
%in chapter \ref{Inj} and \ref{Bitcoded1} no longer correspond to
%each other.
%In this chapter we introduce simplifications
%for annotated regular expressions that can be applied to 
%each intermediate derivative result. This allows
%us to make $\blexer$ much more efficient.
%Sulzmann and Lu already introduced some simplifications for bitcoded regular expressions,
%but their simplification functions could have been more efficient and in some cases needed fixing.

In particular, the correctness theorem
of the un-optimised bit-coded lexer $\blexer$ in
chapter \ref{Bitcoded1}, formalised by Ausaf et al.,
relies crucially on lemma \ref{retrieveStepwise}, which says that
any value can be retrieved in a stepwise manner, namely:
\begin{equation}\label{eq:stepwise}%eqref: this proposition needs to be referred
	\vdash v : (r\backslash c) \implies \retrieve \; (r \backslash c)  \;  v= \retrieve \; r \; (\inj \; r\; c\; v)
\end{equation}
%This no longer holds once we introduce simplifications.
Simplifications are necessary to control the size of derivatives,
but they also destroy the structure of the regular expressions
to such an extent that property (\ref{eq:stepwise}) no longer holds.

We want to prove the correctness of $\blexersimp$, which integrates
$\textit{bsimp}$ by applying it after each call to the derivative:
\begin{center}
\begin{tabular}{lcl}
	$r \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(\textit{bsimp} \; (r \backslash\, c)) \backslash_{bsimps}\, s$ \\
$r \backslash_{bsimps} [\,] $ & $\dn$ & $r$
\end{tabular}
\begin{tabular}{lcl}
  $\textit{blexer\_simp}\;r\,s$ & $\dn$ &
      $\textit{let}\;a = (r^\uparrow)\backslash_{bsimps}\, s\;\textit{in}$\\
  & & $\;\;\textit{if}\; \textit{bnullable}(a)$\\
  & & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\
  & & $\;\;\textit{else}\;\textit{None}$
\end{tabular}
\end{center}
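\noindent
To make the first definition concrete, the following sketch unfolds
$\backslash_{bsimps}$ on a two-character string. The character names $c_1$ and $c_2$
are only illustrative; each step is just one application of the two defining clauses above.
\begin{center}
\begin{tabular}{lcl}
	$r \backslash_{bsimps} (c_1\!::\!c_2\!::\![\,])$ & $=$ & $(\textit{bsimp} \; (r \backslash\, c_1)) \backslash_{bsimps}\, (c_2\!::\![\,])$ \\
	& $=$ & $(\textit{bsimp} \; ((\textit{bsimp} \; (r \backslash\, c_1)) \backslash\, c_2)) \backslash_{bsimps}\, [\,]$ \\
	& $=$ & $\textit{bsimp} \; ((\textit{bsimp} \; (r \backslash\, c_1)) \backslash\, c_2)$
\end{tabular}
\end{center}
\noindent
In other words, every derivative step is immediately followed by one simplification step,
so no intermediate regular expression is ever left unsimplified.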
\noindent
Previously, without $\textit{bsimp}$, the exact structure of each intermediate
regular expression was preserved, allowing pairs of inhabitation relations of the form $\vdash v : r_{c} $ and
$\vdash v_{r}^{c} : r $ to hold in lemma \ref{retrieveStepwise} (if
we use the convenient notation $r_{c} \dn r\backslash c$
and $v_{r}^{c} \dn \inj \;r \; c \; v$),
but $\blexersimp$ introduces simplification after the derivative,
making it difficult to align the pairs:
\begin{center}
	$\vdash v: \textit{bsimp} \; r_{c} \implies \retrieve \; (\textit{bsimp} \; r_c) \; v =\retrieve \; r  \;(\mathord{?} v_{r}^{c}) $
\end{center}
\noindent
It is clear that once $v$ is made to align with $\textit{bsimp} \; r_{c}$
in the inhabitation relation, something different from $v_{r}^{c}$ needs to be plugged
in for the above statement to hold.
Ausaf et al. \cite{AusafUrbanDyckhoff2016}
made some initial attempts with this idea; see \cite{FahadThesis}
for details.

They added
and then rectify it to


This works fine; however, it limits the kind of simplifications that can be introduced.
We cannot use their idea for our very strong simplification rules.
Therefore we take our own route.

The other route is to dispose of lemma \ref{retrieveStepwise}
and prove a weakened inductive invariant instead.
We adopt this approach in this thesis.

Let us first explain why the requirement in $\blexer$'s proof
is too strong, and suggest a few possible fixes, which lead to
our proof, which we believe is the most natural and effective method.



\section{Why Lemma \ref{retrieveStepwise}'s Requirement is too Strong}

%From this chapter we start with the main contribution of this thesis, which

The $\blexer$ proof relies on a lockstep POSIX
correspondence between the lexical value and the
regular expression in each derivative and injection.
If we zoom into diagram \ref{graph:inj} and look specifically at
the pairs $v_i, r_i$ and $v_{i+1},\, r_{i+1}$, we get a diagram demonstrating
the invariant that the same bitcodes can be extracted from the pairs:
\tikzset{three sided/.style={
        draw=none,
        append after command={
            [-,shorten <= -0.5\pgflinewidth]
            ([shift={(-1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east)
        edge([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west) 
            ([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west)
        edge([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)            
            ([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)
        edge([shift={(-1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)
        }
    }
}

\tikzset{three sided1/.style={
        draw=none,
        append after command={
            [-,shorten <= -0.5\pgflinewidth]
            ([shift={(1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west)
        edge([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east) 
            ([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east)
        edge([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)            
            ([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east)
        edge([shift={(1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west)
        }
    }
}

\begin{center}
	\begin{tikzpicture}[->, >=stealth', shorten >= 1pt, auto, thick]
		\node [rectangle ] (1)  at (-7, 2) {$\ldots$};
		\node [rectangle, draw] (2) at  (-4, 2) {$r_i = _{bs'}(_Za+_Saa)^*$};
		\node [rectangle, draw] (3) at  (4, 2) {$r_{i+1} = _{bs'}(_Z(_Z\ONE + _S(\ONE \cdot a)))\cdot(_Za+_Saa)^*$};
		\node [rectangle] (4) at  (9, 2) {$\ldots$};
		\node [rectangle] (5) at  (-7, -2) {$\ldots$};
		\node [rectangle, draw] (6) at  (-4, -2) {$v_i = \Stars \; [\Left (a)]$};
		\node [rectangle, draw] (7) at  ( 4, -2) {$v_{i+1} = \Seq (\Alt (\Left \; \Empty)) \; \Stars \, []$};
		\node [rectangle] (8) at  ( 9, -2) {$\ldots$};
		\node [rectangle] (9) at  (-7, -6) {$\ldots$};
		\node [rectangle, draw] (10) at (-4, -6) {$\textit{bits}_{i} = \retrieve \; r_i\;v_i$};
		\node [rectangle, draw] (11) at (4, -6) {$\textit{bits}_{i+1} = \retrieve \; r_{i+1}\;v_{i+1}$};
		\node [rectangle] (12) at  (9, -6) {$\ldots$};


		\path (1) edge [] node {} (2);
		\path (5) edge [] node {} (6);
		\path (9) edge [] node {} (10);

		\path (11) edge [] node {} (12);
		\path (7) edge [] node {} (8);
		\path (3) edge [] node {} (4);


		\path (2) edge [] node {$\backslash a$} (3);
		\path (2) edge [dashed, <->] node {$\vdash v_i : r_i$} (6);
		\path (3) edge [dashed, <->] node {$\vdash v_{i+1} : r_{i+1}$} (7);
		%\path (6) edge [] node {$\vdash v_i : r_i$} (10);
		%\path (7) edge [dashed, <->] node {$\vdash v_i : r_i$} (11);

		\path (10) edge [dashed, <->] node {$=$} (11);

		\path (7) edge [] node {$\inj \; r_i \; a \; v_{i+1}$} (6);

%		\node [rectangle, draw] (r) at (-6, -1) {$(aa)^*(b+c)$};
%		\node [rectangle, draw] (a) at (-6, 4)	  {$(aa)^*(_{Z}b + _{S}c)$};
%		\path	(r)
%			edge [] node {$\internalise$} (a);
%		\node [rectangle, draw] (a1) at (-3, 1) {$(_{Z}(\ONE \cdot a) \cdot (aa)^*) (_{Z}b + _Sc)$};
%		\path	(a)
%			edge [] node {$\backslash a$} (a1);
%
%		\node [rectangle, draw, three sided] (a21) at (-2.5, 4) {$(_{Z}\ONE \cdot (aa)^*)$};
%		\node [rectangle, draw, three sided1] (a22) at (-0.8, 4) {$(_{Z}b + _{S}c)$};
%		\path	(a1)
%			edge [] node {$\backslash a$} (a21);
%		\node [rectangle, draw] (a3) at (0.5, 2) {$_{ZS}(_{Z}\ONE + \ZERO)$};
%		\path	(a22)
%			edge [] node {$\backslash b$} (a3);
%		\path	(a21)
%			edge [dashed, bend right] node {} (a3);
%		\node [rectangle, draw] (bs) at (2, 4) {$ZSZ$};
%		\path	(a3)
%			edge [below] node {$\bmkeps$} (bs);
%		\node [rectangle, draw] (v) at (3, 0) {$\Seq \; (\Stars\; [\Seq \; a \; a]) \; (\Left \; b)$};
%		\path 	(bs)
%			edge [] node {$\decode$} (v);


	\end{tikzpicture}
	%\caption{$\blexer$ with the regular expression $(aa)^*(b+c)$ and $aab$}
\end{center}
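
\noindent
As a sanity check, the two bit sequences in the diagram can be computed by hand.
This is only a sketch under stated assumptions: it uses the clauses of $\retrieve$
from chapter \ref{Bitcoded1}, assuming in particular that $\retrieve$ emits $Z$ before each
iteration of a star and $S$ at its end; it writes $@$ for list append; and it reads $v_{i+1}$
as the sequence value built from $\Left \; \Empty$ and $\Stars \, []$.
\begin{center}
\begin{tabular}{lcl}
	$\retrieve \; r_i \; v_i$ & $=$ & $bs' \,@\, [Z] \,@\, \retrieve \; (_Za+_Saa) \; (\Left (a)) \,@\, \retrieve \; (_Za+_Saa)^* \; (\Stars \, [])$ \\
	& $=$ & $bs' \,@\, [Z] \,@\, [Z] \,@\, [S]$ \\
	$\retrieve \; r_{i+1} \; v_{i+1}$ & $=$ & $bs' \,@\, \retrieve \; (_Z(_Z\ONE + _S(\ONE \cdot a))) \; (\Left \; \Empty) \,@\, \retrieve \; (_Za+_Saa)^* \; (\Stars \, [])$ \\
	& $=$ & $bs' \,@\, [Z, Z] \,@\, [S]$
\end{tabular}
\end{center}
\noindent
Under these assumptions both pairs yield the bit sequence $bs' \,@\, [Z, Z, S]$.
The $Z$ that $r_i$ contributes for entering a star iteration has moved into the
annotation of the alternative in $r_{i+1}$, which is why the two results agree.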

When simplifications are added, the inhabitation relation no longer holds,
causing the above diagram to break.

Ausaf addressed this with an augmented lexer he called $\textit{slexer}$.

We note that the invariant
$\vdash v_{i+1}: r_{i+1} \implies \retrieve \; r_{i+1} \; v_{i+1} = \retrieve \; r_i \; v_i$ is too strong
to maintain because the precondition $\vdash v_{i+1} : r_{i+1}$ is too weak.
It does not require $v_{i+1}$ to be a POSIX value.

{\color{red} \rule{\linewidth}{0.5mm}}
New content ends
{\color{red} \rule{\linewidth}{0.5mm}}

%
%
%which is essential for getting an understanding this thesis
%in chapter \ref{Bitcoded1}, which is necessary for understanding why
%the proof 
%
%In this chapter,

%We contrast our simplification function 
%with Sulzmann and Lu's, indicating the simplicity of our algorithm.
%This is another case for the usefulness 
%and reliability of formal proofs on algorithms.
%These ``aggressive'' simplifications would not be possible in the injection-based 
%lexing we introduced in chapter \ref{Inj}.
%We then prove the correctness with the improved version of 
%$\blexer$, called $\blexersimp$, by establishing 
%$\blexer \; r \; s= \blexersimp \; r \; s$ using a term rewriting system.
%
\section{Simplifications by Sulzmann and Lu}
The algorithms $\lexer$ and $\blexer$ work beautifully as functional 
programs, but not as practical code. One main reason for the slowness is
the size of the intermediate representations: the derivative regular expressions
tend to grow without bound if the matching involves a large number of possible matches.
Consider the derivatives of the following example $(a^*a^*)^*$:
%and $(a^* + (aa)^*)^*$:
\begin{center}
	\begin{tabular}{lcl}
		$(a^*a^*)^*$ & $ \stackrel{\backslash a}{\longrightarrow}$ & 
		$ (a^*a^* + a^*)\cdot(a^*a^*)^*$\\
			     & 
		$ \stackrel{\backslash a}{\longrightarrow} $ & 
	$((a^*a^* + a^*) + a^*)\cdot(a^*a^*)^* + (a^*a^* + a^*)\cdot(a^*a^*)^*$\\
							     & $\stackrel{\backslash a}{
	\longrightarrow} $ & $\ldots$\\
	\end{tabular}
\end{center}
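\noindent
As a sketch of where the first step comes from, assume the usual derivative
rules $(r^*)\backslash c = (r\backslash c)\cdot r^*$ and
$(r_1\cdot r_2)\backslash c = (r_1\backslash c)\cdot r_2 + r_2\backslash c$ for nullable $r_1$,
ignore bitcodes, and assume that the chain above leaves out the $\ONE$ factors:
\begin{center}
	\begin{tabular}{lcl}
		$a^*\backslash a$ & $=$ & $(a\backslash a)\cdot a^* \;=\; \ONE\cdot a^*$\\
		$(a^*a^*)\backslash a$ & $=$ & $(a^*\backslash a)\cdot a^* + a^*\backslash a
		\;=\; (\ONE\cdot a^*)\cdot a^* + \ONE\cdot a^*$\\
		$(a^*a^*)^*\backslash a$ & $=$ & $((a^*a^*)\backslash a)\cdot(a^*a^*)^*
		\;=\; ((\ONE\cdot a^*)\cdot a^* + \ONE\cdot a^*)\cdot(a^*a^*)^*$
	\end{tabular}
\end{center}
Dropping the $\ONE$ factors gives $(a^*a^* + a^*)\cdot(a^*a^*)^*$, the second entry of the chain.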
\noindent
As the chain of derivatives shows, there are several duplicated subterms.
A simple-minded simplification function cannot simplify
the third regular expression in the above chain of derivative
regular expressions, namely
\begin{center}
$((a^*a^* + a^*) + a^*)\cdot(a^*a^*)^* + (a^*a^* + a^*)\cdot(a^*a^*)^*$
\end{center}
because the duplicates are
not next to each other, and therefore the rule
$r+ r \rightarrow r$ from $\textit{simp}$ does not fire.
One would expect a better simplification function to work in the 
following way:
\begin{gather*}
	((a^*a^* + \underbrace{a^*}_\text{A})+\underbrace{a^*}_\text{duplicate of A})\cdot(a^*a^*)^* + 
	\underbrace{(a^*a^* + a^*)\cdot(a^*a^*)^*}_\text{further simp removes this}.\\
	\bigg\downarrow (1) \\
	(a^*a^* + a^* 
	\color{gray} + a^* \color{black})\cdot(a^*a^*)^* + 
	\underbrace{(a^*a^* + a^*)\cdot(a^*a^*)^*}_\text{further simp removes this} \\
	\bigg\downarrow (2) \\
	(a^*a^* + a^* 
	)\cdot(a^*a^*)^*  
	\color{gray} + (a^*a^* + a^*) \cdot(a^*a^*)^*\\
	\bigg\downarrow (3) \\
	(a^*a^* + a^* 
	)\cdot(a^*a^*)^*  
\end{gather*}
\noindent
In the first step, the nested alternative regular expression
$(a^*a^* + a^*) + a^*$ is flattened into $a^*a^* + a^* + a^*$.
Now the third term $a^*$ can clearly be identified as a duplicate
and is therefore removed in the second step. 
This causes the two
top-level terms to become the same, and the second $(a^*a^*+a^*)\cdot(a^*a^*)^*$ 
is removed in the final step.
Sulzmann and Lu's simplification function (in our notation) aims at this kind of
simplification:
\begin{center}
	\begin{tabular}{lcl}
		$\textit{simp}\_{SL} \; _{bs}(_{bs'}\ONE \cdot r)$ & $\dn$ & 
		$\textit{if} \; (\textit{zeroable} \; r)\; \textit{then} \;\; \ZERO$\\
						   & &$\textit{else}\;\; \fuse \; (bs@ bs') \; r$\\
		$\textit{simp}\_{SL} \;(_{bs}r_1\cdot r_2)$ & $\dn$ & $\textit{if} 
		\; (\textit{zeroable} \; r_1 \; \textit{or} \; \textit{zeroable}\; r_2)\;
		\textit{then} \;\; \ZERO$\\
							    & & $\textit{else}\;\;_{bs}((\textit{simp}\_{SL} \;r_1)\cdot
							    (\textit{simp}\_{SL} \; r_2))$\\
		$\textit{simp}\_{SL}  \; _{bs}\sum []$ & $\dn$ & $\ZERO$\\
		$\textit{simp}\_{SL}  \; _{bs}\sum ((_{bs'}\sum rs_1) :: rs_2)$ & $\dn$ &
		$_{bs}\sum ((\map \; (\fuse \; bs')\; rs_1) @ rs_2)$\\
		$\textit{simp}\_{SL}  \; _{bs}\sum[r]$ & $\dn$ & $\fuse \; bs \; (\textit{simp}\_{SL}  \; r)$\\
		$\textit{simp}\_{SL}  \; _{bs}\sum(r::rs)$ & $\dn$ & $_{bs}\sum 
		(\nub \; (\filter \; (\neg\zeroable)\;((\textit{simp}\_{SL}  \; r) :: \map \; \textit{simp}\_{SL}  \; rs)))$\\ 
	\end{tabular}
\end{center}
\noindent
The $\textit{zeroable}$ predicate 
tests whether the regular expression
is equivalent to $\ZERO$, and
can be defined as:
\begin{center}
	\begin{tabular}{lcl}
		$\zeroable \; _{bs}\sum []$ & $\dn$ & $\textit{true}$\\
		$\zeroable \; _{bs}\sum (r::rs)$ & $\dn$ & $\zeroable \; r\;\; \land \;\;
		\zeroable \;_{[]}\sum\;rs $\\
		$\zeroable\;_{bs}(r_1 \cdot r_2)$ & $\dn$ & $\zeroable\; r_1 \;\; \lor \;\; \zeroable \; r_2$\\
		$\zeroable\;_{bs}r^*$ & $\dn$ & $\textit{false}$ \\
		$\zeroable\;_{bs}c$ & $\dn$ & $\textit{false}$\\
		$\zeroable\;_{bs}\ONE$ & $\dn$ & $\textit{false}$\\
		$\zeroable\;_{bs}\ZERO$ & $\dn$ & $\textit{true}$
	\end{tabular}
\end{center}
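\noindent
As a small worked instance (the bitcodes $bs$ and the regular expression $r$ are arbitrary
and only serve as an illustration; annotations other than the outermost one are left out for readability),
a sequence with a $\ZERO$ factor is zeroable, whereas an
alternative with a character among its elements is not:
\begin{center}
	\begin{tabular}{lcl}
		$\zeroable\;_{bs}(\ZERO \cdot r^*)$ & $=$ & $\zeroable\;\ZERO \;\lor\; \zeroable\;r^*
		\;=\; \textit{true} \lor \textit{false} \;=\; \textit{true}$\\
		$\zeroable\;_{bs}\sum [c, \ZERO]$ & $=$ & $\zeroable\;c \;\land\; \zeroable\;_{[]}\sum [\ZERO]
		\;=\; \textit{false}$
	\end{tabular}
\end{center}
In the second line the first conjunct is already $\textit{false}$, so the value of the second conjunct does not matter.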
\noindent
The 
\begin{center}
	\begin{tabular}{lcl}
		$\textit{simp}\_{SL}  \; _{bs}\sum ((_{bs'}\sum rs_1) :: rs_2)$ & $\dn$ &
		$_{bs}\sum ((\map \; (\fuse \; bs')\; rs_1) @ rs_2)$\\
	\end{tabular}
\end{center}
\noindent
clause does flatten the alternative as required in step (1),
but $\textit{simp}\_{SL}$ is insufficient if we want to do steps (2) and (3),
as these ``identical'' terms have different bit-annotations.
Sulzmann and Lu also suggested that the $\textit{simp}\_{SL} $ function should be
applied repeatedly until a fixpoint is reached.
We call this construction $\textit{SLSimp}$:
\begin{center}
	\begin{tabular}{lcl}
		$\textit{SLSimp} \; r$ & $\dn$ & 
		$\textit{while}((\textit{simp}\_{SL}  \; r)\; \cancel{=} \; r)$ \\
					 & & $\quad r := \textit{simp}\_{SL}  \; r$\\
		& & $\textit{return} \; r$
	\end{tabular}
\end{center}
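\noindent
Read as an equation, and assuming the loop terminates (which is not argued here),
the construction returns the first stable iterate of $\textit{simp}\_{SL}$:
\begin{center}
	$\textit{SLSimp}\; r \;=\; (\textit{simp}\_{SL})^{k}\; r$ \quad where $k$ is the least number with
	$(\textit{simp}\_{SL})^{k+1}\; r = (\textit{simp}\_{SL})^{k}\; r$.
\end{center}
Here $(\textit{simp}\_{SL})^{k}$ is our shorthand for applying $\textit{simp}\_{SL}$ $k$ times.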
We call the operation of alternately 
applying derivatives and simplifications
(until the string is exhausted) the Sulz-simp-derivative,
written $\backslash_{SLSimp}$:
\begin{center}
\begin{tabular}{lcl}
	$r \backslash_{SLSimp} (c\!::\!s) $ & $\dn$ & $(\textit{SLSimp} \; (r \backslash c)) \backslash_{SLSimp}\, s$ \\
$r \backslash_{SLSimp} [\,] $ & $\dn$ & $r$
\end{tabular}
\end{center}
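\noindent
For instance, unfolding the two clauses on a two-character string $[c_1, c_2]$
(the names $c_1$ and $c_2$ are only illustrative) shows how derivative and
simplification steps interleave:
\begin{center}
\begin{tabular}{lcl}
	$r \backslash_{SLSimp} [c_1, c_2]$ & $=$ & $(\textit{SLSimp}\;(r\backslash c_1)) \backslash_{SLSimp}\, [c_2]$\\
	& $=$ & $(\textit{SLSimp}\;((\textit{SLSimp}\;(r\backslash c_1))\backslash c_2)) \backslash_{SLSimp}\, [\,]$\\
	& $=$ & $\textit{SLSimp}\;((\textit{SLSimp}\;(r\backslash c_1))\backslash c_2)$
\end{tabular}
\end{center}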
\noindent
After the derivatives have been taken, the bitcodes
are extracted and decoded in the same manner
as in $\blexer$:
\begin{center}
\begin{tabular}{lcl}
  $\textit{blexer\_SLSimp}\;r\,s$ & $\dn$ &
      $\textit{let}\;a = (r^\uparrow)\backslash_{SLSimp}\, s\;\textit{in}$\\                
  & & $\;\;\textit{if}\; \textit{bnullable}(a)$\\
  & & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\
  & & $\;\;\textit{else}\;\textit{None}$
\end{tabular}
\end{center}
\noindent
We implemented this lexing algorithm in Scala, 
and found that the size of the final derivative regular expression
still grows exponentially (note the logarithmic scale):
\begin{figure}[H]
	\centering
\begin{tikzpicture}
\begin{axis}[
    xlabel={$n$},
    ylabel={size},
    ymode = log,
    legend entries={Final Derivative Size},  
    legend pos=north west,
    legend cell align=left]
\addplot[red,mark=*, mark options={fill=white}] table {SulzmannLuLexer.data};
\end{axis}
\end{tikzpicture} 
\caption{Size of the final derivative when lexing the regular expression $(a^*a^*)^*$ against strings of the form
$\protect\underbrace{aa\ldots a}_\text{n \textit{a}s}
$ using Sulzmann and Lu's lexer}\label{SulzmannLuLexer}
\end{figure}
\noindent
At $n= 20$ we already get an out-of-memory error with the default
JVM heap size settings.
In fact, their simplification does not improve much over
the simple-minded simplifications we have shown in Figure~\ref{fig:BetterWaterloo}.
The time required also grows exponentially:
\begin{figure}[H]
	\centering
\begin{tikzpicture}
\begin{axis}[
    xlabel={$n$},
    ylabel={time},
    %ymode = log,
    legend entries={time in secs},  
    legend pos=north west,
    legend cell align=left]
\addplot[red,mark=*, mark options={fill=white}] table {SulzmannLuLexerTime.data};
\end{axis}
\end{tikzpicture} 
\caption{Time needed for lexing the regular expression $(a^*a^*)^*$ against strings of the form
$\protect\underbrace{aa\ldots a}_\text{n \textit{a}s}
$ using Sulzmann and Lu's lexer}\label{SulzmannLuLexerTime}
\end{figure}
\noindent
This appears to be a counterexample to
Sulzmann and Lu's linear complexity claim
in their paper \cite{Sulzmann2014}:
\begin{quote}\it
``Linear-Time Complexity Claim \\It is easy to see that each call of one of the functions/operations:
simp, fuse, mkEpsBC and isPhi leads to subcalls whose number is bound by the size of the regular expression involved. We claim that thanks to aggressively applying simp this size remains finite. Hence, we can argue that the above mentioned functions/operations have constant time complexity which implies that we can incrementally compute bit-coded parse trees in linear time in the size of the input.'' 
\end{quote}
\noindent
The assumption that the size of the regular expressions
in the algorithm
would stay below a finite constant is not true, at least not in the
examples we considered.
There are two main reasons for this: (i) Haskell's $\textit{nub}$
function requires identical annotations between two 
annotated regular expressions to qualify as duplicates,
and therefore cannot simplify cases like $_{SZZ}a^*+_{SZS}a^*$
even if both $a^*$ denote the same language, and
(ii) the ``flattening'' only applies to the head of the list
in the 
\begin{center}
	\begin{tabular}{lcl}
		$\textit{simp}\_{SL}  \; _{bs}\sum ((_{bs'}\sum rs_1) :: rs_2)$ & $\dn$ &
		$_{bs}\sum ((\map \; (\fuse \; bs')\; rs_1) @ rs_2)$\\
	\end{tabular}
\end{center}
\noindent
clause, and therefore is not strong enough to simplify all
needed parts of the regular expression. Moreover,
the $\textit{simp}\_{SL}$ function is applied repeatedly
in each derivative step until a fixed point is reached, 
which makes the algorithm even more
unpredictable and inefficient.
%To not get ``caught off guard'' by
%these counterexamples,
%one needs to be more careful when designing the
%simplification function and making claims about them.

\section{Our $\textit{Simp}$ Function}
We will now introduce our own simplification function.
%by making a contrast with $\textit{simp}\_{SL}$.
We also describe
the ideas behind Sulzmann and Lu's $\textit{simp}\_{SL}$
algorithm 
and why it fails to keep the sizes of derivatives finitely bounded. 
In addition, our simplification function will come with a formal
correctness proof.
\subsection{Flattening Nested Alternatives}
The idea behind the clause
\begin{center}
	$\textit{simp}\_{SL}  \; _{bs}\sum ((_{bs'}\sum rs_1) :: rs_2) \quad \dn \quad
	       _{bs}\sum ((\map \; (\fuse \; bs')\; rs_1) @ rs_2)$
\end{center}
is that it allows
duplicate removal of regular expressions at different
``levels'' of alternatives.
For example, this would help with the
following simplification:

\begin{center}
$(a+r)+r \longrightarrow a+r$
\end{center}
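\noindent
In list notation the flattening part of this simplification can be sketched as follows
(the bitcodes $bs$ and $bs'$ are arbitrary and only serve as an illustration):
\begin{center}
	$_{bs}\sum [_{bs'}\sum [a, r],\; r] \;\longrightarrow\;
	_{bs}\sum [\fuse\;bs'\;a,\;\; \fuse\;bs'\;r,\;\; r]$
\end{center}
After this step the two copies of $r$ sit in the same list, but the first one carries the extra bits $bs'$.
Whether a subsequent duplicate-removal step recognises them as equal therefore depends on how it treats the annotations.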
The problem is that only the head element
is ``spilled out''.
It is more desirable
to flatten
the entire list, which opens up possibilities for further simplifications
involving later elements of the list.
Not flattening the rest of the elements also means that
the later de-duplication process 
does not fully remove further duplicates.
For example,
using $\textit{simp}\_{SL}$ we cannot
simplify
\begin{center}
	$((a^* a^*)+\underline{(a^* + a^*)})\cdot (a^*a^*)^*+
((a^*a^*)+a^*)\cdot (a^*a^*)^*$
   514
\end{center}
639
80cc6dc4c98b until chap 7
Chengsong
parents: 624
diff changeset
   515
due to the underlined part not being the head 
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 601
diff changeset
   516
of the alternative.
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 601
diff changeset
   517
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 601
diff changeset
   518
We define our flatten operation so that it flattens 
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 601
diff changeset
   519
the entire list: 
543
b2bea5968b89 thesis_thys
Chengsong
parents: 539
diff changeset
   520
 \begin{center}
  \begin{tabular}{@{}lcl@{}}
  $\textit{flts} \; ((_{bs}\sum \textit{as}) :: \textit{as'})$ & $\dn$ & $(\textit{map} \;
     (\textit{fuse}\;bs)\; \textit{as}) \; @ \; \textit{flts} \; \textit{as'} $ \\
  $\textit{flts} \; (\ZERO :: \textit{as'})$ & $\dn$ & $ \textit{flts} \;  \textit{as'} $ \\
    $\textit{flts} \; (a :: \textit{as'})$ & $\dn$ & $a :: \textit{flts} \; \textit{as'}$ \quad(otherwise) \\
  $\textit{flts} \; []$ & $\dn$ & $[]$
\end{tabular}    
\end{center}  
b2bea5968b89 thesis_thys
Chengsong
parents: 539
diff changeset
   528
\noindent
Our $\flts$ operation 
also throws away $\ZERO$s
as they do not contribute to a lexing result.
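For instance, the clauses of $\textit{flts}$ can be unfolded on a small list.
This is only an illustrative sketch: $a_1$, $a_2$ and $a_3$ are placeholder
annotated regular expressions introduced here, with $a_3$ assumed to be neither
$\ZERO$ nor an alternative, and we take $\textit{flts} \; [\,] = [\,]$ for the empty list.
\begin{center}
\begin{tabular}{@{}lcl@{}}
$\textit{flts} \; [\,{}_{bs}\sum [a_1, a_2],\; \ZERO,\; a_3\,]$
  & $=$ & $[\textit{fuse}\;bs\;a_1,\; \textit{fuse}\;bs\;a_2] \; @ \; \textit{flts} \; [\ZERO,\; a_3]$ \\
  & $=$ & $[\textit{fuse}\;bs\;a_1,\; \textit{fuse}\;bs\;a_2] \; @ \; \textit{flts} \; [a_3]$ \\
  & $=$ & $[\textit{fuse}\;bs\;a_1,\; \textit{fuse}\;bs\;a_2,\; a_3]$
\end{tabular}
\end{center}
The bitcode $bs$ of the opened alternative is not lost: $\textit{fuse}$ moves it
onto each of the children before they are spliced into the surrounding list.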
\subsection{Duplicate Removal}
After flattening, we remove duplicates.
The deduplication function is called $\distinctBy$,
and this is where we make our second improvement over
Sulzmann and Lu's simplification method.
The process goes as follows:
\begin{center}
$rs \stackrel{\textit{flts}}{\longrightarrow} 
rs_{flat} 
\xrightarrow{\distinctBy \; 
rs_{flat} \; \rerases\; \varnothing} rs_{distinct}$
%\stackrel{\distinctBy \; 
%rs_{flat} \; \erase\; \varnothing}{\longrightarrow} \; rs_{distinct}$
\end{center}
where the $\distinctBy$ function is defined as:
\begin{center}
	\begin{tabular}{@{}lcl@{}}
		$\distinctBy \; [] \; f\; acc $ & $ =$ & $ []$\\
		$\distinctBy \; (x :: xs) \; f \; acc$ & $=$ & $\quad \textit{if} (f \; x \in acc)\;\; \textit{then} \;\; \distinctBy \; xs \; f \; acc$\\
						       & & $\quad \textit{else}\;\; x :: (\distinctBy \; xs \; f \; (\{f \; x\} \cup acc))$ 
	\end{tabular}
\end{center}
\noindent
We define the distinct function relative to a mapping $f$ because
we want to eliminate regular expressions that are syntactically the same
but carry different bitcodes.
For example, we can remove the second $a^*a^*$ from
$_{ZSZ}a^*a^* + _{SZZ}a^*a^*$, because it
represents a match with a shorter initial sub-match 
(and is therefore definitely not POSIX),
and would be discarded by $\bmkeps$ later anyway.
\begin{center}
	$_{ZSZ}\underbrace{a^*}_{ZS:\; match \; 1\; times\quad}\underbrace{a^*}_{Z: \;match\; 1 \;times} + 
	_{SZZ}\underbrace{a^*}_{S: \; match \; 0 \; times\quad}\underbrace{a^*}_{ZZ: \; match \; 2 \; times}
	$
\end{center}
%$_{bs1} r_1 + _{bs2} r_2 \text{where} (r_1)_{\downarrow} = (r_2)_{\downarrow}$
Due to the way our algorithm works,
the matches that conform to the POSIX standard 
are always placed further to the left. When we 
traverse the list from left to right,
a later occurrence of a regular expression we have already seen
can therefore not contribute to the POSIX value,
even if it carries a different bitcode.
Such duplicates can be removed.
To achieve this, we use $\rerases$ as the function $f$ in the deduplication
step. The function
$\rerases$ is very similar to $\erase$, except that it preserves the list structure
when erasing the bitcodes of an alternative regular expression,
whereas $\erase$ would turn the alternative back into a binary tree.
Not having to change the structure 
greatly simplifies the finiteness proof in chapter 
\ref{Finite}.
We give the definitions of $\rerases$ here together with
the new datatype used by $\rerases$ (as our plain
regular expression datatype does not allow non-binary alternatives).
For now we can think of 
$\rerases$ as the function $(\_)_\downarrow$ defined in chapter \ref{Bitcoded1}
and $\rrexp$ as plain regular expressions, but having a general list constructor
for alternatives:
\begin{figure}[H]
\begin{center}	
	$\rrexp ::=   \RZERO \mid  \RONE
			 \mid  \RCHAR{c}  
			 \mid  \RSEQ{r_1}{r_2}
			 \mid  \RALTS{rs}
			 \mid \RSTAR{r}        $
\end{center}
\caption{$\rrexp$: plain regular expressions, but with $\sum$ alternative 
constructor}\label{rrexpDef}
\end{figure}
We define the function $\rerases$ as follows:
\begin{center}
\begin{tabular}{lcl}
$\rerase{\ZERO}$ & $\dn$ & $\RZERO$\\
$\rerase{_{bs}\ONE}$ & $\dn$ & $\RONE$\\
	$\rerase{_{bs}\mathbf{c}}$ & $\dn$ & $\RCHAR{c}$\\
$\rerase{_{bs}r_1\cdot r_2}$ & $\dn$ & $\RSEQ{\rerase{r_1}}{\rerase{r_2}}$\\
$\rerase{_{bs}\sum as}$ & $\dn$ & $\RALTS{\map \; \rerase{\_} \; as}$\\
$\rerase{_{bs} a ^*}$ & $\dn$ & $\RSTAR{\rerase{a}}$
\end{tabular}
\end{center}
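\noindent
To illustrate how $\distinctBy$ and $\rerases$ interact, here is a small worked trace.
It is only a sketch: $a_1$ and $a_2$ are placeholder names, introduced here, for two
annotated regular expressions that differ only in their bitcodes, so that
$\rerases \; a_1 = \rerases \; a_2$ (as for the two summands
$_{ZSZ}a^*a^*$ and $_{SZZ}a^*a^*$ discussed earlier).
\begin{center}
\begin{tabular}{@{}lcl@{}}
$\distinctBy \; [a_1, a_2] \; \rerases \; \varnothing$
 & $=$ & $a_1 :: (\distinctBy \; [a_2] \; \rerases \; \{\rerases \; a_1\})$ \\
 & $=$ & $a_1 :: (\distinctBy \; [\,] \; \rerases \; \{\rerases \; a_1\})$ \\
 & $=$ & $[a_1]$
\end{tabular}
\end{center}
The first step takes the \textit{else}-branch because the accumulator is still empty;
the second step takes the \textit{then}-branch because
$\rerases \; a_2 \in \{\rerases \; a_1\}$. Only the leftmost copy survives.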
\subsection{Putting Things Together}
We can now give the definition of our  simplification function:
%that looks somewhat similar to our Scala code is 
\begin{center}
  \begin{tabular}{@{}lcl@{}}
	  $\textit{bsimp} \; (_{bs}a_1\cdot a_2)$ & $\dn$ & $ \textit{bsimp}_{ASEQ} \; bs \;(\textit{bsimp} \; a_1) \; (\textit{bsimp}  \; a_2)  $ \\
	  $\textit{bsimp} \; (_{bs}\sum \textit{as})$ & $\dn$ & $\textit{bsimp}_{ALTS} \; \textit{bs} \; (\textit{distinctBy} \; ( \textit{flts} \; ( \textit{map} \; \textit{bsimp} \; \textit{as})) \; \rerases \; \varnothing) $ \\
   $\textit{bsimp} \; a$ & $\dn$ & $\textit{a} \qquad \textit{otherwise}$   
\end{tabular}    
\end{center}    

\noindent
The simplification function (named $\textit{bsimp}$, with \emph{b} for bit-coded) 
pattern-matches on the regular expression.
When the regular expression is an alternative or a
sequence, it first simplifies the children
recursively and then checks whether one of the children has turned into $\ZERO$ or
$\ONE$, which might trigger further simplification at the current level.
For sequences, the current-level simplifications are handled by the function $\textit{bsimp}_{ASEQ}$,
using rules such as  $\ZERO \cdot r \rightarrow \ZERO$ and $\ONE \cdot r \rightarrow r$.
\begin{center}
	\begin{tabular}{@{}lcl@{}}
		$\textit{bsimp}_{ASEQ} \; bs\; a \; b$ & $\dn$ & $ (a,\; b) \; \textit{match}$\\
   &&$\quad\textit{case} \; (\ZERO, \_) \Rightarrow  \ZERO$ \\
   &&$\quad\textit{case} \; (\_, \ZERO) \Rightarrow  \ZERO$ \\
   &&$\quad\textit{case} \;  (_{bs_1}\ONE, a_2') \Rightarrow  \textit{fuse} \; (bs@bs_1) \;  a_2'$ \\
   &&$\quad\textit{case} \; (a_1', a_2') \Rightarrow   _{bs}a_1' \cdot a_2'$ 
	\end{tabular}
\end{center}
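\noindent
For instance, instantiating the clauses of $\textit{bsimp}_{ASEQ}$ gives the three
equations below. This is only an illustrative sketch: $a$ is a placeholder, introduced
here, for an arbitrary annotated regular expression other than $\ZERO$, and
$\mathbf{c}$ is an arbitrary character.
\begin{center}
\begin{tabular}{@{}lcl@{}}
$\textit{bsimp}_{ASEQ} \; bs \; \ZERO \; a$ & $=$ & $\ZERO$ \\
$\textit{bsimp}_{ASEQ} \; bs \; ({}_{bs_1}\ONE) \; a$ & $=$ & $\textit{fuse} \; (bs \,@\, bs_1) \; a$ \\
$\textit{bsimp}_{ASEQ} \; bs \; ({}_{bs_1}\mathbf{c}) \; a$ & $=$ & ${}_{bs}({}_{bs_1}\mathbf{c}) \cdot a$
\end{tabular}
\end{center}
In the second equation the sequence constructor disappears, but the bits $bs$ and
$bs_1$ are kept by fusing them onto $a$. The order of the clauses matters: the
$\ONE$ clause only applies because $a$ is assumed not to be $\ZERO$.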
\noindent
The most involved part is the $\sum$ clause, where we first call $\flts$ on
the list of simplified children $\textit{map}\; \textit{bsimp}\; \textit{as}$,
and then call $\distinctBy$ on the flattened list. Two elements
$r_1$ and $r_2$ are treated as duplicates by $\distinctBy$ when $\rerases \; r_1 = \rerases\; r_2$.
Finally, $\textit{bsimp}_{ALTS}$ inspects the list $as'$ that results from
$\flts$ and $\distinctBy$: it keeps the current-level constructor $\sum$ 
when $as'$ has two or more elements, and
removes it when $as'$ is empty or a singleton:
\begin{center}
	\begin{tabular}{lcl}
		$\textit{bsimp}_{ALTS} \; bs \; as'$ & $ \dn$ & $ as' \; \textit{match}$\\		
  &&$\quad\textit{case} \; [] \Rightarrow  \ZERO$ \\
   &&$\quad\textit{case} \; a :: [] \Rightarrow  \textit{fuse} \; bs \; a$ \\
   &&$\quad\textit{case} \;  as' \Rightarrow _{bs}\sum \textit{as'}$
	\end{tabular}
\end{center}
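\noindent
Putting the pieces together, the following worked example traces $\textit{bsimp}$ on
the alternative ${}_{bs}\sum [\ZERO,\; {}_{bs_1}\mathbf{c}]$, which has one dead branch.
It is only a sketch: the character $\mathbf{c}$ and the bitcodes $bs$ and $bs_1$
are arbitrary placeholders chosen for illustration.
\begin{center}
\begin{tabular}{@{}lcl@{}}
$\textit{map} \; \textit{bsimp} \; [\ZERO,\; {}_{bs_1}\mathbf{c}]$ & $=$ & $[\ZERO,\; {}_{bs_1}\mathbf{c}]$ \\
$\flts \; [\ZERO,\; {}_{bs_1}\mathbf{c}]$ & $=$ & $[{}_{bs_1}\mathbf{c}]$ \\
$\distinctBy \; [{}_{bs_1}\mathbf{c}] \; \rerases \; \varnothing$ & $=$ & $[{}_{bs_1}\mathbf{c}]$ \\
$\textit{bsimp}_{ALTS} \; bs \; [{}_{bs_1}\mathbf{c}]$ & $=$ & $\textit{fuse} \; bs \; ({}_{bs_1}\mathbf{c})$
\end{tabular}
\end{center}
Hence $\textit{bsimp} \; ({}_{bs}\sum [\ZERO,\; {}_{bs_1}\mathbf{c}]) =
\textit{fuse} \; bs \; ({}_{bs_1}\mathbf{c})$: the $\ZERO$ is dropped by $\flts$,
and the now superfluous $\sum$ constructor is removed by the singleton case of
$\textit{bsimp}_{ALTS}$, with its bitcode $bs$ fused onto the remaining child.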
Having defined the $\textit{bsimp}$ function,
we apply it as an additional phase after each derivative step:
\begin{center}
	\begin{tabular}{lcl}
		$r \backslash_{bsimp} c$ & $\dn$ & $\textit{bsimp}(r \backslash c)$
	\end{tabular}
\end{center}
%Following previous notations
%when extending from derivatives w.r.t.~character to derivative
%w.r.t.~string, we define the derivative that nests simplifications 
%with derivatives:%\comment{simp in  the [] case?}
We extend this from characters to strings:
\begin{center}
\begin{tabular}{lcl}
$r \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(r \backslash_{bsimp}\, c) \backslash_{bsimps}\, s$ \\
$r \backslash_{bsimps} [\,] $ & $\dn$ & $r$
\end{tabular}
\end{center}
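\noindent
For a two-character string, unfolding these clauses shows that a simplification
is interleaved with every derivative step (a sketch, where $c_1$ and $c_2$ are
placeholder characters introduced for illustration):
\begin{center}
\begin{tabular}{lcl}
$r \backslash_{bsimps} [c_1, c_2]$ & $=$ & $(r \backslash_{bsimp}\, c_1) \backslash_{bsimps}\, [c_2]$ \\
 & $=$ & $((r \backslash_{bsimp}\, c_1) \backslash_{bsimp}\, c_2) \backslash_{bsimps}\, [\,]$ \\
 & $=$ & $\textit{bsimp}\,(\textit{bsimp}\,(r \backslash c_1) \backslash c_2)$
\end{tabular}
\end{center}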

\noindent
The lexer that extracts bitcodes from the 
derivatives simplified by our $\textit{bsimp}$ function
is called $\blexersimp$:
\begin{center}
\begin{tabular}{lcl}
  $\textit{blexer\_simp}\;r\,s$ & $\dn$ &
      $\textit{let}\;a = (r^\uparrow)\backslash_{bsimps}\, s\;\textit{in}$\\                
  & & $\;\;\textit{if}\; \textit{bnullable}(a)$\\
  & & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\
  & & $\;\;\textit{else}\;\textit{None}$
\end{tabular}
\end{center}
\noindent
This algorithm keeps the regular expression size small, 
as we shall demonstrate with some examples in the next section.

\subsection{Examples $(a+aa)^*$ and $(a^*\cdot a^*)^*$
After Simplification}
Recall the
previous $(a^*a^*)^*$ example,
where $\textit{simp}\_{SL}$ could not
prevent the fast growth (over
3 million nodes for inputs of length just below $20$).
With our simplification, the derivative size is reduced to just 15 and stays constant no matter how long the
input string is.
This is shown in the graphs below.
\begin{figure}[H]
\begin{center}
\begin{tabular}{ll}
\begin{tikzpicture}
\begin{axis}[
    xlabel={$n$},
    ylabel={derivative size},
        width=7cm,
    height=4cm, 
    legend entries={Lexer with $\textit{bsimp}$},  
    legend pos=  south east,
    legend cell align=left]
\addplot[red,mark=*, mark options={fill=white}] table {BitcodedLexer.data};
\end{axis}
\end{tikzpicture} %\label{fig:BitcodedLexer}
&
\begin{tikzpicture}
\begin{axis}[
    xlabel={$n$},
    ylabel={derivative size},
    width = 7cm,
    height = 4cm,
    legend entries={Lexer with $\textit{simp}\_{SL}$},  
    legend pos=  north west,
    legend cell align=left]
\addplot[red,mark=*, mark options={fill=white}] table {BetterWaterloo.data};
\end{axis}
\end{tikzpicture} 
\end{tabular}
\end{center}
\caption{Our improvement over Sulzmann and Lu's simplification in terms of the size of the derivatives.}
\end{figure}
\noindent
Given the size difference, it is not
surprising that our $\blexersimp$ significantly outperforms
$\textit{blexer\_SLSimp}$ by Sulzmann and Lu.
In the next section we are going to establish that our
simplification preserves the correctness of the algorithm.
%----------------------------------------------------------------------------------------
%	SECTION rewrite relation
%----------------------------------------------------------------------------------------
\section{Correctness of $\blexersimp$}
We first introduce the rewriting relation \emph{rrewrite}
($\rrewrite$) between two regular expressions,
which stands for an atomic
simplification.
We then prove properties about
this rewriting relation and its reflexive transitive closure.
Finally, we leverage these properties to show
an equivalence between the results generated by
$\blexer$ and $\blexersimp$.

\subsection{The Rewriting Relation $\rrewrite$($\rightsquigarrow$)}
In the correctness proof of $\blexer$, we
did not directly derive the fact that $\blexer$ generates the POSIX value,
but first proved that $\blexer$ generates the same result as $\lexer$.
We then re-used
the correctness of $\lexer$
to obtain 
\begin{center}
	$(r, s) \rightarrow v \;\; \textit{iff} \;\; \blexer \; r \;s = \Some \; v$\\
	$\nexists v. \; (r, s) \rightarrow v \;\; \textit{iff} \;\; \blexer\;
	r\;s = \None$.
\end{center}
%\begin{center}
%	$(r, s) \rightarrow v \;\; \textit{iff} \;\; \blexer \; r \;s = v$.
%\end{center}
Here we apply this
modularised technique again
by first proving that
$\blexersimp \; r \; s $ 
produces the same output as $\blexer \; r\; s$,
and then combining this with 
the correctness of $\blexer$ to obtain our main
theorem:
\begin{center}
624
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 601
diff changeset
   789
	$(r, s) \rightarrow v \; \;   \textit{iff} \;\;  \blexersimp \; r \; s = \Some \;v$
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 601
diff changeset
   790
	\\
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 601
diff changeset
   791
	$\nexists v. \; (r, s) \rightarrow v \;\; \textit{iff} \;\; \blexersimp\;
8ffa28fce271 all comments incorporated!!+related work
Chengsong
parents: 601
diff changeset
   792
	r\;s = \None$
576
3e1b699696b6 thesis chap5
Chengsong
parents: 543
diff changeset
   793
\end{center}
\noindent
The overall idea for the proof
of $\blexer \;r \;s = \blexersimp \; r \;s$
is that the transition from $r$ to $\textit{bsimp}\; r$ can be
broken down into smaller rewrite steps of the form:
\begin{center}
	$r \rightsquigarrow^* \textit{bsimp} \; r$
\end{center}
where each rewrite step, written $\rightsquigarrow$,
is an ``atomic'' simplification that
is similar to a small-step reduction in operational semantics
(see Figure \ref{rrewriteRules} for the rules):
\begin{figure}[H]
\begin{mathpar}
	\inferrule * [Right = $S\ZERO_l$]{\vspace{0em}}{_{bs} \ZERO \cdot r_2 \rightsquigarrow \ZERO\\}

	\inferrule * [Right = $S\ZERO_r$]{\vspace{0em}}{_{bs} r_1 \cdot \ZERO \rightsquigarrow \ZERO\\}

	\inferrule * [Right = $S_1$]{\vspace{0em}}{_{bs_1} ((_{bs_2} \ONE) \cdot r) \rightsquigarrow \fuse \; (bs_1 @ bs_2) \; r\\}\\

	\inferrule * [Right = $SL$] {\\ r_1 \rightsquigarrow r_2}{_{bs} r_1 \cdot r_3 \rightsquigarrow _{bs} r_2 \cdot r_3\\}

	\inferrule * [Right = $SR$] {\\ r_3 \rightsquigarrow r_4}{_{bs} r_1 \cdot r_3 \rightsquigarrow _{bs} r_1 \cdot r_4\\}\\

	\inferrule * [Right = $A0$] {\vspace{0em}}{ _{bs}\sum [] \rightsquigarrow \ZERO}

	\inferrule * [Right = $A1$] {\vspace{0em}}{ _{bs}\sum [a] \rightsquigarrow \fuse \; bs \; a}

	\inferrule * [Right = $AL$] {\\ rs_1 \stackrel{s}{\rightsquigarrow} rs_2}{_{bs}\sum rs_1 \rightsquigarrow _{bs}\sum rs_2}

	\inferrule * [Right = $LE$] {\vspace{0em}}{ [] \stackrel{s}{\rightsquigarrow} []}

	\inferrule * [Right = $LT$] {rs_1 \stackrel{s}{\rightsquigarrow} rs_2}{ r :: rs_1 \stackrel{s}{\rightsquigarrow} r :: rs_2 }

	\inferrule * [Right = $LH$] {r_1 \rightsquigarrow r_2}{ r_1 :: rs \stackrel{s}{\rightsquigarrow} r_2 :: rs}

	\inferrule * [Right = $L\ZERO$] {\vspace{0em}}{\ZERO :: rs \stackrel{s}{\rightsquigarrow} rs}

	\inferrule * [Right = $LS$] {\vspace{0em}}{(_{bs_1} \sum rs_1) :: rs_b \stackrel{s}{\rightsquigarrow} ((\map \; (\fuse \; bs_1) \; rs_1) @ rs_b) }

	\inferrule * [Right = $LD$] {\\ \rerase{a_1} = \rerase{a_2}}{rs_a @ [a_1] @ rs_b @ [a_2] @ rs_c \stackrel{s}{\rightsquigarrow} rs_a @ [a_1] @ rs_b @ rs_c}

\end{mathpar}
\caption{
The rewrite rules that generate simplified regular expressions
in small steps: $r_1 \rightsquigarrow r_2$ is for bitcoded regular expressions
and $rs_1 \stackrel{s}{\rightsquigarrow} rs_2$ for
lists of bitcoded regular expressions.
Of particular interest is the $LD$ rule, which allows a regular expression
to be removed provided a regular expression
earlier in the list is identical to it once the bitcodes are erased,
and therefore matches the same strings.
}\label{rrewriteRules}
\end{figure}
\noindent
The rules $LT$ and $LH$ relate two lists of regular expressions
in which one regular expression
in the left-hand-side list is rewritable in one step
to the regular expression at the same position in the right-hand-side list.
This helps with defining the ``context rule'' $AL$.
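\noindent
As a small illustration (our own example, not taken from the formalisation),
consider an alternative whose list consists of $\ZERO$ followed by a single
regular expression $a$. Reading $AL$ as lifting a list step
$rs_1 \stackrel{s}{\rightsquigarrow} rs_2$ to the alternatives built from
$rs_1$ and $rs_2$ under the same bitcodes $bs$, two steps suffice to remove
the alternative constructor:
\begin{center}
	\begin{tabular}{lcll}
		$_{bs}\sum [\ZERO, a]$ & $\rightsquigarrow$ & $_{bs}\sum [a]$ & by $L\ZERO$, lifted through $AL$\\
		& $\rightsquigarrow$ & $\fuse \; bs \; a$ & by $A1$
	\end{tabular}
\end{center}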

The reflexive transitive closures of $\rightsquigarrow$ and $\stackrel{s}{\rightsquigarrow}$
are defined in the usual way:
\begin{figure}[H]
	\centering
\begin{mathpar}
	\inferrule{\vspace{0em}}{ r \rightsquigarrow^* r \\}

	\inferrule{\vspace{0em}}{rs \stackrel{s*}{\rightsquigarrow} rs \\}

	\inferrule{r_1 \rightsquigarrow^*  r_2 \land \; r_2 \rightsquigarrow r_3}{r_1 \rightsquigarrow^* r_3\\}

	\inferrule{rs_1 \stackrel{s*}{\rightsquigarrow}  rs_2 \land \; rs_2 \stackrel{s}{\rightsquigarrow} rs_3}{rs_1 \stackrel{s*}{\rightsquigarrow} rs_3}
\end{mathpar}
\caption{The Reflexive Transitive Closure of
$\rightsquigarrow$ and $\stackrel{s}{\rightsquigarrow}$}\label{transClosure}
\end{figure}
%Two rewritable terms will remain rewritable to each other
%even after a derivative is taken:
The main point of our rewriting relation
is that it is preserved under derivatives,
namely
\begin{center}
	$r_1 \rightsquigarrow r_2 \implies (r_1 \backslash c) \rightsquigarrow^* (r_2 \backslash c)$
\end{center}
Moreover, if a nullable term $r$ rewrites to $r'$ in many steps,
then both produce the same bitcodes:
\begin{center}
	$r \rightsquigarrow^* r' \land \bnullable \; r \implies \bmkeps \; r = \bmkeps \; r'$
\end{center}
The decoding phase of $\blexer$ and $\blexersimp$
is the same, which means that if they receive the same
bitcodes before the decoding phase,
they generate the same value after decoding is done.
We will prove the three properties
we mentioned above in the next sub-section.
\subsection{Important Properties of $\rightsquigarrow$}
First we prove some basic facts
about $\rightsquigarrow$, $\stackrel{s}{\rightsquigarrow}$,
$\rightsquigarrow^*$ and $\stackrel{s*}{\rightsquigarrow}$,
which will be needed later.\\
The inference rules (Figure \ref{rrewriteRules}) we
gave in the previous section
have their ``many-step'' versions:

\begin{lemma}\label{squig1}
	\hspace{0em}
	\begin{itemize}
		\item
			$rs_1 \stackrel{s*}{\rightsquigarrow} rs_2 \implies _{bs} \sum rs_1 \; \rightsquigarrow^* \; _{bs} \sum rs_2$
		\item
			$r \rightsquigarrow^* r' \implies _{bs} \sum (r :: rs)\; \rightsquigarrow^*\;  _{bs} \sum (r' :: rs)$

		\item
			Many-step rewriting is compatible
			with the sequence constructor:\\
			$r_1 \rightsquigarrow^* r_2
			\implies _{bs} r_1 \cdot r_3 \rightsquigarrow^* \;
			_{bs} r_2 \cdot r_3 \quad $
			and
			$\quad r_3 \rightsquigarrow^* r_4
			\implies _{bs} r_1 \cdot r_3 \rightsquigarrow^* \; _{bs} r_1 \cdot r_4$
		\item
			The rewriting relations
			$\rightsquigarrow^*$ and $\stackrel{s*}{\rightsquigarrow}$
			are preserved under the function $\fuse$:\\
				$r_1 \rightsquigarrow^* r_2
				\implies \fuse \; bs \; r_1 \rightsquigarrow^* \;
				\fuse \; bs \; r_2 \quad  $ and
				$rs_1 \stackrel{s}{\rightsquigarrow} rs_2
				\implies \map \; (\fuse \; bs) \; rs_1
				\stackrel{s*}{\rightsquigarrow} \map \; (\fuse \; bs) \; rs_2$
	\end{itemize}
\end{lemma}
\begin{proof}
	By induction on
	the inductive cases of $\stackrel{s*}{\rightsquigarrow}$ and $\rightsquigarrow^*$ respectively.
	The third and fourth points follow from
	the properties $r_1 \rightsquigarrow r_2 \implies \fuse \; bs \; r_1 \rightsquigarrow \fuse \; bs \; r_2$ and
	$rs_2 \stackrel{s}{\rightsquigarrow} rs_3
	\implies \map \; (\fuse \; bs) \; rs_2 \stackrel{s*}{\rightsquigarrow} \map \; (\fuse \; bs)\; rs_3$,
	which can be proven by rule induction on $\rightsquigarrow$ and
	$\stackrel{s}{\rightsquigarrow}$.
\end{proof}
\noindent
The inference rules of $\stackrel{s}{\rightsquigarrow}$
are defined in terms of the list cons operation.
We now establish that the
$\stackrel{s}{\rightsquigarrow}$ and $\stackrel{s*}{\rightsquigarrow}$
relations are also preserved under appending and prepending a list.
In addition, we
prove some connections
between $\rightsquigarrow^*$ and $\stackrel{s*}{\rightsquigarrow}$.
\begin{lemma}\label{ssgqTossgs}
	\hspace{0em}
	\begin{itemize}
		\item
			$rs_1 \stackrel{s}{\rightsquigarrow} rs_2 \implies rs @ rs_1 \stackrel{s}{\rightsquigarrow} rs @ rs_2$

		\item
			$rs_1 \stackrel{s*}{\rightsquigarrow} rs_2 \implies
			rs @ rs_1 \stackrel{s*}{\rightsquigarrow} rs @ rs_2 \; \;
			\textit{and} \; \;
			rs_1 @ rs \stackrel{s*}{\rightsquigarrow} rs_2 @ rs$

		\item
			The $\stackrel{s}{\rightsquigarrow} $ relation after appending
			a list becomes $\stackrel{s*}{\rightsquigarrow}$:\\
			$rs_1 \stackrel{s}{\rightsquigarrow} rs_2
			\implies rs_1 @ rs \stackrel{s*}{\rightsquigarrow} rs_2 @ rs$
		\item
			$r_1 \rightsquigarrow^* r_2 \implies [r_1] \stackrel{s*}{\rightsquigarrow} [r_2]$
		\item
			$rs_3 \stackrel{s*}{\rightsquigarrow} rs_4 \land r_1 \rightsquigarrow^* r_2 \implies
			r_1 :: rs_3 \stackrel{s*}{\rightsquigarrow} r_2 :: rs_4$
		\item
			If we can rewrite a regular expression
			in many steps to $\ZERO$, then
			we can also rewrite any sequence that has it as its first component to $\ZERO$:\\
			$r_1 \rightsquigarrow^* \ZERO
			\implies _{bs}r_1\cdot r_2 \rightsquigarrow^* \ZERO$
	\end{itemize}
\end{lemma}
\begin{proof}
	The first part is by induction on the list $rs$.
	The second part is by induction on the inductive cases of $\stackrel{s*}{\rightsquigarrow}$.
	The third part is
	by rule induction on $\stackrel{s}{\rightsquigarrow}$.
	The fourth part is
	by rule induction on
	$\stackrel{s*}{\rightsquigarrow}$ and uses parts one to three.
	The fifth part is a corollary of part four.
	The last part is proven by rule induction again on $\rightsquigarrow^*$.
\end{proof}
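\noindent
To see how these facts combine, the last item of Lemma \ref{ssgqTossgs} can
alternatively be obtained (a sketch, not necessarily the route taken in the
formal proof) by chaining the sequence case of Lemma \ref{squig1},
instantiated with $r_1 \rightsquigarrow^* \ZERO$, with one application of
the rule $S\ZERO_l$:
\begin{center}
	$_{bs} r_1 \cdot r_2 \;\; \rightsquigarrow^* \;\; {}_{bs} \ZERO \cdot r_2 \;\; \rightsquigarrow \;\; \ZERO$
\end{center}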
\noindent
Now we are ready to give the proofs of the following properties:
\begin{itemize}
	\item
		$r \rightsquigarrow^* r'\land \bnullable \; r
		\implies \bmkeps \; r = \bmkeps \; r'$. \\
	\item
		$r \rightsquigarrow^* \textit{bsimp} \;r$.\\
	\item
		$r \rightsquigarrow r' \implies r \backslash c \rightsquigarrow^* r'\backslash c$.\\
\end{itemize}

\subsubsection{Property 1: $r \rightsquigarrow^* r'\land \bnullable \; r
		\implies \bmkeps \; r = \bmkeps \; r'$}
Intuitively, this property says we can
extract the same bitcodes using $\bmkeps$ from the nullable
components of two regular expressions $r$ and $r'$,
if we can rewrite from one to the other in finitely
many steps.

For convenience,
we define a predicate for a list of regular expressions
having at least one nullable regular expression:
\begin{center}
	$\textit{bnullables} \; rs \quad \dn \quad \exists r \in rs. \;\; \bnullable \; r$
\end{center}
\noindent
The rewriting relation $\rightsquigarrow$ preserves (b)nullability:
\begin{lemma}\label{rewritesBnullable}
	\hspace{0em}
	\begin{itemize}
		\item
			$\text{If} \; r_1 \rightsquigarrow r_2, \;
			\text{then} \; \bnullable \; r_1 = \bnullable \; r_2$
		\item
			$\text{If} \; rs_1 \stackrel{s}{\rightsquigarrow} rs_2, \;
			\text{then} \; \textit{bnullables} \; rs_1 = \textit{bnullables} \; rs_2$
		\item
			$r_1 \rightsquigarrow^* r_2
			\implies \bnullable \; r_1 = \bnullable \; r_2$
	\end{itemize}
\end{lemma}
\begin{proof}
	By rule induction on $\rightsquigarrow$ and $\stackrel{s}{\rightsquigarrow}$.
	The third point is a consequence of the first, by rule induction on $\rightsquigarrow^*$.
\end{proof}
\noindent
For convenience again,
we define $\bmkepss$ on a list $rs$,
which extracts the bitcodes from the first $\bnullable$ element in $rs$:
\begin{center}
	\begin{tabular}{lcl}
		$\bmkepss \; [] $ & $\dn$ & $[]$\\
		$\bmkepss \; (r :: rs)$ & $\dn$ & $\textit{if} \;(\bnullable \; r) \;\;
		\textit{then} \;\; \bmkeps \; r \; \textit{else} \;\; \bmkepss \; rs$
	\end{tabular}
\end{center}
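\noindent
For instance (a hypothetical list chosen for illustration, assuming as usual
that $\ZERO$ is not nullable and that $\bmkeps$ applied to $_{bs}\ONE$ returns $bs$),
$\bmkepss$ skips the leading $\ZERO$ and returns the bitcodes of the first
nullable element only:
\begin{center}
	$\bmkepss \; [\ZERO, \; {}_{bs_1}\ONE, \; {}_{bs_2}\ONE] \;=\; \bmkepss \; [{}_{bs_1}\ONE, \; {}_{bs_2}\ONE] \;=\; \bmkeps \; ({}_{bs_1}\ONE) \;=\; bs_1$
\end{center}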
\noindent
If both regular expressions in a rewriting relation are nullable, then they
produce the same bitcodes:
\begin{lemma}\label{rewriteBmkepsAux}
	\hspace{0em}
	\begin{itemize}
		\item
			$r_1 \rightsquigarrow r_2 \implies
			(\bnullable \; r_1 \land \bnullable \; r_2 \implies \bmkeps \; r_1 =
			\bmkeps \; r_2)$
		\item
			$rs_1 \stackrel{s}{\rightsquigarrow} rs_2
			\implies (\bnullables \; rs_1 \land \bnullables \; rs_2 \implies
			\bmkepss \; rs_1 = \bmkepss \; rs_2)$
	\end{itemize}
\end{lemma}
\begin{proof}
	By rule induction over the cases that lead to $r_1 \rightsquigarrow r_2$
	and $rs_1 \stackrel{s}{\rightsquigarrow} rs_2$.
\end{proof}
\noindent
With Lemma \ref{rewriteBmkepsAux} in place we are ready to prove its
many-step version:
\begin{lemma}
	$\text{If} \;\; r \rightsquigarrow^* r' \;\; \text{and} \;\; \bnullable \; r, \;\;\; \text{then} \;\; \bmkeps \; r = \bmkeps \; r'$
\end{lemma}
\begin{proof}
	By rule induction on $\rightsquigarrow^*$.
	Lemma \ref{rewritesBnullable} tells us that both $r$ and $r'$ are nullable,
	and Lemma \ref{rewriteBmkepsAux} then solves the inductive case.
\end{proof}

\subsubsection{Property 2: $r \stackrel{*}{\rightsquigarrow} \textit{bsimp} \; r$}
Now we get to the key part of the proof,
which says that our simplification's helper functions
such as $\distinctBy$ and $\flts$ describe
reducts of $\stackrel{s*}{\rightsquigarrow}$ and
$\rightsquigarrow^* $.
80cc6dc4c98b until chap 7
Chengsong
parents: 624
diff changeset
  1086
586
826af400b068 more chap4
Chengsong
parents: 585
diff changeset
  1087
The first lemma to prove is a more general version of 
$rs_1 \stackrel{s*}{\rightsquigarrow} \distinctBy \; rs_1 \; \rerases \; \phi$:
\begin{lemma}
	$rs_1 @ rs_2 \stackrel{s*}{\rightsquigarrow} 
	(rs_1 @ (\distinctBy \; rs_2 \; \; \rerases \;\; (\map\;\; \rerases \; \; rs_1)))$
\end{lemma}
\noindent
It says that in a list made of two parts $rs_1 @ rs_2$, 
one can throw away the duplicate
elements in $rs_2$, as well as those that have already appeared in $rs_1$.
\begin{proof}
	By induction on $rs_2$, where $rs_1$ is allowed to be arbitrary.
\end{proof}
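\noindent
To illustrate the statement with a small instance (the lists and the bitcodes 
$bs_1, \ldots, bs_4$ are chosen here purely for illustration, and we assume that 
$\distinctBy$ keeps the first occurrence of each element), take
$rs_1 = [_{bs_1}a]$ and $rs_2 = [_{bs_2}a,\; _{bs_3}b,\; _{bs_4}b]$.
Both $_{bs_1}a$ and $_{bs_2}a$ erase to $a$, and both $_{bs_3}b$ and $_{bs_4}b$ erase to $b$, so
\begin{center}
	$\distinctBy \; rs_2 \; \rerases \; (\map \; \rerases \; rs_1) = [_{bs_3}b]$
\end{center}
and the lemma states that
\begin{center}
	$[_{bs_1}a,\; _{bs_2}a,\; _{bs_3}b,\; _{bs_4}b] \stackrel{s*}{\rightsquigarrow} [_{bs_1}a,\; _{bs_3}b]$.
\end{center}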
\noindent
Setting $rs_1$ to be the empty list (and renaming $rs_2$ to $rs_1$),
we get the corollary
\begin{corollary}\label{dBPreserves}
	$rs_1 \stackrel{s*}{\rightsquigarrow} \distinctBy \; rs_1 \; \rerases \; \phi$.
\end{corollary}
\noindent
Similarly, the flatten function $\flts$ describes a reduct of
$\stackrel{s*}{\rightsquigarrow}$ as well:

\begin{lemma}\label{fltsPreserves}
	$rs \stackrel{s*}{\rightsquigarrow} \flts \; rs$
\end{lemma}
\begin{proof}
	By induction on $rs$.
\end{proof}
\noindent
The function $\bsimpalts$ preserves rewritability:
\begin{lemma}\label{bsimpaltsPreserves}
	$_{bs} \sum rs \stackrel{*}{\rightsquigarrow} \bsimpalts \; bs \; rs$
\end{lemma}
\noindent
The simplification function
$\textit{bsimp}$ only transforms the regular expression using steps specified by 
$\rightsquigarrow^*$ and nothing else:
\begin{lemma}
	$r \stackrel{*}{\rightsquigarrow} \textit{bsimp} \; r$
\end{lemma}
\begin{proof}
	By induction on $r$.
	The most involved case is the alternative, 
	where we use the induction hypothesis, Lemma \ref{fltsPreserves}
	and Corollary \ref{dBPreserves} to perform a series of rewriting steps on the list of alternatives:
	\begin{center}
		\begin{tabular}{lcl}
			$rs$ &  $\stackrel{s*}{\rightsquigarrow}$ & $ \map \; \textit{bsimp} \; rs$\\
			     &  $\stackrel{s*}{\rightsquigarrow}$ & $ \flts \; (\map \; \textit{bsimp} \; rs)$\\
			     &  $\stackrel{s*}{\rightsquigarrow}$ & $ \distinctBy \; 
			(\flts \; (\map \; \textit{bsimp}\; rs)) \; \rerases \; \phi$\\
		\end{tabular}
	\end{center}
	Using this and Lemma \ref{bsimpaltsPreserves} we can derive the following rewrite sequence:
	\begin{center}
		\begin{tabular}{lcl}
			$r$ & $=$ & $_{bs}\sum rs$\\[1.5ex]
			    & $\rightsquigarrow^*$ & $\bsimpalts \; bs \; rs$ \\[1.5ex]
			    & $\rightsquigarrow^*$ & $\ldots$ \\ [1.5ex]
			    & $\rightsquigarrow^*$ & $\bsimpalts \; bs \; 
			    (\distinctBy \; (\flts \; (\map \; \textit{bsimp}\; rs)) 
			    \; \rerases \; \phi)$\\[1.5ex]
			    %& $\rightsquigarrow^*$ & $ _{bs} \sum (\distinctBy \; 
				%(\flts \; (\map \; \textit{bsimp}\; rs)) \; \;
				%\rerases \; \;\phi) $\\[1.5ex]
			    & $\rightsquigarrow^*$ & $\textit{bsimp} \; r$\\[1.5ex]
		\end{tabular}
	\end{center}	
\end{proof}
\subsubsection{Property 3: $r_1 \stackrel{*}{\rightsquigarrow}  r_2 \implies r_1 \backslash c \stackrel{*}{\rightsquigarrow} r_2 \backslash c$}
A single rewrite step 
$\rightsquigarrow$ may turn into several steps $\stackrel{*}{\rightsquigarrow}$
once derivatives are taken on both sides:
\begin{lemma}\label{rewriteBder}
	\hspace{0em}
	\begin{itemize}
		\item
			If $r_1 \rightsquigarrow r_2$, then $r_1 \backslash c 
			\rightsquigarrow^*  r_2 \backslash c$.
		\item	
			If $rs_1 \stackrel{s}{\rightsquigarrow} rs_2$, then $ 
			\map \; (\_\backslash c) \; rs_1 
			\stackrel{s*}{\rightsquigarrow} \map \; (\_ \backslash c) \; rs_2$.
	\end{itemize}
\end{lemma}
\begin{proof}
	By simultaneous rule induction on $\rightsquigarrow$ 
	and $\stackrel{s}{\rightsquigarrow}$, using a number of the previous lemmas.
\end{proof}
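\noindent
A small example may help to see why a single step can become several steps.
The following sketch assumes the usual definition of the derivative on sequences
and the rules given in \ref{rrewriteRules}, in particular
$_{bs}(r_1 \cdot \mathbf{0}) \rightsquigarrow \mathbf{0}$; it is an illustration under these
assumptions rather than an excerpt of the formal proof.
If $r_1$ is nullable, the derivative of the left-hand side is an alternative, 
which needs several steps to reach $\mathbf{0} \backslash c = \mathbf{0}$:
\begin{center}
	\begin{tabular}{lcl}
		$(_{bs}(r_1 \cdot \mathbf{0})) \backslash c$ 
		& $=$ & $_{bs}\sum [\,_{[]}((r_1 \backslash c) \cdot \mathbf{0}),\; \fuse \; (\bmkeps \; r_1) \; (\mathbf{0} \backslash c)\,]$\\
		& $=$ & $_{bs}\sum [\,_{[]}((r_1 \backslash c) \cdot \mathbf{0}),\; \mathbf{0}\,]$\\
		& $\rightsquigarrow$ & $_{bs}\sum [\mathbf{0},\; \mathbf{0}]$\\
		& $\rightsquigarrow$ & $_{bs}\sum [\mathbf{0}]$\\
		& $\rightsquigarrow$ & $_{bs}\sum [\,]$\\
		& $\rightsquigarrow$ & $\mathbf{0}$
	\end{tabular}
\end{center}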
\noindent
Now we can prove Property 3 as an immediate corollary:
\begin{corollary}\label{rewritesBder}
	$r_1 \rightsquigarrow^* r_2 \implies r_1 \backslash c \rightsquigarrow^*   
	r_2 \backslash c$
\end{corollary}
\begin{proof}
	By rule induction on $\stackrel{*}{\rightsquigarrow}$ and Lemma \ref{rewriteBder}.
\end{proof}
\noindent
This can be extended to strings and combined with $r \rightsquigarrow^* \textit{bsimp} \; r$
to obtain the correspondence between
the intermediate derivative regular expressions 
of $\blexer$ and $\blexersimp$:
\begin{lemma}\label{bderBderssimp}
	$a \backslash s \rightsquigarrow^* \bderssimp{a}{s} $
\end{lemma}
\begin{proof}
	By induction on $s$.
\end{proof}
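\noindent
The inductive step can be sketched as follows, under the assumption that the induction
proceeds from the end of the string and that 
$\bderssimp{a}{(s @ [c])} = \textit{bsimp} \; ((\bderssimp{a}{s}) \backslash c)$
(this unfolding is an assumption made for illustration):
\begin{center}
	\begin{tabular}{lcll}
		$a \backslash (s @ [c])$ & $=$ & $(a \backslash s) \backslash c$ & \\
		& $\rightsquigarrow^*$ & $(\bderssimp{a}{s}) \backslash c$ & (induction hypothesis and Corollary \ref{rewritesBder})\\
		& $\rightsquigarrow^*$ & $\textit{bsimp} \; ((\bderssimp{a}{s}) \backslash c)$ & (Property 2)\\
		& $=$ & $\bderssimp{a}{(s @ [c])}$ &
	\end{tabular}
\end{center}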
\subsection{Main Theorem}
Now with Lemma \ref{bderBderssimp} in place we are ready for the main theorem.
\begin{theorem}
	$\blexer \; r \; s = \blexersimp \; r \; s$
\end{theorem}
\noindent
\begin{proof}
	By Lemma \ref{bderBderssimp}, the derivative regular expression of the original lexer 
	can be rewritten in many steps to that of the 
	lexer with simplification applied:
	\begin{center}
		$a \backslash s \rightsquigarrow^* \bderssimp{a}{s} $.
	\end{center}
	We know that they generate the same bits, if the lexing result is a match:
	\begin{center}
		$\bnullable \; (a \backslash s) 
		\implies \bmkeps \; (a \backslash s) = \bmkeps \; (\bderssimp{a}{s})$
	\end{center}
	Since they generate the same bits, they also give the same value after decoding:
	\begin{center}
		$\bnullable \; (a \backslash s) 
		\implies \decode \; r \; (\bmkeps \; (a \backslash s)) = 
		\decode \; r \; (\bmkeps \; (\bderssimp{a}{s}))$
	\end{center}
	If $a \backslash s$ is not nullable, then by Lemma \ref{rewritesBnullable} 
	neither is $\bderssimp{a}{s}$, so both lexers return $\None$.
	This is what is required by our proof goal:
	\begin{center}
		$\blexer \; r \; s = \blexersimp \; r \; s$.
	\end{center}	
\end{proof}
\noindent
As a corollary,
we can link this result with the lemma we proved earlier that 
\begin{center}
	$(r, s) \rightarrow v \;\; \textit{iff}\;\; \blexer \; r \; s = \Some \;v$\\
	$\nexists v. \; (r, s) \rightarrow v \;\; \textit{iff} \;\; \blexer\;
	r\;s = \None$.
\end{center}
and obtain the property that the bit-coded lexer with simplification
indeed correctly generates a POSIX lexing result, if such a result exists.
\begin{corollary}
	$(r, s) \rightarrow v \;\; \textit{iff} \;\; \blexersimp \; r\; s = \Some \; v$\\
	$\nexists v. \; (r, s) \rightarrow v \;\; \textit{iff} \;\; \blexersimp\;
	r\;s = \None$.
\end{corollary}

\subsection{Comments on the Proof Techniques Used}
Straightforward as the proof may seem,
the effort we spent obtaining it was far from trivial.
We initially attempted to re-use the argument 
in \cref{flex_retrieve}. 
The problem is that both functions $\inj$ and $\retrieve$ require 
that the annotated regular expressions stay unsimplified, 
so that one can 
correctly compare $v_{i+1}$, $r_i$ and $v_i$ 
in diagram \ref{graph:inj}.

We also tried to prove 
\begin{center}
$\textit{bsimp} \;\; (\bderssimp{a}{s}) = 
\textit{bsimp} \;\;  (a\backslash s)$,
\end{center}
but this turns out not to be true.
A counterexample is
\[ a = [(_{Z}\ONE+_{S}c)\cdot [bb \cdot (_{Z}\ONE+_{S}c)]] \;\; 
	\text{and} \;\; s = bb.
\]
\noindent
Then we would have 
\begin{center}
	$\textit{bsimp}\;\; ( a \backslash s ) = {}
	_{[]}(_{ZZ}\ONE +  _{ZS}c ) $
\end{center}
\noindent
whereas 
\begin{center}
	$\textit{bsimp} \;\;( \bderssimp{a}{s} ) = {}
	_{Z}(_{Z} \ONE + _{S} c)$.
\end{center}
Unfortunately, 
even if we apply $\textit{bsimp}$ differently,
we will always have this kind of discrepancy. 
This is due to 
the $\map \; (\fuse\; bs) \; as$ operation 
happening at different locations in the regular expression.

The rewriting relation 
$\rightsquigarrow^*$ 
allows us to ignore this discrepancy
and view the expressions 
\begin{center}
	$_{[]}(_{ZZ}\ONE +  _{ZS}c ) $\\
	and\\
	$_{Z}(_{Z} \ONE + _{S} c)$
\end{center}
as equivalent, because both were rewritten
from the same expression.

The simplification rewriting rules
given in \ref{rrewriteRules} are by no means
final;
one could come up with new rules
such as 
$\SEQ r_1 \cdot (\SEQ r_2 \cdot r_3) \rightarrow
\SEQs [r_1, r_2, r_3]$.
Such a rule does not fit the proof technique
of our main theorem, although it does not seem to violate the POSIX
property.

Having established the correctness of our
$\blexersimp$, in the next chapter we shall prove that with our $\simp$ function,
for a given $r$, the size of its derivatives is always
bounded by a constant.