ChengsongTanPhdThesis/Chapters/Cubic.tex
author Chengsong
Mon, 07 Nov 2022 21:31:07 +0000
changeset 620 ae6010c14e49
parent 613 b0f0d884a547
child 621 17c7611fb0a9
permissions -rwxr-xr-x
chap6 almost done
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
532
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
     1
% Chapter Template
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
     2
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
     3
\chapter{A Better Bound and Other Extensions} % Main chapter title
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
     4
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
     5
\label{Cubic} %In Chapter 5\ref{Chapter5} we discuss stronger simplifications to improve the finite bound
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
     6
%in Chapter 4 to a polynomial one, and demonstrate how one can extend the
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
     7
%algorithm to include constructs such as bounded repetitions and negations.
590
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
     8
\lstset{style=myScalastyle}
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
     9
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    10
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    11
This chapter is a ``miscellaneous''
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    12
chapter which records various
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    13
extensions to our $\blexersimp$'s formalisations.\\
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    14
Firstly we present further improvements
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    15
made to our lexer algorithm $\blexersimp$.
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    16
We devise a stronger simplification algorithm,
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    17
called $\bsimpStrong$, which can prune away
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    18
similar components in two regular expressions at the same 
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    19
alternative level,
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    20
even if these regular expressions are not exactly the same.
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    21
We call the lexer that uses this stronger simplification function
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    22
$\blexerStrong$.
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    23
We conjecture that both
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    24
\begin{center}
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    25
	$\blexerStrong \;r \; s = \blexer\; r\;s$
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    26
\end{center}
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    27
and
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    28
\begin{center}
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    29
	$\llbracket \bdersStrong{a}{s} \rrbracket = O(\llbracket a \rrbracket^3)$
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    30
\end{center}
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    31
hold, but formalising
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    32
them is still work in progress.
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    33
We give reasons why the correctness and cubic size bound proofs
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    34
can be achieved,
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    35
by exploring the connection between the internal
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    36
data structure of our $\blexerStrong$ and
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    37
Animirov's partial derivatives.\\
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    38
Secondly, we extend our $\blexersimp$
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    39
to support bounded repetitions ($r^{\{n\}}$).
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    40
We update our formalisation of 
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    41
the correctness and finiteness properties to
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    42
include this new construct. With bounded repetitions
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    43
we are able to out-compete other verified lexers such as
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    44
Verbatim++ on regular expressions which involve a lot of
620
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
    45
repetitions. %We also present the idempotency property proof
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
    46
%of $\bsimp$, which leverages the idempotency proof of $\rsimp$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
    47
%This reinforces our claim that the fixpoint construction
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
    48
%originally required by Sulzmann and Lu can be removed in $\blexersimp$.
590
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    49
\\
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    50
Last but not least, we present our efforts and challenges we met
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    51
in further improving the algorithm by data
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    52
structures such as zippers.
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    53
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    54
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    55
532
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
    56
%----------------------------------------------------------------------------------------
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
    57
%	SECTION strongsimp
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
    58
%----------------------------------------------------------------------------------------
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
    59
\section{A Stronger Version of Simplification}
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
    60
%TODO: search for isabelle proofs of algorithms that check equivalence 
590
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
    61
In our bitcoded lexing algorithm, (sub)terms represent (sub)matches.
591
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    62
For example, the regular expression 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    63
\[
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    64
	aa \cdot a^*+ a \cdot a^* + aa\cdot a^*
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    65
\]
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    66
contains three terms, 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    67
expressing three possibilities it will match future input.
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    68
The first and the third terms are identical, which means we can eliminate
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    69
the latter as we know it will not be picked up by $\bmkeps$.
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    70
In $\bsimps$, the $\distinctBy$ function takes care of this.
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    71
The criteria $\distinctBy$ uses for removing a duplicate
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    72
$a_2$ in the list
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    73
\begin{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    74
	$rs_a@[a_1]@rs_b@[a_2]@rs_c$
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    75
\end{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    76
is that 
533
Chengsong
parents: 532
diff changeset
    77
\begin{center}
591
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    78
	$\rerase{a_1} = \rerase{a_2}$.
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    79
\end{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    80
It can be characterised as the $LD$ 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    81
rewrite rule in \ref{rrewriteRules}.\\
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    82
The problem , however, is that identical components
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    83
in two slightly different regular expressions cannot be removed:
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    84
\begin{figure}[H]
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    85
\[
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    86
	(a+b+d) \cdot r_1 + (a+c+e) \cdot r_1 \stackrel{?}{\rightsquigarrow} (a+b+d) \cdot r_1 + (c+e) \cdot r_1
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    87
\]
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    88
\caption{Desired simplification, but not done in $\blexersimp$}\label{partialDedup}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    89
\end{figure}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    90
\noindent
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    91
A simplification like this actually
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    92
cannot be omitted,
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    93
as without it the size could blow up even with our $\simp$ function:
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    94
\begin{figure}[H]
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    95
\centering
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    96
\begin{tikzpicture}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    97
\begin{axis}[
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    98
    %xlabel={$n$},
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
    99
    myplotstyle,
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   100
    xlabel={input length},
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   101
    ylabel={size},
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   102
    ]
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   103
\addplot[blue,mark=*, mark options={fill=white}] table {bsimpExponential.data};
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   104
\end{axis}
533
Chengsong
parents: 532
diff changeset
   105
\end{tikzpicture}
591
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   106
\caption{Runtime of $\blexersimp$ for matching 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   107
	$\protect((a^* + (aa)^* + \ldots + (aaaaa)^* )^*)^*$ 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   108
	with strings 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   109
	of the form $\protect\underbrace{aa..a}_{n}$.}\label{blexerExp}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   110
\end{figure}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   111
\noindent
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   112
We would like to apply the rewriting at some stage
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   113
\begin{figure}[H]
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   114
\[
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   115
	(a+b+d) \cdot r_1  \longrightarrow a \cdot r_1 + b \cdot r_1 + d \cdot r_1
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   116
\]
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   117
\caption{Desired simplification, but not done in $\blexersimp$}\label{desiredSimp}
533
Chengsong
parents: 532
diff changeset
   118
\end{figure}
Chengsong
parents: 532
diff changeset
   119
\noindent
591
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   120
in our $\simp$ function,
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   121
so that it makes the simplification in \ref{partialDedup} possible.
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   122
For example,
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   123
can a function like the below one be used?
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   124
%TODO: simp' that spills things
533
Chengsong
parents: 532
diff changeset
   125
591
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   126
Unfortunately,
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   127
if we introduce them in our
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   128
setting we would lose the POSIX property of our calculated values. 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   129
For example given the regular expression 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   130
\begin{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   131
	$(a + ab)(bc + c)$
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   132
\end{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   133
and the string
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   134
\begin{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   135
	$ab$,
533
Chengsong
parents: 532
diff changeset
   136
\end{center}
591
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   137
then our algorithm generates the following
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   138
correct POSIX value
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   139
\begin{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   140
	$\Seq \; (\Right \; ab) \; (\Right \; c)$.
533
Chengsong
parents: 532
diff changeset
   141
\end{center}
591
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   142
Essentially it matches the string with the longer Right-alternative
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   143
in the first sequence (and
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   144
then the 'rest' with the character regular expression $c$ from the second sequence).
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   145
If we add the simplification above, then we obtain the following value
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   146
\begin{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   147
	$\Left \; (\Seq \; a \; (\Left \; bc))$
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   148
\end{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   149
where the $\Left$-alternatives get priority. 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   150
However this violates the POSIX rules.
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   151
The reason for getting this undesired value
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   152
is that the new rule splits this regular expression up into 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   153
\begin{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   154
	$a\cdot(b c + c) + ab \cdot (bc + c)$,
533
Chengsong
parents: 532
diff changeset
   155
\end{center}
591
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   156
which becomes a regular expression with a 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   157
totally different structure--the original
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   158
was a sequence, and now it becomes an alternative.
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   159
With an alternative the maximum munch rule no longer works.\\
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   160
A method to reconcile this is to do the 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   161
transformation in \ref{desiredSimp} ``non-invasively'',
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   162
meaning that we traverse the list of regular expressions
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   163
\begin{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   164
	$rs_a@[a]@rs_c$
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   165
\end{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   166
in the alternative
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   167
\begin{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   168
	$\sum ( rs_a@[a]@rs_c)$
533
Chengsong
parents: 532
diff changeset
   169
\end{center}
591
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   170
using  a function similar to $\distinctBy$,
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   171
but this time 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   172
we allow a more general list rewrite:
592
Chengsong
parents: 591
diff changeset
   173
\begin{mathpar}\label{cubicRule}
591
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   174
		\inferrule{\vspace{0mm} }{rs_a@[a]@rs_c 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   175
			\stackrel{s}{\rightsquigarrow }
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   176
		rs_a@[\textit{prune} \; a \; rs_a]@rs_c }
592
Chengsong
parents: 591
diff changeset
   177
\end{mathpar}
591
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   178
%L \; a_1' = L \; a_1 \setminus (\cup_{a \in rs_a} L \; a)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   179
where $\textit{prune} \;a \; acc$ traverses $a$
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   180
without altering the structure of $a$, removing components in $a$
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   181
that have appeared in the accumulator $acc$.
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   182
For example
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   183
\begin{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   184
	$\textit{prune} \;\;\; (r_a+r_f+r_g+r_h)r_d \;\; \; [(r_a+r_b+r_c)r_d, (r_e+r_f)r_d] $
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   185
\end{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   186
should be equal to 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   187
\begin{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   188
	$(r_g+r_h)r_d$
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   189
\end{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   190
because $r_gr_d$ and 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   191
$r_hr_d$ are the only terms
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   192
that have not appeared in the accumulator list 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   193
\begin{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   194
$[(r_a+r_b+r_c)r_d, (r_e+r_f)r_d]$.
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   195
\end{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   196
We implemented 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   197
function $\textit{prune}$ in Scala,
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   198
and incorporated into our lexer,
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   199
by replacing the $\simp$ function 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   200
with a stronger version called $\bsimpStrong$
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   201
that prunes regular expressions.
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   202
We call this lexer $\blexerStrong$.
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   203
$\blexerStrong$ is able to drastically reduce the
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   204
internal data structure size which could
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   205
trigger exponential behaviours in
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   206
$\blexersimp$.
590
988e92a70704 more chap5 and chap6 bsimp_idem
Chengsong
parents: 538
diff changeset
   207
\begin{figure}[H]
533
Chengsong
parents: 532
diff changeset
   208
\centering
Chengsong
parents: 532
diff changeset
   209
\begin{tabular}{@{}c@{\hspace{0mm}}c@{\hspace{0mm}}c@{}}
Chengsong
parents: 532
diff changeset
   210
\begin{tikzpicture}
Chengsong
parents: 532
diff changeset
   211
\begin{axis}[
535
ce91c29d2885 fixed plotting error 6.1
Chengsong
parents: 533
diff changeset
   212
    %xlabel={$n$},
ce91c29d2885 fixed plotting error 6.1
Chengsong
parents: 533
diff changeset
   213
    myplotstyle,
ce91c29d2885 fixed plotting error 6.1
Chengsong
parents: 533
diff changeset
   214
    xlabel={input length},
533
Chengsong
parents: 532
diff changeset
   215
    ylabel={size},
535
ce91c29d2885 fixed plotting error 6.1
Chengsong
parents: 533
diff changeset
   216
    ]
533
Chengsong
parents: 532
diff changeset
   217
\addplot[red,mark=*, mark options={fill=white}] table {strongSimpCurve.data};
Chengsong
parents: 532
diff changeset
   218
\end{axis}
Chengsong
parents: 532
diff changeset
   219
\end{tikzpicture}
Chengsong
parents: 532
diff changeset
   220
  &
Chengsong
parents: 532
diff changeset
   221
\begin{tikzpicture}
Chengsong
parents: 532
diff changeset
   222
\begin{axis}[
535
ce91c29d2885 fixed plotting error 6.1
Chengsong
parents: 533
diff changeset
   223
    %xlabel={$n$},
ce91c29d2885 fixed plotting error 6.1
Chengsong
parents: 533
diff changeset
   224
    myplotstyle,
ce91c29d2885 fixed plotting error 6.1
Chengsong
parents: 533
diff changeset
   225
    xlabel={input length},
533
Chengsong
parents: 532
diff changeset
   226
    ylabel={size},
535
ce91c29d2885 fixed plotting error 6.1
Chengsong
parents: 533
diff changeset
   227
    ]
533
Chengsong
parents: 532
diff changeset
   228
\addplot[blue,mark=*, mark options={fill=white}] table {bsimpExponential.data};
Chengsong
parents: 532
diff changeset
   229
\end{axis}
Chengsong
parents: 532
diff changeset
   230
\end{tikzpicture}\\
535
ce91c29d2885 fixed plotting error 6.1
Chengsong
parents: 533
diff changeset
   231
\multicolumn{2}{l}{Graphs: Runtime for matching $((a^* + (aa)^* + \ldots + (aaaaa)^* )^*)^*$ with strings 
591
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   232
	   of the form $\underbrace{aa..a}_{n}$.}
533
Chengsong
parents: 532
diff changeset
   233
\end{tabular}    
Chengsong
parents: 532
diff changeset
   234
\caption{aaaaaStarStar} \label{fig:aaaaaStarStar}
Chengsong
parents: 532
diff changeset
   235
\end{figure}
591
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   236
\begin{figure}[H]
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   237
\begin{lstlisting}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   238
    def atMostEmpty(r: Rexp) : Boolean = r match {
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   239
      case ZERO => true
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   240
      case ONE => true
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   241
      case STAR(r) => atMostEmpty(r)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   242
      case SEQ(r1, r2) => atMostEmpty(r1) && atMostEmpty(r2)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   243
      case ALTS(r1, r2) => atMostEmpty(r1) && atMostEmpty(r2)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   244
      case CHAR(_) => false
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   245
    }
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   246
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   247
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   248
    def isOne(r: Rexp) : Boolean = r match {
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   249
      case ONE => true
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   250
      case SEQ(r1, r2) => isOne(r1) && isOne(r2)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   251
      case ALTS(r1, r2) => (isOne(r1) || isOne(r2)) && (atMostEmpty(r1) && atMostEmpty(r2))//rs.forall(atMostEmpty) && rs.exists(isOne)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   252
      case STAR(r0) => atMostEmpty(r0)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   253
      case CHAR(c) => false
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   254
      case ZERO => false
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   255
    }
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   256
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   257
    //r = r' ~ tail' : If tail' matches tail => returns r'
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   258
    def removeSeqTail(r: Rexp, tail: Rexp) : Rexp = r match {
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   259
      case SEQ(r1, r2) => 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   260
        if(r2 == tail) 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   261
          r1
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   262
        else
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   263
          ZERO
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   264
      case r => ZERO
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   265
    }
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   266
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   267
    def prune(r: ARexp, acc: Set[Rexp]) : ARexp = r match{
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   268
      case AALTS(bs, rs) => rs.map(r => prune(r, acc)).filter(_ != ZERO) match
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   269
      {
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   270
        //all components have been removed, meaning this is effectively a duplicate
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   271
        //flats will take care of removing this AZERO 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   272
        case Nil => AZERO
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   273
        case r::Nil => fuse(bs, r)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   274
        case rs1 => AALTS(bs, rs1)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   275
      }
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   276
      case ASEQ(bs, r1, r2) => 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   277
        //remove the r2 in (ra + rb)r2 to identify the duplicate contents of r1
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   278
        prune(r1, acc.map(r => removeSeqTail(r, erase(r2)))) match {
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   279
          //after pruning, returns 0
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   280
          case AZERO => AZERO
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   281
          //after pruning, got r1'.r2, where r1' is equal to 1
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   282
          case r1p if(isOne(erase(r1p))) => fuse(bs ++ mkepsBC(r1p), r2)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   283
          //assemble the pruned head r1p with r2
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   284
          case r1p => ASEQ(bs, r1p, r2)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   285
        }
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   286
        //this does the duplicate component removal task
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   287
      case r => if(acc(erase(r))) AZERO else r
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   288
    }
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   289
\end{lstlisting}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   290
\caption{pruning function together with its helper functions}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   291
\end{figure}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   292
\noindent
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   293
The benefits of using 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   294
$\textit{prune}$ such as refining the finiteness bound
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   295
to a cubic bound has not been formalised yet.
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   296
Therefore we choose to use Scala code rather than an Isabelle-style formal 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   297
definition like we did for $\simp$, as the definitions might change
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   298
to suit proof needs.
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   299
In the rest of the chapter we will use this convention consistently.
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   300
\begin{figure}[H]
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   301
\begin{lstlisting}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   302
  def distinctWith(rs: List[ARexp], 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   303
    pruneFunction: (ARexp, Set[Rexp]) => ARexp, 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   304
    acc: Set[Rexp] = Set()) : List[ARexp] = 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   305
      rs match{
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   306
        case Nil => Nil
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   307
        case r :: rs => 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   308
          if(acc(erase(r)))
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   309
            distinctWith(rs, pruneFunction, acc)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   310
          else {
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   311
            val pruned_r = pruneFunction(r, acc)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   312
            pruned_r :: 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   313
            distinctWith(rs, 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   314
              pruneFunction, 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   315
              turnIntoTerms(erase(pruned_r)) ++: acc
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   316
              )
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   317
          }
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   318
      }
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   319
\end{lstlisting}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   320
\caption{A Stronger Version of $\textit{distinctBy}$}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   321
\end{figure}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   322
\noindent
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   323
The function $\textit{prune}$ is used in $\distinctWith$.
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   324
$\distinctWith$ is a stronger version of $\distinctBy$
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   325
which not only removes duplicates as $\distinctBy$ would
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   326
do, but also uses the $\textit{pruneFunction}$
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   327
argument to prune away verbose components in a regular expression.\\
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   328
\begin{figure}[H]
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   329
\begin{lstlisting}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   330
   //a stronger version of simp
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   331
    def bsimpStrong(r: ARexp): ARexp =
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   332
    {
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   333
      r match {
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   334
        case ASEQ(bs1, r1, r2) => (bsimpStrong(r1), bsimpStrong(r2)) match {
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   335
          //normal clauses same as simp
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   336
        case (AZERO, _) => AZERO
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   337
        case (_, AZERO) => AZERO
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   338
        case (AONE(bs2), r2s) => fuse(bs1 ++ bs2, r2s)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   339
        //bs2 can be discarded
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   340
        case (r1s, AONE(bs2)) => fuse(bs1, r1s) //assert bs2 == Nil
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   341
        case (r1s, r2s) => ASEQ(bs1, r1s, r2s)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   342
        }
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   343
        case AALTS(bs1, rs) => {
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   344
          //distinctBy(flat_res, erase)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   345
          distinctWith(flats(rs.map(bsimpStrong(_))), prune) match {
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   346
            case Nil => AZERO
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   347
            case s :: Nil => fuse(bs1, s)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   348
            case rs => AALTS(bs1, rs)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   349
          }
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   350
        }
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   351
        //stars that can be treated as 1
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   352
        case ASTAR(bs, r0) if(atMostEmpty(erase(r0))) => AONE(bs)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   353
        case r => r
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   354
      }
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   355
    }
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   356
\end{lstlisting}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   357
\caption{The function $\bsimpStrong$ and $\bdersStrongs$}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   358
\end{figure}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   359
\noindent
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   360
$\distinctWith$, is in turn used in $\bsimpStrong$:
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   361
\begin{figure}[H]
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   362
\begin{lstlisting}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   363
      //Conjecture: [| bdersStrong(s, r) |] = O([| r |]^3)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   364
      def bdersStrong(s: List[Char], r: ARexp) : ARexp = s match {
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   365
        case Nil => r
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   366
        case c::s => bdersStrong(s, bsimpStrong(bder(c, r)))
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   367
      }
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   368
\end{lstlisting}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   369
\caption{The function $\bsimpStrong$ and $\bdersStrongs$}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   370
\end{figure}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   371
\noindent
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   372
We conjecture that the above Scala function $\bdersStrongs$,
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   373
written $\bdersStrong{\_}{\_}$ as an infix notation, 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   374
satisfies the following property:
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   375
\begin{conjecture}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   376
	$\llbracket \bdersStrong{a}{s} \rrbracket = O(\llbracket a \rrbracket^3)$
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   377
\end{conjecture}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   378
The stronger version of $\blexersimp$'s
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   379
code in Scala looks like: 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   380
\begin{figure}[H]
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   381
\begin{lstlisting}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   382
      def strongBlexer(r: Rexp, s: String) : Option[Val] = {
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   383
        Try(Some(decode(r, strong_blex_simp(internalise(r), s.toList)))).getOrElse(None)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   384
      }
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   385
      def strong_blex_simp(r: ARexp, s: List[Char]) : Bits = s match {
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   386
        case Nil => {
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   387
          if (bnullable(r)) {
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   388
            mkepsBC(r)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   389
          }
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   390
          else
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   391
            throw new Exception("Not matched")
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   392
        }
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   393
        case c::cs => {
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   394
          strong_blex_simp(strongBsimp(bder(c, r)), cs)
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   395
        }
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   396
      }
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   397
\end{lstlisting}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   398
\end{figure}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   399
\noindent
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   400
We would like to preserve the correctness like the one 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   401
we had for $\blexersimp$:
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   402
\begin{conjecture}\label{cubicConjecture}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   403
	$\blexerStrong \;r \; s = \blexer\; r\;s$
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   404
\end{conjecture}
592
Chengsong
parents: 591
diff changeset
   405
\noindent
Chengsong
parents: 591
diff changeset
   406
We introduce the new rule \ref{cubicRule}
Chengsong
parents: 591
diff changeset
   407
and augment our proofs in chapter \ref{Bitcoded2}.
Chengsong
parents: 591
diff changeset
   408
The idea is to maintain the properties like
Chengsong
parents: 591
diff changeset
   409
$r \stackrel{*}{\rightsquigarrow} \textit{bsimp} \; r$ etc.
591
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   410
592
Chengsong
parents: 591
diff changeset
   411
In the next section,
Chengsong
parents: 591
diff changeset
   412
we will describe why we 
Chengsong
parents: 591
diff changeset
   413
believe a cubic bound can be achieved.
Chengsong
parents: 591
diff changeset
   414
We give an introduction to the
Chengsong
parents: 591
diff changeset
   415
partial derivatives,
Chengsong
parents: 591
diff changeset
   416
which was invented by Antimirov \cite{Antimirov95},
Chengsong
parents: 591
diff changeset
   417
and then link it with the result of the function
Chengsong
parents: 591
diff changeset
   418
$\bdersStrongs$.
Chengsong
parents: 591
diff changeset
   419
 
Chengsong
parents: 591
diff changeset
   420
\section{Antimirov's partial derivatives}
591
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   421
This suggests a link towrads "partial derivatives"
592
Chengsong
parents: 591
diff changeset
   422
 introduced  The idea behind Antimirov's partial derivatives
591
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   423
is to do derivatives in a similar way as suggested by Brzozowski, 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   424
but maintain a set of regular expressions instead of a single one:
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   425
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   426
%TODO: antimirov proposition 3.1, needs completion
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   427
 \begin{center}  
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   428
 \begin{tabular}{ccc}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   429
 $\partial_x(a+b)$ & $=$ & $\partial_x(a) \cup \partial_x(b)$\\
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   430
$\partial_x(\ONE)$ & $=$ & $\phi$
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   431
\end{tabular}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   432
\end{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   433
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   434
Rather than joining the calculated derivatives $\partial_x a$ and $\partial_x b$ together
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   435
using the alternatives constructor, Antimirov cleverly chose to put them into
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   436
a set instead.  This breaks the terms in a derivative regular expression up, 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   437
allowing us to understand what are the "atomic" components of it.
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   438
For example, To compute what regular expression $x^*(xx + y)^*$'s 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   439
derivative against $x$ is made of, one can do a partial derivative
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   440
of it and get two singleton sets $\{x^* \cdot (xx + y)^*\}$ and $\{x \cdot (xx + y) ^* \}$
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   441
from $\partial_x(x^*) \cdot (xx + y) ^*$ and $\partial_x((xx + y)^*)$.
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   442
To get all the "atomic" components of a regular expression's possible derivatives,
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   443
there is a procedure Antimirov called $\textit{lf}$, short for "linear forms", that takes
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   444
whatever character is available at the head of the string inside the language of a
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   445
regular expression, and gives back the character and the derivative regular expression
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   446
as a pair (which he called "monomial"):
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   447
 \begin{center}  
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   448
 \begin{tabular}{ccc}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   449
 $\lf(\ONE)$ & $=$ & $\phi$\\
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   450
$\lf(c)$ & $=$ & $\{(c, \ONE) \}$\\
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   451
 $\lf(a+b)$ & $=$ & $\lf(a) \cup \lf(b)$\\
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   452
 $\lf(r^*)$ & $=$ & $\lf(r) \bigodot \lf(r^*)$\\
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   453
\end{tabular}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   454
\end{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   455
%TODO: completion
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   456
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   457
There is a slight difference in the last three clauses compared
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   458
with $\partial$: instead of a dot operator $ \textit{rset} \cdot r$ that attaches the regular 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   459
expression $r$ with every element inside $\textit{rset}$ to create a set of 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   460
sequence derivatives, it uses the "circle dot" operator $\bigodot$ which operates 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   461
on a set of monomials (which Antimirov called "linear form") and a regular 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   462
expression, and returns a linear form:
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   463
 \begin{center}  
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   464
 \begin{tabular}{ccc}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   465
 $l \bigodot (\ZERO)$ & $=$ & $\phi$\\
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   466
 $l \bigodot (\ONE)$ & $=$ & $l$\\
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   467
 $\phi \bigodot t$ & $=$ & $\phi$\\
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   468
 $\{ (x, \ZERO) \} \bigodot t$ & $=$ & $\{(x,\ZERO) \}$\\
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   469
 $\{ (x, \ONE) \} \bigodot t$ & $=$ & $\{(x,t) \}$\\
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   470
  $\{ (x, p) \} \bigodot t$ & $=$ & $\{(x,p\cdot t) \}$\\
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   471
 $\lf(a+b)$ & $=$ & $\lf(a) \cup \lf(b)$\\
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   472
 $\lf(r^*)$ & $=$ & $\lf(r) \cdot \lf(r^*)$\\
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   473
\end{tabular}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   474
\end{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   475
%TODO: completion
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   476
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   477
 Some degree of simplification is applied when doing $\bigodot$, for example,
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   478
 $l \bigodot (\ZERO) = \phi$ corresponds to $r \cdot \ZERO \rightsquigarrow \ZERO$,
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   479
 and  $l \bigodot (\ONE) = l$ to $l \cdot \ONE \rightsquigarrow l$, and
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   480
  $\{ (x, \ZERO) \} \bigodot t = \{(x,\ZERO) \}$ to $\ZERO \cdot x \rightsquigarrow \ZERO$,
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   481
  and so on.
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   482
  
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   483
  With the function $\lf$ one can compute all possible partial derivatives $\partial_{UNIV}(r)$ of a regular expression $r$ with 
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   484
  an iterative procedure:
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   485
   \begin{center}  
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   486
 \begin{tabular}{llll}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   487
$\textit{while}$ & $(\Delta_i \neq \phi)$  &                &          \\
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   488
 		       &  $\Delta_{i+1}$           &        $ =$ & $\lf(\Delta_i) - \PD_i$ \\
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   489
		       &  $\PD_{i+1}$              &         $ =$ & $\Delta_{i+1} \cup \PD_i$ \\
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   490
$\partial_{UNIV}(r)$ & $=$ & $\PD$ &		     
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   491
\end{tabular}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   492
\end{center}
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   493
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   494
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   495
 $(r_1 + r_2) \cdot r_3 \longrightarrow (r_1 \cdot r_3) + (r_2 \cdot r_3)$,
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   496
b2d0de6aee18 more polishing integrated comments chap2
Chengsong
parents: 590
diff changeset
   497
532
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
   498
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
   499
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
   500
%----------------------------------------------------------------------------------------
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
   501
%	SECTION 2
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
   502
%----------------------------------------------------------------------------------------
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
   503
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
   504
\section{Bounded Repetitions}
612
Chengsong
parents: 596
diff changeset
   505
We have promised in chapter \ref{Introduction}
Chengsong
parents: 596
diff changeset
   506
that our lexing algorithm can potentially be extended
Chengsong
parents: 596
diff changeset
   507
to handle bounded repetitions
Chengsong
parents: 596
diff changeset
   508
in natural and elegant ways.
Chengsong
parents: 596
diff changeset
   509
Now we fulfill our promise by adding support for 
Chengsong
parents: 596
diff changeset
   510
the ``exactly-$n$-times'' bounded regular expression $r^{\{n\}}$.
Chengsong
parents: 596
diff changeset
   511
We add clauses in our derivatives-based lexing algorithms (with simplifications)
Chengsong
parents: 596
diff changeset
   512
introduced in chapter \ref{Bitcoded2}.
Chengsong
parents: 596
diff changeset
   513
Chengsong
parents: 596
diff changeset
   514
\subsection{Augmented Definitions}
Chengsong
parents: 596
diff changeset
   515
There are a number of definitions that needs to be augmented.
Chengsong
parents: 596
diff changeset
   516
The most notable one would be the POSIX rules for $r^{\{n\}}$:
Chengsong
parents: 596
diff changeset
   517
\begin{mathpar}
Chengsong
parents: 596
diff changeset
   518
	\inferrule{v \in vs}{\textit{Stars} place holder}
Chengsong
parents: 596
diff changeset
   519
\end{mathpar}
Chengsong
parents: 596
diff changeset
   520
Chengsong
parents: 596
diff changeset
   521
Chengsong
parents: 596
diff changeset
   522
\subsection{Proofs for the Augmented Lexing Algorithm}
Chengsong
parents: 596
diff changeset
   523
We need to maintain two proofs with the additional $r^{\{n\}}$
Chengsong
parents: 596
diff changeset
   524
construct: the 
Chengsong
parents: 596
diff changeset
   525
correctness proof in chapter \ref{Bitcoded2},
Chengsong
parents: 596
diff changeset
   526
and the finiteness proof in chapter \ref{Finite}.
Chengsong
parents: 596
diff changeset
   527
Chengsong
parents: 596
diff changeset
   528
\subsubsection{Correctness Proof Augmentation}
Chengsong
parents: 596
diff changeset
   529
The correctness of $\textit{lexer}$ and $\textit{blexer}$ with bounded repetitions
Chengsong
parents: 596
diff changeset
   530
have been proven by Ausaf and Urban\cite{AusafDyckhoffUrban2016}.
Chengsong
parents: 596
diff changeset
   531
As they have commented, once the definitions are in place,
Chengsong
parents: 596
diff changeset
   532
the proofs given for the basic regular expressions will extend to
Chengsong
parents: 596
diff changeset
   533
bounded regular expressions, and there are no ``surprises''.
Chengsong
parents: 596
diff changeset
   534
We confirm this point because the correctness theorem would also
Chengsong
parents: 596
diff changeset
   535
extend without surprise to $\blexersimp$.
Chengsong
parents: 596
diff changeset
   536
The rewrite rules such as $\rightsquigarrow$, $\stackrel{s}{\rightsquigarrow}$ and so on
Chengsong
parents: 596
diff changeset
   537
do not need to be changed,
Chengsong
parents: 596
diff changeset
   538
and only a few lemmas such as lemma \ref{fltsPreserves} need to be adjusted to 
Chengsong
parents: 596
diff changeset
   539
add one more line which can be solved by sledgehammer 
Chengsong
parents: 596
diff changeset
   540
to solve the $r^{\{n\}}$ inductive case.
Chengsong
parents: 596
diff changeset
   541
Chengsong
parents: 596
diff changeset
   542
Chengsong
parents: 596
diff changeset
   543
\subsubsection{Finiteness Proof Augmentation}
620
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   544
The bounded repetitions are
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   545
very similar to stars, and therefore the treatment
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   546
is similar, with minor changes to handle some slight complications
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   547
when the counter reaches 0.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   548
The exponential growth is similar:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   549
\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   550
	\begin{tabular}{ll}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   551
		$r^{\{n\}} $ & $\longrightarrow_{\backslash c}$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   552
		$(r\backslash c)  \cdot  
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   553
		r^{\{n - 1\}}*$ & $\longrightarrow_{\backslash c'}$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   554
		\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   555
		$r \backslash cc'  \cdot r^{\{n - 2\}}* + 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   556
		r \backslash c' \cdot r^{\{n - 1\}}*$ &
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   557
		$\longrightarrow_{\backslash c''}$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   558
		\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   559
		$(r_1 \backslash cc'c'' \cdot r^{\{n-3\}}* + 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   560
		r \backslash c''\cdot r^{\{n-1\}}) + 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   561
		(r \backslash c'c'' \cdot r^{\{n-2\}}* + 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   562
		r \backslash c'' \cdot r^{\{n-1\}}*)$ & 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   563
		$\longrightarrow_{\backslash c'''}$ \\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   564
		\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   565
		$\ldots$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   566
	\end{tabular}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   567
\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   568
Again, we assume that $r\backslash c$, $r \backslash cc'$ and so on
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   569
are all nullable.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   570
The flattened list of terms for $r^{\{n\}} \backslash_{rs} s$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   571
\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   572
	$[r_1 \backslash cc'c'' \cdot r^{\{n-3\}}*,\;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   573
	r \backslash c''\cdot r^{\{n-1\}}, \; 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   574
	r \backslash c'c'' \cdot r^{\{n-2\}}*, \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   575
	r \backslash c'' \cdot r^{\{n-1\}}*,\; \ldots ]$  
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   576
\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   577
that comes from 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   578
\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   579
		$(r_1 \backslash cc'c'' \cdot r^{\{n-3\}}* + 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   580
		r \backslash c''\cdot r^{\{n-1\}}) + 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   581
		(r \backslash c'c'' \cdot r^{\{n-2\}}* + 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   582
		r \backslash c'' \cdot r^{\{n-1\}}*)+ \ldots$ 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   583
\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   584
are made of sequences with different tails, where the counters
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   585
might differ.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   586
The observation for maintaining the bound is that
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   587
these counters never exceed $n$, the original
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   588
counter. With the number of counters staying finite,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   589
$\rDistinct$ will deduplicate and keep the list finite.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   590
We introduce this idea as a lemma once we describe all
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   591
the necessary helper functions.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   592
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   593
Similar to the star case, we want
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   594
\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   595
	$\rderssimp{r^{\{n\}}}{s} = \rsimp{\sum rs}$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   596
\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   597
where $rs$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   598
shall be in the form of 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   599
$\map \; f \; Ss$, where $f$ is a function and
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   600
$Ss$ a list of objects to act on.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   601
For star, the object's datatype is string.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   602
The list of strings $Ss$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   603
is generated using functions 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   604
$\starupdate$ and $\starupdates$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   605
The function that takes a string and returns a regular expression
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   606
is the anonymous function $
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   607
(\lambda s'. \; r\backslash s' \cdot r^{\{m\}})$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   608
In the NTIMES setting,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   609
the $\starupdate$ and $\starupdates$ functions are replaced by 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   610
$\textit{nupdate}$ and $\textit{nupdates}$:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   611
\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   612
	\begin{tabular}{lcl}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   613
		$\nupdate \; c \; r \; [] $ & $\dn$ & $[]$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   614
		$\nupdate \; c \; r \; 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   615
		(\Some \; (s, \; n + 1) \; :: \; Ss)$ & $\dn$ & %\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   616
						     $\textit{if} \; 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   617
						     (\rnullable \; (r \backslash_{rs} s))$ \\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   618
						     & & $\;\;\textit{then} 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   619
						     \;\; \Some \; (s @ [c], n + 1) :: \Some \; ([c], n) :: (
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   620
						     \nupdate \; c \; r \; Ss)$ \\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   621
						     & & $\textit{else} \;\; \Some \; (s @ [c], n+1) :: (
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   622
						     \nupdate \; c \; r \; Ss)$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   623
		$\nupdate \; c \; r \; (\textit{None} :: Ss)$ & $\dn$ & 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   624
		$(\None :: \nupdate  \; c \; r \; Ss)$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   625
							      & & \\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   626
	%\end{tabular}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   627
%\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   628
%\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   629
	%\begin{tabular}{lcl}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   630
		$\nupdates \; [] \; r \; Ss$ & $\dn$ & $Ss$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   631
		$\nupdates \; (c :: cs) \; r \; Ss$ &  $\dn$ &  $\nupdates \; cs \; r \; (
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   632
		\nupdate \; c \; r \; Ss)$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   633
	\end{tabular}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   634
\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   635
\noindent
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   636
which take into account when a subterm
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   637
\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   638
	$r \backslash_s s \cdot r^{\{n\}}$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   639
\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   640
counter $n$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   641
is 0, and therefore expands to 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   642
\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   643
$r \backslash_s (s@[c]) \cdot r^{\{n\}} \;+
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   644
\; \ZERO$ 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   645
\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   646
after taking a derivative.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   647
The object now has type 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   648
\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   649
$\textit{option} \;(\textit{string}, \textit{nat})$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   650
\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   651
and therefore the function for converting such an option into
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   652
a regular expression term is called $\opterm$:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   653
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   654
\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   655
	\begin{tabular}{lcl}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   656
	$\opterm \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   657
				 & & $\;\;\Some \; (s, n) \Rightarrow 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   658
				 (r\backslash_{rs} s)\cdot r^{\{n\}}$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   659
				 & & $\;\;\None  \Rightarrow 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   660
				 \ZERO$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   661
	\end{tabular}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   662
\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   663
\noindent
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   664
Put together, the list $\map \; f \; Ss$ is instantiated as
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   665
\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   666
	$\map \; (\opterm \; r) \; (\nupdates \; s \; r \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   667
	[\Some \; ([c], n)])$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   668
\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   669
For the closed form to be bounded, we would like
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   670
simplification to be applied to each term in the list.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   671
Therefore we introduce some variants of $\opterm$,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   672
which help conveniently express the rewriting steps 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   673
needed in the closed form proof.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   674
\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   675
	\begin{tabular}{lcl}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   676
	$\optermOsimp \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   677
				 & & $\;\;\Some \; (s, n) \Rightarrow 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   678
				 \textit{rsimp} \; ((r\backslash_{rs} s)\cdot r^{\{n\}})$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   679
				 & & $\;\;\None  \Rightarrow 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   680
				 \ZERO$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   681
				 \\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   682
	$\optermosimp \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   683
				 & & $\;\;\Some \; (s, n) \Rightarrow 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   684
				 (\textit{rsimp} \; (r\backslash_{rs} s)) 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   685
				 \cdot r^{\{n\}}$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   686
				 & & $\;\;\None  \Rightarrow 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   687
				 \ZERO$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   688
				 \\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   689
	$\optermsimp \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   690
				 & & $\;\;\Some \; (s, n) \Rightarrow 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   691
				 (r\backslash_{rsimps} s)\cdot r^{\{n\}}$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   692
				 & & $\;\;\None  \Rightarrow 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   693
				 \ZERO$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   694
	\end{tabular}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   695
\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   696
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   697
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   698
For a list of 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   699
$\textit{option} \;(\textit{string}, \textit{nat})$ elements,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   700
we define the highest power for it recursively:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   701
\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   702
	\begin{tabular}{lcl}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   703
		$\hpa \; [] \; n $ & $\dn$ & $n$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   704
		$\hpa \; (\None :: os) \; n $ &  $\dn$ &  $\hpa \; os \; n$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   705
		$\hpa \; (\Some \; (s, n) :: os) \; m$ & $\dn$ & 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   706
		$\hpa \;os \; (\textit{max} \; n\; m)$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   707
		\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   708
		$\hpower \; rs $ & $\dn$ & $\hpa \; rs \; 0$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   709
	\end{tabular}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   710
\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   711
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   712
Now the intuition that an NTIMES regular expression's power
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   713
does not increase can be easily expressed as
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   714
\begin{lemma}\label{nupdatesMono2}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   715
	$\hpower \; (\nupdates \;s \; r \; [\Some \; ([c], n)]) \leq n$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   716
\end{lemma}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   717
\begin{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   718
	Note that the power is non-increasing after a $\nupdate$ application:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   719
	\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   720
		$\hpa \;\; (\nupdate \; c \; r \; Ss)\;\; m \leq 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   721
		 \hpa\; \; Ss \; m$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   722
	 \end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   723
	 This is also the case for $\nupdates$:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   724
	\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   725
		$\hpa \;\; (\nupdates \; s \; r \; Ss)\;\; m \leq 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   726
		 \hpa\; \; Ss \; m$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   727
	 \end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   728
	 Therefore we have that
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   729
	 \begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   730
		 $\hpower \;\; (\nupdates \; s \; r \; Ss) \leq
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   731
		  \hpower \;\; Ss$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   732
	 \end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   733
	 which leads to the lemma being proven.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   734
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   735
 \end{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   736
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   737
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   738
We also define the inductive rules for
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   739
the shape of derivatives of the NTIMES regular expressions:\\[-3em]
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   740
\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   741
	\begin{mathpar}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   742
		\inferrule{\mbox{}}{\cbn \;\ZERO}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   743
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   744
		\inferrule{\mbox{}}{\cbn \; \; r_a \cdot (r^{\{n\}})}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   745
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   746
		\inferrule{\cbn \; r_1 \;\; \; \cbn \; r_2}{\cbn \; r_1 + r_2}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   747
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   748
		\inferrule{\cbn \; r}{\cbn \; r + \ZERO}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   749
	\end{mathpar}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   750
\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   751
\noindent
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   752
A derivative of NTIMES fits into the shape described by $\cbn$:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   753
\begin{lemma}\label{ntimesDersCbn}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   754
	$\cbn \; ((r' \cdot r^{\{n\}}) \backslash_{rs} s)$ holds.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   755
\end{lemma}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   756
\begin{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   757
	By a reverse induction on $s$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   758
	For the inductive case, note that if $\cbn \; r$ holds,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   759
	then $\cbn \; (r\backslash_r c)$ holds.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   760
\end{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   761
In addition, for $\cbn$-shaped regular expressioins, one can flatten
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   762
them:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   763
\begin{lemma}\label{ntimesHfauPushin}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   764
	If $\cbn \; r$ holds, then $\hflataux{r \backslash_r c} = 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   765
	\textit{concat} \; (\map \; \hflataux{\map \; (\_\backslash_r c) \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   766
	(\hflataux{r})})$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   767
\end{lemma}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   768
\begin{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   769
	By an induction on the inductive cases of $\cbn$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   770
\end{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   771
		
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   772
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   773
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   774
This time we do not need to define the flattening functions for NTIMES only,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   775
because $\hflat{\_}$ and $\hflataux{\_}$ work on NTIMES already:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   776
\begin{lemma}\label{ntimesHfauInduct}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   777
$\hflataux{( (r\backslash_r c) \cdot r^{\{n\}}) \backslash_{rsimps} s} = 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   778
 \map \; (\opterm \; r) \; (\nupdates \; s \; r \; [\Some \; ([c], n)])$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   779
\end{lemma}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   780
\begin{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   781
	By a reverse induction on $s$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   782
	The lemma \ref{ntimesDersCbn} is used.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   783
\end{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   784
\noindent
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   785
We have a recursive property for NTIMES with $\nupdate$ 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   786
similar to that for STAR,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   787
and one for $\nupdates $ as well:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   788
\begin{lemma}\label{nupdateInduct1}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   789
	(i) $\textit{concat} \; (\map \; (\hflataux{\_} \circ (
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   790
	\opterm \; r)) \; Ss) = \map \; (\opterm \; r) \; (\nupdate \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   791
	c \; r \; Ss)$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   792
	(ii) $\textit{concat} \; (\map \; \hflataux{\_}\; 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   793
	\map \; (\_\backslash_r x) \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   794
		(\map \; (\opterm \; r) \; (\nupdates \; xs \; r \; Ss))) =
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   795
	\map \; (\opterm \; r) \; (\nupdates \;(xs@[x]) \; r\;Ss)$ holds.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   796
\end{lemma}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   797
\begin{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   798
	(i) is by an induction on $Ss$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   799
	(ii) is by an induction on $xs$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   800
\end{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   801
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   802
The $\nString$ predicate is defined for conveniently
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   803
expressing that there are no empty strings in the
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   804
$\Some \;(s, n)$ elements generated by $\nupdate$:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   805
\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   806
	\begin{tabular}{lcl}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   807
		$\nString \; \None$  & $\dn$ & $ \textit{true}$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   808
		$\nString \; (\Some \; ([], n))$ & $\dn$ & $ \textit{false}$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   809
		$\nString \; (\Some \; (c::s, n))$  & $\dn$ & $ \textit{true}$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   810
	\end{tabular}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   811
\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   812
\begin{lemma}\label{nupdatesNonempty}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   813
	If for all elements $opt$ in $\textit{set} \; Ss$,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   814
	$\nString \; opt$ holds, the we have that
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   815
	for all elements $opt$ in $\textit{set} \; (\nupdates \; s \; r \; Ss)$,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   816
	$\nString \; opt$ holds.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   817
\end{lemma}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   818
\begin{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   819
	By an induction on $s$, where $Ss$ is set to vary over all possible values.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   820
\end{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   821
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   822
\noindent
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   823
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   824
\begin{lemma}\label{ntimesClosedFormsSteps}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   825
	The following list of equalities or rewriting relations hold:\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   826
	(i) $r^{\{n+1\}} \backslash_{rsimps} (c::s) = 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   827
	\textit{rsimp} \; (\sum (\map \; (\opterm \;r \;\_) \; (\nupdates \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   828
	s \; r \; [\Some \; ([c], n)])))$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   829
	(ii)
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   830
	\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   831
	$\sum (\map \; (\opterm \; r) \; (\nupdates \; s \; r \; [
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   832
	\Some \; ([c], n)]))$ \\ $ \sequal$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   833
	 $\sum (\map \; (\textit{rsimp} \circ (\opterm \; r))\; (\nupdates \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   834
	 s\;r \; [\Some \; ([c], n)]))$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   835
 	\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   836
	(iii)
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   837
	\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   838
	$\sum \;(\map \; (\optermosimp \; r) \; (\nupdates \; s \; r\; [\Some \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   839
	([c], n)]))$\\ 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   840
	$\sequal$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   841
	 $\sum \;(\map \; (\optermsimp r) \; (\nupdates \; s \; r \; [\Some \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   842
	([c], n)])) $\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   843
	\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   844
	(iv)
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   845
	\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   846
	$\sum \;(\map \; (\optermosimp \; r) \; (\nupdates \; s \; r\; [\Some \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   847
	([c], n)])) $ \\ $\sequal$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   848
	 $\sum \;(\map \; (\optermOsimp r) \; (\nupdates \; s \; r \; [\Some \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   849
	([c], n)])) $\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   850
	\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   851
	(v)
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   852
	\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   853
	 $\sum \;(\map \; (\optermOsimp r) \; (\nupdates \; s \; r \; [\Some \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   854
	 ([c], n)])) $ \\ $\sequal$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   855
	  $\sum \; (\map \; (\textit{rsimp} \circ (\opterm \; r)) \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   856
	  (\nupdates \; s \; r \; [\Some \; ([c], n)]))$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   857
  	\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   858
\end{lemma}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   859
\begin{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   860
	Routine.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   861
	(iii) and (iv) make use of the fact that all the strings $s$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   862
	inside $\Some \; (s, m)$ which are elements of the list
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   863
	$\nupdates \; s\;r\;[\Some\; ([c], n)]$ are non-empty,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   864
	which is from lemma \ref{nupdatesNonempty}.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   865
	Once the string in $o = \Some \; (s, n)$ is 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   866
	nonempty, $\optermsimp \; r \;o$,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   867
	$\optermosimp \; r \; o$ and $\optermosimp \; \; o$ are guaranteed
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   868
	to be equal.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   869
\end{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   870
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   871
\begin{theorem}\label{ntimesClosedForm}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   872
	The derivative of $r^{\{n+1\}}$ can be described as an alternative
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   873
	containing a list
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   874
	of terms:\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   875
	$r^{\{n+1\}} \backslash_{rsimps} (c::s) = \textit{rsimp} \; (
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   876
	\sum (\map \; (\optermsimp \; r) \; (\nupdates \; s \; r \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   877
	[\Some \; ([c], n)])))$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   878
\end{theorem}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   879
\begin{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   880
	By the rewriting steps described in lemma \ref{ntimesClosedFormsSteps}.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   881
\end{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   882
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   883
\begin{lemma}\label{nupdatesNLeqN}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   884
	For an element $o$ in $\textit{set} \; (\nupdates \; s \; r \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   885
	[\Some \; ([c], n)])$, either $o = \None$, or $o = \Some
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   886
	\; (s', m)$ for some string $s'$ and number $m \leq n$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   887
\end{lemma}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   888
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   889
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   890
\begin{lemma}\label{ntimesClosedFormListElemShape}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   891
	For any element $r'$ in $\textit{set} \; (\map \; (\optermsimp \; r) \; (
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   892
	\nupdates \; s \; r \; [\Some \; ([c], n)]))$,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   893
	we have that $r'$ is either $\ZERO$ or $r \backslash_{rsimps} s' \cdot
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   894
	r^{\{m\}}$ for some string $s'$ and number $m \leq n$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   895
\end{lemma}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   896
\begin{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   897
	Using lemma \ref{nupdatesNLeqN}.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   898
\end{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   899
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   900
\begin{theorem}\label{ntimesClosedFormBounded}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   901
	Assuming that for any string $s$, $\llbracket r \backslash_{rsimps} s
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   902
	\rrbracket_r \leq N$ holds, then we have that\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   903
	$\llbracket r^{\{n+1\}} \backslash_{rsimps} s \rrbracket_r \leq
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   904
	\textit{max} \; (c_N+1)* (N + \llbracket r^{\{n\}} \rrbracket+1)$,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   905
	where $c_N = \textit{card} \; (\textit{sizeNregex} \; (
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   906
	N + \llbracket r^{\{n\}} \rrbracket_r+1))$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   907
\end{theorem}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   908
\begin{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   909
We have that for all regular expressions $r'$ in $\textit{set} \; (\map \; (\optermsimp \; r) \; (
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   910
	\nupdates \; s \; r \; [\Some \; ([c], n)]))$,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   911
	$r'$'s size is less than or equal to $N + \llbracket r^{\{n\}} 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   912
	\rrbracket_r + 1$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   913
because $r'$ can only be either a $\ZERO$ or $r \backslash_{rsimps} s' \cdot
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   914
r^{\{m\}}$ for some string $s'$ and number 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   915
$m \leq n$ (lemma \ref{ntimesClosedFormListElemShape}).
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   916
In addition, we know that the list 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   917
$\map \; (\optermsimp \; r) \; (
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   918
\nupdates \; s \; r \; [\Some \; ([c], n)])$'s size is at most
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   919
$c_N = \textit{card} \; 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   920
(\sizeNregex \; ((N + \llbracket r^{\{n\}} \rrbracket) + 1))$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   921
This gives us $\llbracket \llbracket r \backslash_{rsimps} \;s \llbracket_r
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   922
\leq N * c_N$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   923
\end{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   924
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   925
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   926
Assuming we have that
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   927
\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   928
	$\rderssimp{r^*}{s} = \rsimp{(\sum \map \; (\lambda s'. r\backslash s' \cdot r^*) \; Ss)}$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   929
\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   930
holds.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   931
The idea of $\starupdate$ and $\starupdates$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   932
is to update $Ss$ when another
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   933
derivative is taken on $\rderssimp{r^*}{s}$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   934
w.r.t a character $c$ and a string $s'$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   935
respectively.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   936
Both $\starupdate$ and $\starupdates$ take three arguments as input:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   937
the new character $c$ or string $s$ to take derivative with, 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   938
the regular expression
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   939
$r$ under the star $r^*$, and the
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   940
list of strings $Ss$ for the derivative $r^* \backslash s$ 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   941
up until this point  
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   942
such that 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   943
\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   944
$(r^*) \backslash s = \sum_{s' \in sSet} (r\backslash s') \cdot r^*$ 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   945
\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   946
is satisfied.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   947
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   948
Functions $\starupdate$ and $\starupdates$ characterise what the 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   949
star derivatives will look like once ``straightened out'' into lists.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   950
The helper functions for such operations will be similar to
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   951
$\sflat{\_}$, $\sflataux{\_}$ and $\sflataux{\_}$, which we defined for sequence.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   952
We use similar symbols to denote them, with a $*$ subscript to mark the difference.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   953
\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   954
	\begin{tabular}{lcl}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   955
		$\hflataux{r_1 + r_2}$ & $\dn$ & $\hflataux{r_1} @ \hflataux{r_2}$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   956
		$\hflataux{r}$ & $\dn$ & $[r]$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   957
	\end{tabular}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   958
\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   959
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   960
\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   961
	\begin{tabular}{lcl}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   962
		$\hflat{r_1 + r_2}$ & $\dn$ & $\sum (\hflataux {r_1} @ \hflataux {r_2}) $\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   963
		$\hflat{r}$ & $\dn$ & $r$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   964
	\end{tabular}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   965
\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   966
\noindent
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   967
These definitions are tailor-made for dealing with alternatives that have
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   968
originated from a star's derivatives.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   969
A typical star derivative always has the structure of a balanced binary tree:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   970
\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   971
	$(r_1 \backslash cc'c'' \cdot r^* + r \backslash c'') + 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   972
	(r \backslash c'c'' \cdot r^* + r \backslash c'' \cdot r^*) $ 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   973
\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   974
All of the nested structures of alternatives
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   975
generated from derivatives are binary, and therefore
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   976
$\hflat{\_}$ and $\hflataux{\_}$ only deal with binary alternatives.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   977
$\hflat{\_}$ ``untangles'' like the following:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   978
\[
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   979
	\sum ((r_1 + r_2) + (r_3 + r_4))  + \ldots \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   980
	\stackrel{\hflat{\_}}{\longrightarrow} \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   981
	\RALTS{[r_1, r_2, \ldots, r_n]} 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   982
\]
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   983
Here is a lemma stating the recursive property of $\starupdate$ and $\starupdates$,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   984
with the helpers $\hflat{\_}$ and $\hflataux{\_}$\footnote{The function $\textit{concat}$ takes a list of lists 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   985
			and merges each of the element lists to form a flattened list.}:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   986
\begin{lemma}\label{stupdateInduct1}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   987
	\mbox
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   988
	For a list of strings $Ss$, the following hold.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   989
	\begin{itemize}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   990
		\item
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   991
			If we do a derivative on the terms 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   992
			$r\backslash_r s \cdot r^*$ (where $s$ is taken from the list $Ss$),
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   993
			the result will be the same as if we apply $\starupdate$ to $Ss$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   994
			\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   995
				\begin{tabular}{c}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   996
			$\textit{concat} \; (\map \; (\hflataux{\_} \circ ( (\_\backslash_r x)
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   997
			\circ (\lambda s.\;\; (r \backslash_r s) \cdot r^*)))\; Ss )\;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   998
			$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
   999
			\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1000
			$=$ \\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1001
			\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1002
			$\map \; (\lambda s. (r \backslash_r s) \cdot (r^*)) \; 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1003
			(\starupdate \; x \; r \; Ss)$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1004
				\end{tabular}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1005
			\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1006
		\item
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1007
			$\starupdates$ is ``composable'' w.r.t a derivative.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1008
			It piggybacks the character $x$ to the tail of the string
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1009
			$xs$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1010
			\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1011
				\begin{tabular}{c}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1012
					$\textit{concat} \; (\map \; \hflataux{\_} \; 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1013
					(\map \; (\_\backslash_r x) \; 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1014
					(\map \; (\lambda s.\;\; (r \backslash_r s) \cdot 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1015
					(r^*) ) \; (\starupdates \; xs \; r \; Ss))))$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1016
					\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1017
					$=$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1018
					\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1019
					$\map \; (\lambda s.\;\; (r\backslash_r s) \cdot (r^*)) \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1020
					(\starupdates \; (xs @ [x]) \; r \; Ss)$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1021
				\end{tabular}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1022
			\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1023
	\end{itemize}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1024
\end{lemma}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1025
			
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1026
\begin{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1027
	Part 1 is by induction on $Ss$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1028
	Part 2 is by induction on $xs$, where $Ss$ is left to take arbitrary values.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1029
\end{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1030
			
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1031
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1032
Like $\textit{createdBySequence}$, we need
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1033
a predicate for ``star-created'' regular expressions:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1034
\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1035
	\begin{mathpar}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1036
		\inferrule{\mbox{}}{ \textit{createdByStar}\; \RSEQ{ra}{\RSTAR{rb}} }
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1037
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1038
		\inferrule{  \textit{createdByStar} \; r_1\; \land  \; \textit{createdByStar} \; r_2 }{\textit{createdByStar} \; (r_1 + r_2) } 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1039
	\end{mathpar}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1040
\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1041
\noindent
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1042
All regular expressions created by taking derivatives of
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1043
$r_1 \cdot (r_2)^*$ satisfy the $\textit{createdByStar}$ predicate:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1044
\begin{lemma}\label{starDersCbs}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1045
	$\textit{createdByStar} \; ((r_1 \cdot r_2^*) \backslash_r s) $ holds.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1046
\end{lemma}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1047
\begin{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1048
	By a reverse induction on $s$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1049
\end{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1050
If a regular expression conforms to the shape of a star's derivative,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1051
then we can push an application of $\hflataux{\_}$ inside a derivative of it:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1052
\begin{lemma}\label{hfauPushin}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1053
	If $\textit{createdByStar} \; r$ holds, then
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1054
	\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1055
		$\hflataux{r \backslash_r c} = \textit{concat} \; (
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1056
		\map \; \hflataux{\_} (\map \; (\_\backslash_r c) \;(\hflataux{r})))$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1057
	\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1058
	holds.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1059
\end{lemma}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1060
\begin{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1061
	By an induction on the inductivev cases of $\textit{createdByStar}$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1062
\end{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1063
%This is not entirely true for annotated regular expressions: 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1064
%%TODO: bsimp bders \neq bderssimp
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1065
%\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1066
%	$(1+ (c\cdot \ASEQ{bs}{c^*}{c} ))$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1067
%\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1068
%For bit-codes, the order in which simplification is applied
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1069
%might cause a difference in the location they are placed.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1070
%If we want something like
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1071
%\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1072
%	$\bderssimp{r}{s} \myequiv \bsimp{\bders{r}{s}}$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1073
%\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1074
%Some "canonicalization" procedure is required,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1075
%which either pushes all the common bitcodes to nodes
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1076
%as senior as possible:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1077
%\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1078
%	$_{bs}(_{bs_1 @ bs'}r_1 + _{bs_1 @ bs''}r_2) \rightarrow _{bs @ bs_1}(_{bs'}r_1 + _{bs''}r_2) $
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1079
%\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1080
%or does the reverse. However bitcodes are not of interest if we are talking about
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1081
%the $\llbracket r \rrbracket$ size of a regex.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1082
%Therefore for the ease and simplicity of producing a
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1083
%proof for a size bound, we are happy to restrict ourselves to 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1084
%unannotated regular expressions, and obtain such equalities as
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1085
%TODO: rsimp sflat
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1086
% The simplification of a flattened out regular expression, provided it comes
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1087
%from the derivative of a star, is the same as the one nested.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1088
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1089
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1090
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1091
Now we introduce an inductive property
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1092
for $\starupdate$ and $\hflataux{\_}$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1093
\begin{lemma}\label{starHfauInduct}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1094
	If we do derivatives of $r^*$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1095
	with a string that starts with $c$,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1096
	then flatten it out,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1097
	we obtain a list
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1098
	of the shape $\sum_{s' \in sS} (r\backslash_r s') \cdot r^*$,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1099
	where $sS = \starupdates \; s \; r \; [[c]]$. Namely,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1100
	\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1101
		$\hflataux{(\rders{( (\rder{c}{r_0})\cdot(r_0^*))}{s})} = 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1102
		\map \; (\lambda s_1. (r_0 \backslash_r s_1) \cdot (r_0^*)) \; 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1103
		(\starupdates \; s \; r_0 \; [[c]])$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1104
	\end{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1105
holds.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1106
\end{lemma}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1107
\begin{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1108
	By an induction on $s$, the inductive cases
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1109
	being $[]$ and $s@[c]$. The lemmas \ref{hfauPushin} and \ref{starDersCbs} are used.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1110
\end{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1111
\noindent
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1112
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1113
$\hflataux{\_}$ has a similar effect as $\textit{flatten}$:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1114
\begin{lemma}\label{hflatauxGrewrites}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1115
	$a :: rs \grewrites \hflataux{a} @ rs$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1116
\end{lemma}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1117
\begin{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1118
	By induction on $a$. $rs$ is set to take arbitrary values.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1119
\end{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1120
It is also not surprising that $\textit{rsimp}$ will cover
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1121
the effect of $\hflataux{\_}$:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1122
\begin{lemma}\label{cbsHfauRsimpeq1}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1123
	$\rsimp{(r_1 + r_2)} = \rsimp{(\RALTS{\hflataux{r_1} @ \hflataux{r_2}})}$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1124
\end{lemma}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1125
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1126
\begin{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1127
	By using the rewriting relation $\rightsquigarrow$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1128
\end{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1129
And from this we obtain a proof that a star's derivative will be the same
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1130
as if it had all its nested alternatives created during deriving being flattened out:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1131
For example,
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1132
\begin{lemma}\label{hfauRsimpeq2}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1133
	$\createdByStar{r} \implies \rsimp{r} = \rsimp{\RALTS{\hflataux{r}}}$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1134
\end{lemma}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1135
\begin{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1136
	By structural induction on $r$, where the induction rules 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1137
	are these of $\createdByStar{\_}$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1138
	Lemma \ref{cbsHfauRsimpeq1} is used in the inductive case.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1139
\end{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1140
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1141
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1142
%Here is a corollary that states the lemma in
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1143
%a more intuitive way:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1144
%\begin{corollary}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1145
%	$\hflataux{r^* \backslash_r (c::xs)} = \map \; (\lambda s. (r \backslash_r s) \cdot
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1146
%	(r^*))\; (\starupdates \; c\; r\; [[c]])$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1147
%\end{corollary}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1148
%\noindent
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1149
%Note that this is also agnostic of the simplification
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1150
%function we defined, and is therefore of more general interest.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1151
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1152
Together with the rewriting relation
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1153
\begin{lemma}\label{starClosedForm6Hrewrites}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1154
	We have the following set of rewriting relations or equalities:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1155
	\begin{itemize}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1156
		\item
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1157
			$\textit{rsimp} \; (r^* \backslash_r (c::s)) 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1158
			\sequal
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1159
			\sum \; ( ( \sum (\lambda s. (r\backslash_r s) \cdot r^*) \; (
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1160
			\starupdates \; s \; r \; [ c::[]] ) ) )$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1161
		\item
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1162
			$r \backslash_{rsimp} (c::s) = \textit{rsimp} \; ( (
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1163
			\sum ( (\map \; (\lambda s_1. (r\backslash s_1) \; r^*) \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1164
			(\starupdates \;s \; r \; [ c::[] ])))))$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1165
		\item
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1166
			$\sum ( (\map \; (\lambda s. (r\backslash s) \; r^*) \; Ss))
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1167
			\sequal
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1168
			 \sum ( (\map \; (\lambda s. \textit{rsimp} \; (r\backslash s) \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1169
			 r^*) \; Ss) )$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1170
		\item
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1171
			$\map \; (\lambda s. (\rsimp{r \backslash_r s}) \cdot (r^*)) \; Ss
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1172
			\scfrewrites
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1173
			\map \; (\lambda s. (\rsimp{r \backslash_r s}) \cdot (r^*)) \; Ss$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1174
		\item
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1175
			$( ( \sum ( ( \map \ (\lambda s. \;\;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1176
			(\textit{rsimp} \; (r \backslash_r s)) \cdot r^*) \; (\starupdates \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1177
			s \; r \; [ c::[] ])))))$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1178
			$\sequal$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1179
			$( ( \sum ( ( \map \ (\lambda s. \;\;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1180
			( r \backslash_{rsimp} s)) \cdot r^*) \; (\starupdates \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1181
			s \; r \; [ c::[] ]))))$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1182
	\end{itemize}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1183
\end{lemma}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1184
\begin{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1185
	Part 1 leads to part 2.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1186
	The rest of them are routine.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1187
\end{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1188
\noindent
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1189
Next the closed form for star regular expressions can be derived:
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1190
\begin{theorem}\label{starClosedForm}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1191
	$\rderssimp{r^*}{c::s} = 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1192
	\rsimp{
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1193
		(\sum (\map \; (\lambda s. (\rderssimp{r}{s})\cdot r^*) \; 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1194
		(\starupdates \; s\; r \; [[c]])
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1195
		)
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1196
		)
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1197
	}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1198
	$
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1199
\end{theorem}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1200
\begin{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1201
	By an induction on $s$.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1202
	The lemmas \ref{rsimpIdem}, \ref{starHfauInduct}, \ref{starClosedForm6Hrewrites} 
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1203
	and \ref{hfauRsimpeq2}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1204
	are used.	
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1205
	In \ref{starClosedForm6Hrewrites}, the equalities are
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1206
	used to link the LHS and RHS.
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1207
\end{proof}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1208
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1209
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1210
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1211
612
Chengsong
parents: 596
diff changeset
  1212
The closed form for them looks like:
Chengsong
parents: 596
diff changeset
  1213
%\begin{center}
Chengsong
parents: 596
diff changeset
  1214
%	\begin{tabular}{llrclll}
Chengsong
parents: 596
diff changeset
  1215
%		$r^{\{n+1\}}$ & $ \backslash_{rsimps}$ & $(c::s)$ & $=$ & & \\
Chengsong
parents: 596
diff changeset
  1216
%		$\textit{rsimp}$ & $($ & $
Chengsong
parents: 596
diff changeset
  1217
%		\sum \; ( $ & $\map$ & $(\textit{optermsimp}\;r)$ & $($\\
Chengsong
parents: 596
diff changeset
  1218
%			 & & & & $\textit{nupdates} \;$ & 
Chengsong
parents: 596
diff changeset
  1219
%			 $ s \; r_0 \; [ \textit{Some} \; ([c], n)]$\\
Chengsong
parents: 596
diff changeset
  1220
%			 & & & & $)$ &\\
Chengsong
parents: 596
diff changeset
  1221
%			 & &  $)$ & & &\\
Chengsong
parents: 596
diff changeset
  1222
%			 & $)$ & & & &\\
Chengsong
parents: 596
diff changeset
  1223
%	\end{tabular}
Chengsong
parents: 596
diff changeset
  1224
%\end{center}
Chengsong
parents: 596
diff changeset
  1225
\begin{center}
Chengsong
parents: 596
diff changeset
  1226
	\begin{tabular}{llrcllrllll}
Chengsong
parents: 596
diff changeset
  1227
		$r^{\{n+1\}}$ & $ \backslash_{rsimps}$ & $(c::s)$ & $=$ & & &&&&\\
Chengsong
parents: 596
diff changeset
  1228
			      &&&&$\textit{rsimp}$ & $($ & $
Chengsong
parents: 596
diff changeset
  1229
			      \sum \; ( $ & $\map$ & $(\textit{optermsimp}\;r)$ & $($\\
Chengsong
parents: 596
diff changeset
  1230
					  &&&& & & & & $\;\; \textit{nupdates} \;$ & 
Chengsong
parents: 596
diff changeset
  1231
			 		  $ s \; r_0 \; [ \textit{Some} \; ([c], n)]$\\
Chengsong
parents: 596
diff changeset
  1232
					  &&&& & & & & $)$ &\\
Chengsong
parents: 596
diff changeset
  1233
					  &&&& & &  $)$ & & &\\
Chengsong
parents: 596
diff changeset
  1234
					  &&&& & $)$ & & & &\\
Chengsong
parents: 596
diff changeset
  1235
	\end{tabular}
Chengsong
parents: 596
diff changeset
  1236
\end{center}
613
Chengsong
parents: 612
diff changeset
  1237
The $\textit{optermsimp}$ function with the argument $r$ 
Chengsong
parents: 612
diff changeset
  1238
chooses from two options: $\ZERO$ or 
612
Chengsong
parents: 596
diff changeset
  1239
We define for the $r^{\{n\}}$ constructor something similar to $\starupdate$
596
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1240
and $\starupdates$:
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1241
\begin{center}
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1242
	\begin{tabular}{lcl}
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1243
		$\starupdate \; c \; r \; [] $ & $\dn$ & $[]$\\
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1244
		$\starupdate \; c \; r \; (s :: Ss)$ & $\dn$ & \\
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1245
						     & & $\textit{if} \; 
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1246
						     (\rnullable \; (\rders \; r \; s))$ \\
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1247
						     & & $\textit{then} \;\; (s @ [c]) :: [c] :: (
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1248
						     \starupdate \; c \; r \; Ss)$ \\
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1249
						     & & $\textit{else} \;\; (s @ [c]) :: (
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1250
						     \starupdate \; c \; r \; Ss)$
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1251
	\end{tabular}
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1252
\end{center}
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1253
\noindent
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1254
As a generalisation from characters to strings,
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1255
$\starupdates$ takes a string instead of a character
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1256
as the first input argument, and is otherwise the same
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1257
as $\starupdate$.
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1258
\begin{center}
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1259
	\begin{tabular}{lcl}
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1260
		$\starupdates \; [] \; r \; Ss$ & $=$ & $Ss$\\
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1261
		$\starupdates \; (c :: cs) \; r \; Ss$ &  $=$ &  $\starupdates \; cs \; r \; (
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1262
		\starupdate \; c \; r \; Ss)$
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1263
	\end{tabular}
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1264
\end{center}
b306628a0eab more chap 56
Chengsong
parents: 594
diff changeset
  1265
\noindent
532
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
  1266
620
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1267
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1268
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1269
612
Chengsong
parents: 596
diff changeset
  1270
%----------------------------------------------------------------------------------------
Chengsong
parents: 596
diff changeset
  1271
%	SECTION 1
Chengsong
parents: 596
diff changeset
  1272
%----------------------------------------------------------------------------------------
532
cc54ce075db5 restructured
Chengsong
parents:
diff changeset
  1273
612
Chengsong
parents: 596
diff changeset
  1274
%\section{Adding Support for the Negation Construct, and its Correctness Proof}
Chengsong
parents: 596
diff changeset
  1275
%We now add support for the negation regular expression:
Chengsong
parents: 596
diff changeset
  1276
%\[			r ::=   \ZERO \mid  \ONE
Chengsong
parents: 596
diff changeset
  1277
%			 \mid  c  
Chengsong
parents: 596
diff changeset
  1278
%			 \mid  r_1 \cdot r_2
Chengsong
parents: 596
diff changeset
  1279
%			 \mid  r_1 + r_2   
Chengsong
parents: 596
diff changeset
  1280
%			 \mid r^*    
Chengsong
parents: 596
diff changeset
  1281
%			 \mid \sim r
Chengsong
parents: 596
diff changeset
  1282
%\]
Chengsong
parents: 596
diff changeset
  1283
%The $\textit{nullable}$ function's clause for it would be 
Chengsong
parents: 596
diff changeset
  1284
%\[
Chengsong
parents: 596
diff changeset
  1285
%\textit{nullable}(~r) = \neg \nullable(r)
Chengsong
parents: 596
diff changeset
  1286
%\]
Chengsong
parents: 596
diff changeset
  1287
%The derivative would be
Chengsong
parents: 596
diff changeset
  1288
%\[
Chengsong
parents: 596
diff changeset
  1289
%~r \backslash c = ~ (r \backslash c)
Chengsong
parents: 596
diff changeset
  1290
%\]
Chengsong
parents: 596
diff changeset
  1291
% 
Chengsong
parents: 596
diff changeset
  1292
%The most tricky part of lexing for the $~r$ regular expression
Chengsong
parents: 596
diff changeset
  1293
% is creating a value for it.
Chengsong
parents: 596
diff changeset
  1294
% For other regular expressions, the value aligns with the 
Chengsong
parents: 596
diff changeset
  1295
% structure of the regular expression:
Chengsong
parents: 596
diff changeset
  1296
% \[
Chengsong
parents: 596
diff changeset
  1297
% \vdash \Seq(\Char(a), \Char(b)) : a \cdot b
Chengsong
parents: 596
diff changeset
  1298
% \]
Chengsong
parents: 596
diff changeset
  1299
%But for the $~r$ regular expression, $s$ is a member of it if and only if
Chengsong
parents: 596
diff changeset
  1300
%$s$ does not belong to $L(r)$. 
Chengsong
parents: 596
diff changeset
  1301
%That means when there
Chengsong
parents: 596
diff changeset
  1302
%is a match for the not regular expression, it is not possible to generate how the string $s$ matched
Chengsong
parents: 596
diff changeset
  1303
%with $r$. 
Chengsong
parents: 596
diff changeset
  1304
%What we can do is preserve the information of how $s$ was not matched by $r$,
Chengsong
parents: 596
diff changeset
  1305
%and there are a number of options to do this.
Chengsong
parents: 596
diff changeset
  1306
%
Chengsong
parents: 596
diff changeset
  1307
%We could give a partial value when there is a partial match for the regular expression inside
Chengsong
parents: 596
diff changeset
  1308
%the $\mathbf{not}$ construct.
Chengsong
parents: 596
diff changeset
  1309
%For example, the string $ab$ is not in the language of $(a\cdot b) \cdot c$,
Chengsong
parents: 596
diff changeset
  1310
%A value for it could be 
Chengsong
parents: 596
diff changeset
  1311
% \[
Chengsong
parents: 596
diff changeset
  1312
% \vdash \textit{Not}(\Seq(\Char(a), \Char(b))) : ~((a \cdot b ) \cdot c)
Chengsong
parents: 596
diff changeset
  1313
% \]
Chengsong
parents: 596
diff changeset
  1314
% The above example demonstrates what value to construct 
Chengsong
parents: 596
diff changeset
  1315
% when the string $s$ is at most a real prefix
Chengsong
parents: 596
diff changeset
  1316
% of the strings in $L(r)$. When $s$ instead is not a prefix of any strings
Chengsong
parents: 596
diff changeset
  1317
% in $L(r)$, it becomes unclear what to return as a value inside the $\textit{Not}$
Chengsong
parents: 596
diff changeset
  1318
% constructor.
Chengsong
parents: 596
diff changeset
  1319
% 
Chengsong
parents: 596
diff changeset
  1320
% Another option would be to either store the string $s$ that resulted in 
Chengsong
parents: 596
diff changeset
  1321
% a mis-match for $r$ or a dummy value as a placeholder:
Chengsong
parents: 596
diff changeset
  1322
% \[
Chengsong
parents: 596
diff changeset
  1323
% \vdash \textit{Not}(abcd) : ~( r_1 )
Chengsong
parents: 596
diff changeset
  1324
% \]
Chengsong
parents: 596
diff changeset
  1325
%or
Chengsong
parents: 596
diff changeset
  1326
% \[
Chengsong
parents: 596
diff changeset
  1327
% \vdash \textit{Not}(\textit{Dummy}) : ~( r_1 )
Chengsong
parents: 596
diff changeset
  1328
% \] 
Chengsong
parents: 596
diff changeset
  1329
% We choose to implement this as it is most straightforward:
Chengsong
parents: 596
diff changeset
  1330
% \[
Chengsong
parents: 596
diff changeset
  1331
% \mkeps(~(r))  = \textit{if}(\nullable(r)) \; \textit{Error} \; \textit{else} \; \textit{Not}(\textit{Dummy})
Chengsong
parents: 596
diff changeset
  1332
% \]
Chengsong
parents: 596
diff changeset
  1333
% 
Chengsong
parents: 596
diff changeset
  1334
%
620
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1335
%\begin{center}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1336
%	\begin{tabular}{lcl}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1337
%		$\ntset \; r \; (n+1) \; c::cs $ & $\dn$ & $\nupdates \;
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1338
%		cs \; r \; [\Some \; ([c], n)]$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1339
%		$\ntset \; r\; 0 \; \_$ &  $\dn$ &  $\None$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1340
%		$\ntset \; r \; \_ \; [] $ & $ \dn$ & $[]$\\
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1341
%	\end{tabular}
ae6010c14e49 chap6 almost done
Chengsong
parents: 613
diff changeset
  1342
%\end{center}