lexing: ChengsongTanPhdThesis/Chapters/Finite.tex@2e05f04ed6b3 (annotated)

532 cc54ce075db5 restructured Chengsong parents: diff changeset	1	% Chapter Template
cc54ce075db5 restructured Chengsong parents: diff changeset	2
cc54ce075db5 restructured Chengsong parents: diff changeset	3	\chapter{Finiteness Bound} % Main chapter title
cc54ce075db5 restructured Chengsong parents: diff changeset	4
cc54ce075db5 restructured Chengsong parents: diff changeset	5	\label{Finite}
cc54ce075db5 restructured Chengsong parents: diff changeset	6	% In Chapter 4 \ref{Chapter4} we give the second guarantee
cc54ce075db5 restructured Chengsong parents: diff changeset	7	%of our bitcoded algorithm, that is a finite bound on the size of any
cc54ce075db5 restructured Chengsong parents: diff changeset	8	%regex's derivatives.
cc54ce075db5 restructured Chengsong parents: diff changeset	9
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	10
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	11	In this chapter we give a bound on the size of
624 8ffa28fce271 all comments incorporated!!+related work Chengsong parents: 620 diff changeset	12	the calculated derivatives:
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	13	given an annotated regular expression $a$, for any string $s$
624 8ffa28fce271 all comments incorporated!!+related work Chengsong parents: 620 diff changeset	14	the derivatives computed by our algorithm $\blexersimp$
8ffa28fce271 all comments incorporated!!+related work Chengsong parents: 620 diff changeset	15	are bounded in size
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	16	by a constant that depends only on $a$.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	17	Formally we show that there exists an $N_a$ such that
576 3e1b699696b6 thesis chap5 Chengsong parents: 564 diff changeset	18	\begin{center}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	19	$\llbracket \bderssimp{a}{s} \rrbracket \leq N_a$
576 3e1b699696b6 thesis chap5 Chengsong parents: 564 diff changeset	20	\end{center}
3e1b699696b6 thesis chap5 Chengsong parents: 564 diff changeset	21	\noindent
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	22	where the size ($\llbracket \_ \rrbracket$) of
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	23	an annotated regular expression is defined
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	24	in terms of the number of nodes in its
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	25	tree structure (its recursive definition is given in Figure~\ref{brexpSize}).
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	26	We believe this size bound
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	27	is important in the context of POSIX lexing because
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	28	\begin{itemize}
988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	29	\item
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	30	It is a stepping stone towards the goal
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	31	of eliminating ``catastrophic backtracking''.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	32	If the internal data structures used by our algorithm
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	33	grow beyond a finite bound, then clearly
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	34	the algorithm (which traverses these structures) will
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	35	be slow.
624 8ffa28fce271 all comments incorporated!!+related work Chengsong parents: 620 diff changeset	36	The next step is to refine the bound $N_a$ so that it
8ffa28fce271 all comments incorporated!!+related work Chengsong parents: 620 diff changeset	37	is not just finite but polynomial in $\llbracket a\rrbracket$.
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	38	\item
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	39	Having the finite bound formalised
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	40	gives us higher confidence that
bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	41	our simplification algorithm $\simp$ does not ``misbehave''
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	42	like $\textit{simpSL}$ does.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	43	The bound is universal for a given regular expression,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	44	which is an advantage over work which
624 8ffa28fce271 all comments incorporated!!+related work Chengsong parents: 620 diff changeset	45	only gives empirical evidence on
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	46	some test cases (see for example the work on Verbatim++ \cite{Verbatimpp}).
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	47	\end{itemize}
625 b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	48	\noindent
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	49	We then extend our $\blexersimp$
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	50	to support bounded repetitions ($r^{\{n\}}$).
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	51	We update our formalisation of
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	52	the correctness and finiteness properties to
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	53	include this new construct.
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	54	We show that we can outperform other verified lexers such as
625 b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	55	Verbatim++ on regular expressions with bounded repetitions.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	56
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	57	In the next section we describe in more detail
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	58	what the finite bound means in our algorithm
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	59	and why the size of the internal data structures of
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	60	a typical derivative-based lexer such as
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	61	Sulzmann and Lu's needs formal treatment.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	62
625 b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	63
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	64
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	65
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	66	\section{Formalising the Size Bound of Derivatives}
577 f47fc4840579 thesis chap2 Chengsong parents: 576 diff changeset	67	\noindent
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	68	In our lexer ($\blexersimp$),
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	69	we take an annotated regular expression as input,
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	70	and repeatedly take its derivative and simplify the result.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	71	\begin{figure}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	72	\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	73	\begin{tabular}{lcl}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	74	$\llbracket _{bs}\ONE \rrbracket$ & $\dn$ & $1$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	75	$\llbracket \ZERO \rrbracket$ & $\dn$ & $1$ \\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	76	$\llbracket _{bs} r_1 \cdot r_2 \rrbracket$ & $\dn$ & $\llbracket r_1 \rrbracket + \llbracket r_2 \rrbracket + 1$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	77	$\llbracket _{bs}\mathbf{c} \rrbracket $ & $\dn$ & $1$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	78	$\llbracket _{bs}\sum as \rrbracket $ & $\dn$ & $\textit{sum}\;(\map \; (\llbracket \_ \rrbracket)\; as) + 1$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	79	$\llbracket _{bs} a^* \rrbracket $ & $\dn$ & $\llbracket a \rrbracket + 1$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	80	\end{tabular}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	81	\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	82	\caption{The size function of bitcoded regular expressions}\label{brexpSize}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	83	\end{figure}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	84
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	85	\begin{figure}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	86	\begin{tikzpicture}[scale=2,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	87	every node/.style={minimum size=11mm},
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	88	->,>=stealth',shorten >=1pt,auto,thick
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	89	]
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	90	\node (r0) [rectangle, draw=black, thick, minimum size = 5mm, draw=blue] {$a$};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	91	\node (r1) [rectangle, draw=black, thick, right=of r0, minimum size = 7mm]{$a_1$};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	92	\draw[->,line width=0.2mm](r0)--(r1) node[above,midway] {$\backslash c_1$};
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	93
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	94	\node (r1s) [rectangle, draw=blue, thick, right=of r1, minimum size=6mm]{$a_{1s}$};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	95	\draw[->, line width=0.2mm](r1)--(r1s) node[above, midway] {$\simp$};
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	96
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	97	\node (r2) [rectangle, draw=black, thick, right=of r1s, minimum size = 12mm]{$a_2$};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	98	\draw[->,line width=0.2mm](r1s)--(r2) node[above,midway] {$\backslash c_2$};
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	99
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	100	\node (r2s) [rectangle, draw = blue, thick, right=of r2,minimum size=6mm]{$a_{2s}$};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	101	\draw[->,line width=0.2mm](r2)--(r2s) node[above,midway] {$\simp$};
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	102
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	103	\node (rns) [rectangle, draw = blue, thick, right=of r2s,minimum size=6mm]{$a_{ns}$};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	104	\draw[->,line width=0.2mm, dashed](r2s)--(rns) node[above,midway] {$\backslash \ldots$};
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	105
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	106	\node (v) [circle, thick, draw, right=of rns, minimum size=6mm, right=1.7cm]{$v$};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	107	\draw[->, line width=0.2mm](rns)--(v) node[above, midway] {\bmkeps} node [below, midway] {\decode};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	108	\end{tikzpicture}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	109	\caption{Regular expression size change during our $\blexersimp$ algorithm}\label{simpShrinks}
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	110	\end{figure}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	111
576 3e1b699696b6 thesis chap5 Chengsong parents: 564 diff changeset	112	\noindent
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	113	Each time
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	114	a derivative is taken, the regular expression might grow.
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	115	However, the simplification applied immediately afterwards will often shrink it so that
80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	116	the overall size of the derivatives stays relatively small.
577 f47fc4840579 thesis chap2 Chengsong parents: 576 diff changeset	117	This intuition is depicted in Figure~\ref{simpShrinks} by the relative size
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	118	change between the black and blue nodes:
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	119	after $\simp$ the node shrinks.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	120	Our proof states that all the blue nodes
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	121	stay below a size bound $N_a$ determined by the input $a$.
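
\noindent
In the notation of Figure~\ref{simpShrinks}, and assuming the input string is
$s = c_1 c_2 \ldots c_n$, each blue node $a_{is}$ is the simplified derivative
of $a$ with respect to the prefix $c_1 \ldots c_i$, so the claim can be read as
\begin{center}
$\llbracket a_{is} \rrbracket = \llbracket \bderssimp{a}{(c_1 \ldots c_i)} \rrbracket \leq N_a
\quad \mbox{for all}\; 1 \leq i \leq n$.
\end{center}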
576 3e1b699696b6 thesis chap5 Chengsong parents: 564 diff changeset	122
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	123	\noindent
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	124	Sulzmann and Lu assumed a similar picture for their algorithm,
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	125	though in fact the size of their derivatives might be better depicted by the following graph:
988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	126	\begin{figure}[H]
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	127	\begin{tikzpicture}[scale=2,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	128	every node/.style={minimum size=11mm},
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	129	->,>=stealth',shorten >=1pt,auto,thick
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	130	]
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	131	\node (r0) [rectangle, draw=black, thick, minimum size = 5mm, draw=blue] {$a$};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	132	\node (r1) [rectangle, draw=black, thick, right=of r0, minimum size = 7mm]{$a_1$};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	133	\draw[->,line width=0.2mm](r0)--(r1) node[above,midway] {$\backslash c_1$};
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	134
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	135	\node (r1s) [rectangle, draw=blue, thick, right=of r1, minimum size=7mm]{$a_{1s}$};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	136	\draw[->, line width=0.2mm](r1)--(r1s) node[above, midway] {$\simp'$};
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	137
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	138	\node (r2) [rectangle, draw=black, thick, right=of r1s, minimum size = 17mm]{$a_2$};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	139	\draw[->,line width=0.2mm](r1s)--(r2) node[above,midway] {$\backslash c_2$};
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	140
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	141	\node (r2s) [rectangle, draw = blue, thick, right=of r2,minimum size=14mm]{$a_{2s}$};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	142	\draw[->,line width=0.2mm](r2)--(r2s) node[above,midway] {$\simp'$};
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	143
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	144	\node (r3) [rectangle, draw = black, thick, right= of r2s, minimum size = 22mm]{$a_3$};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	145	\draw[->,line width=0.2mm](r2s)--(r3) node[above,midway] {$\backslash c_3$};
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	146
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	147	\node (rns) [right = of r3, draw=blue, minimum size = 20mm]{$a_{3s}$};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	148	\draw[->,line width=0.2mm] (r3)--(rns) node [above, midway] {$\simp'$};
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	149
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	150	\node (rnn) [right = of rns, minimum size = 1mm]{};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	151	\draw[->, dashed] (rns)--(rnn) node [above, midway] {$\ldots$};
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	152
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	153	\end{tikzpicture}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	154	\caption{Regular expression size change during Sulzmann and Lu's lexing algorithm}\label{sulzShrinks}
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	155	\end{figure}
988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	156	\noindent
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	157	The picture means that in some cases their lexer (where they use $\simpsulz$
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	158	as the simplification function)
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	159	will have a size explosion, causing the running time
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	160	of each derivative step to grow continuously (for example
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	161	in \ref{SulzmannLuLexerTime}).
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	162	They tested out the run time of their
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	163	lexer on particular examples such as $(a+b+ab)^*$
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	164	and claimed that their algorithm is linear w.r.t.\ the input.
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	165	With our mechanised proof, we avoid this type of unintentional
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	166	generalisation.
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	167
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	168	Before delving into the details of the formalisation,
bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	169	we are going to provide an overview of it in the following subsection.
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	170
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	171
577 f47fc4840579 thesis chap2 Chengsong parents: 576 diff changeset	172	\subsection{Overview of the Proof}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	173	A high-level overview of the main components of the finiteness proof
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	174	is as follows:
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	175	\begin{figure}[H]
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	176	\begin{tikzpicture}[scale=1,font=\bf,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	177	node/.style={
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	178	rectangle,rounded corners=3mm,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	179	ultra thick,draw=black!50,minimum height=18mm,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	180	minimum width=20mm,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	181	top color=white,bottom color=black!20}]
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	182
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	183
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	184	\node (0) at (-5,0)
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	185	[node, text width=1.8cm, text centered]
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	186	{$\llbracket \bderssimp{a}{s} \rrbracket$};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	187	\node (A) at (0,0)
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	188	[node,text width=1.6cm, text centered]
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	189	{$\llbracket \rderssimp{r}{s} \rrbracket_r$};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	190	\node (B) at (3,0)
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	191	[node,text width=3.0cm, anchor=west, minimum width = 40mm]
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	192	{$\llbracket \textit{ClosedForm}(r, s)\rrbracket_r$};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	193	\node (C) at (9.5,0) [node, minimum width=10mm] {$N_r$};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	194
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	195	\draw [->,line width=0.5mm] (0) --
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	196	node [above,pos=0.45] {=} (A) node [below, pos = 0.45] {$(r = a \downarrow_r)$} (A);
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	197	\draw [->,line width=0.5mm] (A) --
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	198	node [above,pos=0.35] {$\quad =\ldots=$} (B);
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	199	\draw [->,line width=0.5mm] (B) --
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	200	node [above,pos=0.35] {$\quad \leq \ldots \leq$} (C);
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	201	\end{tikzpicture}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	202	%\caption{
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	203	\end{figure}
576 3e1b699696b6 thesis chap5 Chengsong parents: 564 diff changeset	204	\noindent
577 f47fc4840579 thesis chap2 Chengsong parents: 576 diff changeset	205	We explain the steps one by one:
532 cc54ce075db5 restructured Chengsong parents: diff changeset	206	\begin{itemize}
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	207	\item
988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	208	We first introduce operations such as
988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	209	derivatives, simplification and size calculation
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	210	associated with $\rrexp$s, which we have introduced
659 2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	211	in Chapter~\ref{Bitcoded2}. As promised, we will discuss
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	212	why they are needed in Section~\ref{whyRerase}.
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	213	The operations on $\rrexp$s are identical to those on
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	214	annotated regular expressions except that they dispense with
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	215	bitcodes. This means that all proofs about size of $\rrexp$s will apply to
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	216	annotated regular expressions, because the size of a regular
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	217	expression is independent of the bitcodes.
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	218	\item
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	219	We prove that $\rderssimp{r}{s} = \textit{ClosedForm}(r, s)$,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	220	where $\textit{ClosedForm}(r, s)$ is given entirely
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	221	in terms of the derivatives of the children of
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	222	$r$.
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	223	We call the right-hand-side the \emph{Closed Form}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	224	of the derivative $\rderssimp{r}{s}$.
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	225	\item
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	226	Formally we give an estimate of
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	227	$\llbracket \textit{ClosedForm}(r, s) \rrbracket_r$.
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	228	The key observation is that $\distinctBy$'s output is
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	229	a list with a constant length bound.
532 cc54ce075db5 restructured Chengsong parents: diff changeset	230	\end{itemize}
594 62f8fa03863e more Chengsong parents: 593 diff changeset	231	We will expand on these steps in the following sections.
532 cc54ce075db5 restructured Chengsong parents: diff changeset	232
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	233	\section{The $\textit{Rrexp}$ Datatype}
594 62f8fa03863e more Chengsong parents: 593 diff changeset	234	The first step is to define
62f8fa03863e more Chengsong parents: 593 diff changeset	235	$\textit{rrexp}$s.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	236	They are annotated regular expressions without bitcodes,
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	237	allowing a more convenient size bound proof.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	238	%Of course, the bits which encode the lexing information
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	239	%would grow linearly with respect
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	240	%to the input, which should be taken into accounte when we wish to tackle the runtime complexity.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	241	%But for the sake of the structural size
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	242	%we can safely ignore them.\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	243	The datatype
594 62f8fa03863e more Chengsong parents: 593 diff changeset	244	$\rrexp$ of
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	245	\emph{r-regular expressions}
594 62f8fa03863e more Chengsong parents: 593 diff changeset	246	was initially defined in \ref{rrexpDef}.
62f8fa03863e more Chengsong parents: 593 diff changeset	247	The prefix $r$ serves
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	248	to distinguish them
594 62f8fa03863e more Chengsong parents: 593 diff changeset	249	from basic regular expressions.
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	250	We repeat the definition of $\rrexp$ here.
576 3e1b699696b6 thesis chap5 Chengsong parents: 564 diff changeset	251	\[ \rrexp ::= \RZERO \mid \RONE
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	252	\mid \RCHAR{c}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	253	\mid \RSEQ{r_1}{r_2}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	254	\mid \RALTS{rs}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	255	\mid \RSTAR{r}
576 3e1b699696b6 thesis chap5 Chengsong parents: 564 diff changeset	256	\]
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	257	The size of an r-regular expression is
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	258	written $\llbracket r\rrbracket_r$,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	259	whose definition mirrors that of an annotated regular expression.
576 3e1b699696b6 thesis chap5 Chengsong parents: 564 diff changeset	260	\begin{center}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	261	\begin{tabular}{lcl}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	262	$\llbracket \RONE \rrbracket_r$ & $\dn$ & $1$\\
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	263	$\llbracket \RZERO \rrbracket_r$ & $\dn$ & $1$ \\
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	264	$\llbracket \RSEQ{r_1}{r_2} \rrbracket_r$ & $\dn$ & $\llbracket r_1 \rrbracket_r + \llbracket r_2 \rrbracket_r + 1$\\
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	265	$\llbracket \RCHAR{c} \rrbracket_r $ & $\dn$ & $1$\\
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	266	$\llbracket \RALTS{rs} \rrbracket_r $ & $\dn$ & $\textit{sum}\;(\map \; (\llbracket \_ \rrbracket_r)\; rs) + 1$\\
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	267	$\llbracket \RSTAR{r} \rrbracket_r $ & $\dn$ & $\llbracket r \rrbracket_r + 1$.
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	268	\end{tabular}
576 3e1b699696b6 thesis chap5 Chengsong parents: 564 diff changeset	269	\end{center}
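
\noindent
As a small worked example (the characters $a$ and $b$ are chosen only for
illustration, and the alternative clause is read as summing the sizes of
the list elements), the definition unfolds as follows:
\begin{center}
\begin{tabular}{lcl}
$\llbracket \RSEQ{\RCHAR{a}}{\RCHAR{b}} \rrbracket_r$ & $=$ & $1 + 1 + 1 = 3$\\
$\llbracket \RSTAR{\RSEQ{\RCHAR{a}}{\RCHAR{b}}} \rrbracket_r$ & $=$ & $3 + 1 = 4$\\
$\llbracket \RALTS{[\RCHAR{a}, \RSEQ{\RCHAR{a}}{\RCHAR{b}}]} \rrbracket_r$ & $=$ & $(1 + 3) + 1 = 5$
\end{tabular}
\end{center}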
3e1b699696b6 thesis chap5 Chengsong parents: 564 diff changeset	270	\noindent
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	271	The subscript $r$ in $\llbracket \_ \rrbracket_r$ serves to
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	272	distinguish it from the size operation on annotated regular expressions.
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	273	Similar subscripts are used for other operations such as $\rerase{}$:
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	274	\begin{center}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	275	\begin{tabular}{lcl}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	276	$\rerase{\ZERO}$ & $\dn$ & $\RZERO$\\
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	277	$\rerase{_{bs}\ONE}$ & $\dn$ & $\RONE$\\
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	278	$\rerase{_{bs}\mathbf{c}}$ & $\dn$ & $\RCHAR{c}$\\
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	279	$\rerase{_{bs}r_1\cdot r_2}$ & $\dn$ & $\RSEQ{\rerase{r_1}}{\rerase{r_2}}$\\
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	280	$\rerase{_{bs}\sum as}$ & $\dn$ & $\RALTS{\map \; \rerase{\_} \; as}$\\
$\rerase{_{bs} a^*}$ & $\dn$ & $\rerase{a}^*$
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	282	\end{tabular}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	283	\end{center}
594 62f8fa03863e more Chengsong parents: 593 diff changeset	284
659 2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	285	\subsection{Why a New Datatype?}\label{whyRerase}
\marginpar{\em added label so this section can be referenced by other parts of the thesis,
so that interested readers can jump here or be reassured that an explanation will follow.}
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	288	Originally the erase operation $(\_)_\downarrow$ was
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	289	used by Ausaf et al. in their proofs related to $\blexer$.
This function is not part of the lexing algorithm; its sole purpose is to
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	291	bridge the gap between the $r$
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	292	%$\textit{rexp}$
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	293	(un-annotated) and $\textit{arexp}$ (annotated)
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	294	regular expression datatypes so as to leverage the correctness
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	295	theorem of $\lexer$.%to establish the correctness of $\blexer$.
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	296	For example, lemma \ref{retrieveStepwise} %and \ref{bmkepsRetrieve}
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	297	uses $\erase$ to convert an annotated regular expression $a$ into
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	298	a plain one so that it can be used by $\inj$ to create the desired value
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	299	$\inj\; (a)_\downarrow \; c \; v$.
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	300
Ideally $\erase$ should remove only the bitcodes, that is, the auxiliary
information that is not related to the structure.
However, there is a complication:
the alternative constructors of $\textit{arexp}$
and $\textit{r}$ have different arities:
576 3e1b699696b6 thesis chap5 Chengsong parents: 564 diff changeset	306	\begin{center}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	307	\begin{tabular}{lcl}
659 2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	308	$\textit{r}$ & $::=$ & $\ldots \;\|\; (\_ + \_) \; ::\; "\textit{r} \Rightarrow \textit{r} \Rightarrow \textit{r}" \| \ldots$\\
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	309	$\textit{arexp}$ & $::=$ & $\ldots\; \|\; (\Sigma \_ ) \; ::\; "\textit{arexp} \; list \Rightarrow \textit{arexp}" \| \ldots$
594 62f8fa03863e more Chengsong parents: 593 diff changeset	310	\end{tabular}
576 3e1b699696b6 thesis chap5 Chengsong parents: 564 diff changeset	311	\end{center}
594 62f8fa03863e more Chengsong parents: 593 diff changeset	312	\noindent
To convert an $\textit{arexp}$ into an $\textit{r}$,
$\erase$ has to recursively disassemble the list into nested binary applications of the
$(\_ + \_)$ operator,
handling corner cases such as empty or
singleton alternative lists:
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	318	%becomes $r$ during the
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	319	%$\erase$ function.
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	320	%The annotated regular expression $\sum[a, b, c]$ would turn into
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	321	%$(a+(b+c))$.
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	322	\begin{center}
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	323	\begin{tabular}{lcl}
$ (_{bs}\sum [])_\downarrow $ & $\dn$ & $\ZERO$\\
$ (_{bs}\sum [a])_\downarrow$ & $\dn$ & $(a)_\downarrow$\\
$ (_{bs}\sum [a_1, a_2])_\downarrow$ & $\dn$ & $(a_1)_\downarrow + (a_2)_\downarrow$\\
$ (_{bs}\sum a :: as)_\downarrow$ & $\dn$ & $(a)_\downarrow + (_{[]}\sum as)_\downarrow$
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	328	\end{tabular}
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	329	\end{center}
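\noindent
As a small illustration of these clauses (the three-element list is our own example, chosen only to show the unfolding), a list of three alternatives is turned into right-nested binary alternatives, using the last clause once and then the two-element clause:
\begin{center}
\begin{tabular}{lcl}
$(_{bs}\sum [a_1, a_2, a_3])_\downarrow$ & $=$ & $(a_1)_\downarrow + (_{[]}\sum [a_2, a_3])_\downarrow$\\
& $=$ & $(a_1)_\downarrow + ((a_2)_\downarrow + (a_3)_\downarrow)$
\end{tabular}
\end{center}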
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	330	\noindent
These operations inevitably change the structure and size of
an annotated regular expression. For example, if $x$ has size 1, then
$a_1 = {}_{Z}\sum [x]$ has size 2, but $(a_1)_\downarrow = (x)_\downarrow$
only has size 1.
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	335	%adding unnecessary
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	336	%complexities to the size bound proof.
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	337	%The reason we
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	338	%define a new datatype is that
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	339	%the $\erase$ function
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	340	%does not preserve the structure of annotated
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	341	%regular expressions.
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	342	%We initially started by using
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	343	%plain regular expressions and tried to prove
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	344	%lemma \ref{rsizeAsize},
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	345	%however the $\erase$ function messes with the structure of the
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	346	%annotated regular expression.
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	347	%The $+$ constructor
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	348	%of basic regular expressions is only binary, whereas $\sum$
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	349	%takes a list. Therefore we need to convert between
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	350	%annotated and normal regular expressions as follows:
If we define the size of a plain regular expression
in the usual way,
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	353	\begin{center}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	354	\begin{tabular}{lcl}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	355	$\llbracket \ONE \rrbracket_p$ & $\dn$ & $1$\\
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	356	$\llbracket \ZERO \rrbracket_p$ & $\dn$ & $1$ \\
659 2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	357	$\llbracket r_1 + r_2 \rrbracket_p$ & $\dn$ & $\llbracket r_1 \rrbracket_p + \llbracket r_2 \rrbracket_p + 1$\\
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	358	$\llbracket \mathbf{c} \rrbracket_p $ & $\dn$ & $1$\\
$\llbracket r_1 \cdot r_2 \rrbracket_p $ & $\dn$ & $\llbracket r_1 \rrbracket_p + \llbracket r_2 \rrbracket_p + 1$\\
$\llbracket r^* \rrbracket_p $ & $\dn$ & $\llbracket r \rrbracket_p + 1$
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	361	\end{tabular}
532 cc54ce075db5 restructured Chengsong parents: diff changeset	362	\end{center}
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	363	\noindent
594 62f8fa03863e more Chengsong parents: 593 diff changeset	364	Then the property
532 cc54ce075db5 restructured Chengsong parents: diff changeset	365	\begin{center}
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	366	$\llbracket a \rrbracket \stackrel{?}{=} \llbracket a_\downarrow \rrbracket_p$
532 cc54ce075db5 restructured Chengsong parents: diff changeset	367	\end{center}
594 62f8fa03863e more Chengsong parents: 593 diff changeset	368	does not hold.
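
To make the mismatch concrete, consider an example of our own choosing. We assume here that the size of an annotated alternative is one plus the sizes of its children, as for r-regular expressions. Let $a = {}_{bs}\sum [a_1, a_2, a_3]$, where each $a_i$ is an annotated character and therefore has size 1. Then
\begin{center}
\begin{tabular}{lcl}
$\llbracket a \rrbracket$ & $=$ & $1 + 1 + 1 + 1 \;=\; 4$\\
$\llbracket a_\downarrow \rrbracket_p$ & $=$ & $\llbracket (a_1)_\downarrow + ((a_2)_\downarrow + (a_3)_\downarrow) \rrbracket_p \;=\; 1 + (1 + 1 + 1) + 1 \;=\; 5$
\end{tabular}
\end{center}
\noindent
so erasing can increase the size (each extra list element costs an additional binary $+$ node), whereas for a singleton list it decreases the size.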
659 2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	369	%With $\textit{rerase}$, however,
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	370	%only the bitcodes are thrown away.
This led us to define a new regular expression datatype, without
bitcodes but with a list-based alternative constructor, together with a new erase function
that strictly preserves the structure:
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	374	\begin{center}
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	375	\begin{tabular}{lcl}
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	376	$\textit{rrexp}$ & $::=$ & $\ldots\; \|\; (\sum \_ ) \; ::\; "\textit{rrexp} \; list \Rightarrow \textit{rrexp}" \| \ldots$\\
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	377	$\rerase{_{bs}\sum as}$ & $\dn$ & $\RALTS{\map \; \rerase{\_} \; as}$\\
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	378	\end{tabular}
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	379	\end{center}
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	380	\noindent
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	381	%But
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	382	%Everything about the structure remains intact.
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	383	%Therefore it does not change the size
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	384	%of an annotated regular expression and we have:
One might instead try to prove an inequality such as
$\llbracket a \rrbracket \leq \llbracket a_\downarrow \rrbracket_p $
and then estimate $\llbracket a_\downarrow \rrbracket_p$.
Note, however, that the singleton example above already violates this particular
inequality, so additional assumptions on $a$ would be needed;
we found our approach more straightforward.
532 cc54ce075db5 restructured Chengsong parents: diff changeset	390
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	391	\subsection{Functions for R-regular Expressions}
The downside of our approach is that we need to redefine
several functions for $\rrexp$.
In this section we define the r-regular expression versions
of $\bder$ and of the $\textit{bsimp}$-related functions.
We use $r$ as a prefix or subscript to distinguish them
from the bitcoded versions.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	398	%For example,$\backslash_r$, $\rdistincts$, and $\rsimp$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	399	%as opposed to $\backslash$, $\distinctBy$, and $\bsimp$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	400	%As promised, they are much simpler than their bitcoded counterparts.
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	401	%The operations on r-regular expressions are
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	402	%almost identical to those of the annotated regular expressions,
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	403	%except that no bitcodes are used. For example,
The derivative operation for an r-regular expression is
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	405	\begin{center}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	406	\begin{tabular}{@{}lcl@{}}
$(\ZERO)\,\backslash_r c$ & $\dn$ & $\ZERO$\\
$(\ONE)\,\backslash_r c$ & $\dn$ & $\ZERO$\\
$(\mathbf{d})\,\backslash_r c$ & $\dn$ &
$\textit{if}\;c=d\; \;\textit{then}\;
\ONE\;\textit{else}\;\ZERO$\\
$(\sum \;\textit{rs})\,\backslash_r c$ & $\dn$ &
$\sum\;(\textit{map} \; (\_\backslash_r c) \; rs )$\\
$(r_1\cdot r_2)\,\backslash_r c$ & $\dn$ &
$\textit{if}\;(\textit{rnullable}\,r_1)$\\
& &$\textit{then}\;\sum\,[(r_1\,\backslash_r c)\cdot\,r_2,$\\
& &$\phantom{\textit{then}\;\sum\,[}(r_2\,\backslash_r c)]$\\
& &$\textit{else}\;\,(r_1\,\backslash_r c)\cdot r_2$\\
$(r^*)\,\backslash_r c$ & $\dn$ &
$( r\,\backslash_r c)\cdot r^*$
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	421	\end{tabular}
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	422	\end{center}
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	423	\noindent
where we omit the definition of $\textit{rnullable}$.
The generalisation from derivatives with respect to a character to
derivatives with respect to strings is given as
620 ae6010c14e49 chap6 almost done Chengsong parents: 618 diff changeset	427	\begin{center}
ae6010c14e49 chap6 almost done Chengsong parents: 618 diff changeset	428	\begin{tabular}{lcl}
ae6010c14e49 chap6 almost done Chengsong parents: 618 diff changeset	429	$r \backslash_{rs} []$ & $\dn$ & $r$\\
ae6010c14e49 chap6 almost done Chengsong parents: 618 diff changeset	430	$r \backslash_{rs} c::s$ & $\dn$ & $(r\backslash_r c) \backslash_{rs} s$
ae6010c14e49 chap6 almost done Chengsong parents: 618 diff changeset	431	\end{tabular}
ae6010c14e49 chap6 almost done Chengsong parents: 618 diff changeset	432	\end{center}
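
\noindent
As an illustration, here is a small example of our own: the derivative of $\mathbf{a}^* \cdot \mathbf{b}$ with respect to the character $a$, where we assume $a \neq b$ and use that $\mathbf{a}^*$ is nullable. The derivative unfolds as
\begin{center}
\begin{tabular}{lcl}
$(\mathbf{a}^* \cdot \mathbf{b})\,\backslash_r a$ & $=$ & $\sum\,[(\mathbf{a}^*\,\backslash_r a)\cdot \mathbf{b},\; \mathbf{b}\,\backslash_r a]$\\
& $=$ & $\sum\,[((\mathbf{a}\,\backslash_r a)\cdot \mathbf{a}^*)\cdot \mathbf{b},\; \ZERO]$\\
& $=$ & $\sum\,[(\ONE \cdot \mathbf{a}^*)\cdot \mathbf{b},\; \ZERO]$
\end{tabular}
\end{center}
\noindent
The result contains a redundant $\ONE$ and a redundant $\ZERO$; without simplification such redundancies accumulate with every further derivative step.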
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	433
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	434	The function $\distinctBy$ for r-regular expressions does not need
594 62f8fa03863e more Chengsong parents: 593 diff changeset	435	a function checking equivalence because
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	436	there are no bit annotations.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	437	Therefore we have
532 cc54ce075db5 restructured Chengsong parents: diff changeset	438	\begin{center}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	439	\begin{tabular}{lcl}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	440	$\rdistinct{[]}{rset} $ & $\dn$ & $[]$\\
594 62f8fa03863e more Chengsong parents: 593 diff changeset	441	$\rdistinct{r :: rs}{rset}$ & $\dn$ &
62f8fa03863e more Chengsong parents: 593 diff changeset	442	$\textit{if}(r \in \textit{rset}) \; \textit{then} \; \rdistinct{rs}{rset}$\\
62f8fa03863e more Chengsong parents: 593 diff changeset	443	& & $\textit{else}\; \;
62f8fa03863e more Chengsong parents: 593 diff changeset	444	r::\rdistinct{rs}{(rset \cup \{r\})}$
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	445	\end{tabular}
532 cc54ce075db5 restructured Chengsong parents: diff changeset	446	\end{center}
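\noindent
For instance, in an example of our own with two distinct r-regular expressions $r_1 \neq r_2$, the function keeps only the first occurrence of each element, threading the accumulator set through the recursion:
\begin{center}
\begin{tabular}{lcl}
$\rdistinct{[r_1, r_2, r_1]}{\varnothing}$ & $=$ & $r_1 :: \rdistinct{[r_2, r_1]}{\{r_1\}}$\\
& $=$ & $r_1 :: r_2 :: \rdistinct{[r_1]}{\{r_1, r_2\}}$\\
& $=$ & $r_1 :: r_2 :: \rdistinct{[]}{\{r_1, r_2\}}$\\
& $=$ & $[r_1, r_2]$
\end{tabular}
\end{center}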
cc54ce075db5 restructured Chengsong parents: diff changeset	447	%TODO: definition of rsimp (maybe only the alternative clause)
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	448	\noindent
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	449	%We would like to make clear
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	450	%a difference between our $\rdistincts$ and
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	451	%the Isabelle $\textit {distinct}$ predicate.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	452	%In Isabelle $\textit{distinct}$ is a function that returns a boolean
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	453	%rather than a list.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	454	%It tests if all the elements of a list are unique.\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	455	With $\textit{rdistinct}$ in place,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	456	the flatten function for $\rrexp$ is as follows:
595 fa92124d1fb7 more Chengsong parents: 594 diff changeset	457	\begin{center}
fa92124d1fb7 more Chengsong parents: 594 diff changeset	458	\begin{tabular}{@{}lcl@{}}
$\textit{rflts} \; [\,]$ & $\dn$ & $[\,]$ \\
$\textit{rflts} \; ((\sum \textit{as}) :: \textit{as'})$ & $\dn$ & $\textit{as} \; @ \; \textit{rflts} \; \textit{as'} $ \\
$\textit{rflts} \; (\ZERO :: \textit{as'})$ & $\dn$ & $ \textit{rflts} \; \textit{as'} $ \\
$\textit{rflts} \; (a :: \textit{as'})$ & $\dn$ & $a :: \textit{rflts} \; \textit{as'}$ \quad(otherwise)
fa92124d1fb7 more Chengsong parents: 594 diff changeset	462	\end{tabular}
fa92124d1fb7 more Chengsong parents: 594 diff changeset	463	\end{center}
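\noindent
As a small example of our own, let $r_3$ be neither $\ZERO$ nor an alternative, and take $\textit{rflts}$ of the empty list to be the empty list. Then the $\ZERO$ is dropped and the nested alternative is opened up by one level:
\begin{center}
\begin{tabular}{lcl}
$\textit{rflts} \; [\ZERO, \; \sum [r_1, r_2], \; r_3]$ & $=$ & $\textit{rflts} \; [\sum [r_1, r_2], \; r_3]$\\
& $=$ & $[r_1, r_2] \; @ \; \textit{rflts} \; [r_3]$\\
& $=$ & $[r_1, r_2, r_3]$
\end{tabular}
\end{center}
\noindent
Note that $r_1$ and $r_2$ are copied as they are: if one of them is itself an alternative, it is not flattened further by this call.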
fa92124d1fb7 more Chengsong parents: 594 diff changeset	464	\noindent
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	465	The function
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	466	$\rsimpalts$ corresponds to $\textit{bsimp}_{ALTS}$:
596 b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	467	\begin{center}
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	468	\begin{tabular}{@{}lcl@{}}
$\rsimpalts \;\; [\,]$ & $\dn$ & $\RZERO$\\
$\rsimpalts \;\; [r]$ & $\dn$ & $r$\\
$\rsimpalts \;\; rs$ & $\dn$ & $\sum rs$ \quad(otherwise)
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	472	\end{tabular}
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	473	\end{center}
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	474	\noindent
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	475	Similarly, we have $\rsimpseq$ which corresponds to $\textit{bsimp}_{SEQ}$:
596 b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	476	\begin{center}
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	477	\begin{tabular}{@{}lcl@{}}
$\rsimpseq \;\; \RZERO \; \_ $ & $\dn$ & $\RZERO$\\
$\rsimpseq \;\; \_ \; \RZERO $ & $\dn$ & $\RZERO$\\
$\rsimpseq \;\; \RONE \; r_2$ & $\dn$ & $r_2$\\
$\rsimpseq \;\; r_1 \; r_2$ & $\dn$ & $r_1 \cdot r_2$ \quad(otherwise)
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	482	\end{tabular}
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	483	\end{center}
\noindent
Using these functions we define $\textit{rsimp}$ and $\rderssimp{\_}{\_}$:
595 fa92124d1fb7 more Chengsong parents: 594 diff changeset	485	\begin{center}
fa92124d1fb7 more Chengsong parents: 594 diff changeset	486	\begin{tabular}{@{}lcl@{}}
fa92124d1fb7 more Chengsong parents: 594 diff changeset	487
$\textit{rsimp} \; (r_1\cdot r_2)$ & $\dn$ & $ \textit{rsimp}_{SEQ} \;(\textit{rsimp} \; r_1) \; (\textit{rsimp} \; r_2) $ \\
$\textit{rsimp} \; (\sum \textit{rs})$ & $\dn$ & $\textit{rsimp}_{ALTS} \; (\textit{rdistinct} \; ( \textit{rflts} \; ( \textit{map} \; \textit{rsimp} \; rs)) \; \varnothing) $ \\
$\textit{rsimp} \; r$ & $\dn$ & $\textit{r} \qquad \textit{otherwise}$
595 fa92124d1fb7 more Chengsong parents: 594 diff changeset	491	\end{tabular}
fa92124d1fb7 more Chengsong parents: 594 diff changeset	492	\end{center}
596 b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	493	\begin{center}
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	494	\begin{tabular}{@{}lcl@{}}
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	495	$r\backslash_{rsimp} \, c$ & $\dn$ & $\rsimp \; (r\backslash_r \, c)$
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	496	\end{tabular}
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	497	\end{center}
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	498
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	499	\begin{center}
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	500	\begin{tabular}{@{}lcl@{}}
601 ce4e5151a836 more Chengsong parents: 596 diff changeset	501	$r \backslash_{rsimps} \; \; c\!::\!s $ & $\dn$ & $(r \backslash_{rsimp}\, c) \backslash_{rsimps}\, s$ \\
596 b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	502	$r \backslash_{rsimps} [\,] $ & $\dn$ & $r$
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	503	\end{tabular}
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	504	\end{center}
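\noindent
To see the simplification at work, consider an example of our own: the regular expression $\mathbf{a}^* \cdot \mathbf{b}$ and the character $a$ (with $a \neq b$). Its unsimplified derivative is $\sum\,[(\ONE \cdot \mathbf{a}^*)\cdot \mathbf{b},\; \ZERO]$, which has size 8. Simplifying it proceeds as follows:
\begin{center}
\begin{tabular}{lcl}
$\textit{rsimp} \; ((\ONE \cdot \mathbf{a}^*)\cdot \mathbf{b})$ & $=$ & $\rsimpseq \; (\rsimpseq \; \ONE \; \mathbf{a}^*) \; \mathbf{b} \;=\; \mathbf{a}^* \cdot \mathbf{b}$\\
$\textit{rflts} \; [\mathbf{a}^* \cdot \mathbf{b}, \; \ZERO]$ & $=$ & $[\mathbf{a}^* \cdot \mathbf{b}]$\\
$\rdistinct{[\mathbf{a}^* \cdot \mathbf{b}]}{\varnothing}$ & $=$ & $[\mathbf{a}^* \cdot \mathbf{b}]$\\
$\rsimpalts \; [\mathbf{a}^* \cdot \mathbf{b}]$ & $=$ & $\mathbf{a}^* \cdot \mathbf{b}$
\end{tabular}
\end{center}
\noindent
Hence $(\mathbf{a}^* \cdot \mathbf{b}) \backslash_{rsimp} \, a = \mathbf{a}^* \cdot \mathbf{b}$, which has size 4. In this example every further derivative with respect to $a$ yields the same regular expression again, so the size does not grow with the length of the input.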
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	505	\noindent
We do not define an r-regular expression version of $\blexersimp$,
as our proof does not depend on it.
We are now ready to show how r-regular expressions allow
us to prove the size bound on bitcoded regular expressions.
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	510
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	511	\subsection{Using R-regular Expressions to Bound Bit-coded Regular Expressions}
The size of an annotated regular expression, before or after applying
$\bsimp$ or $\backslash_{bsimps}$,
equals the size of its erased r-regular expression, before or after applying
$\rsimp$ or $\backslash_{rsimps}$, respectively:
564 3cbcd7cda0a9 more Chengsong parents: 562 diff changeset	516	\begin{lemma}\label{sizeRelations}
659 2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	517	The following equalities hold:
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	518	\begin{itemize}
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	519	\item
659 2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	520	$\rsize{\rerase a} = \asize a$
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	521	\item
601 ce4e5151a836 more Chengsong parents: 596 diff changeset	522	$\asize{\bsimps \; a} = \rsize{\rsimp{ \rerase{a}}}$
554 15d182ffbc76 more Chengsong parents: 553 diff changeset	523	\item
596 b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	524	$\asize{\bderssimp{a}{s}} = \rsize{\rderssimp{\rerase{a}}{s}}$
554 15d182ffbc76 more Chengsong parents: 553 diff changeset	525	\end{itemize}
532 cc54ce075db5 restructured Chengsong parents: diff changeset	526	\end{lemma}
601 ce4e5151a836 more Chengsong parents: 596 diff changeset	527	\begin{proof}
The first part follows from the definition of $(\_)_{\downarrow_r}$.
2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	529	The second part is by induction on the inductive cases
601 ce4e5151a836 more Chengsong parents: 596 diff changeset	530	of $\textit{bsimp}$.
659 2e05f04ed6b3 Addressed Gerog "can't understand 'erase messes with structure'" comment Chengsong parents: 640 diff changeset	531	The third part is by induction on the string $s$,
601 ce4e5151a836 more Chengsong parents: 596 diff changeset	532	where the inductive step follows from part one.
ce4e5151a836 more Chengsong parents: 596 diff changeset	533	\end{proof}
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	534	\noindent
596 b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	535	With lemma \ref{sizeRelations},
601 ce4e5151a836 more Chengsong parents: 596 diff changeset	536	we will be able to focus on
ce4e5151a836 more Chengsong parents: 596 diff changeset	537	estimating only
ce4e5151a836 more Chengsong parents: 596 diff changeset	538	$\rsize{\rderssimp{\rerase{a}}{s}}$
ce4e5151a836 more Chengsong parents: 596 diff changeset	539	in later parts because
ce4e5151a836 more Chengsong parents: 596 diff changeset	540	\begin{center}
ce4e5151a836 more Chengsong parents: 596 diff changeset	541	$\rsize{\rderssimp{\rerase{a}}{s}} \leq N_r \quad$
ce4e5151a836 more Chengsong parents: 596 diff changeset	542	implies
ce4e5151a836 more Chengsong parents: 596 diff changeset	543	$\quad \llbracket a \backslash_{bsimps} s \rrbracket \leq N_r$.
ce4e5151a836 more Chengsong parents: 596 diff changeset	544	\end{center}
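\noindent
Spelled out, the implication is an instance of the third equality of
lemma \ref{sizeRelations}: for any bound $N_r$ on the r-regular expression side we have
\begin{center}
$\asize{\bderssimp{a}{s}} = \rsize{\rderssimp{\rerase{a}}{s}} \leq N_r$.
\end{center}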
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	545	%From now on we
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	546	%Unless stated otherwise in the rest of this
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	547	%chapter all regular expressions without
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	548	%bitcodes are seen as r-regular expressions ($\rrexp$s).
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	549	%For the binary alternative r-regular expression $\RALTS{[r_1, r_2]}$,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	550	%we use the notation $r_1 + r_2$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	551	%for brevity.
532 cc54ce075db5 restructured Chengsong parents: diff changeset	552
cc54ce075db5 restructured Chengsong parents: diff changeset	553
cc54ce075db5 restructured Chengsong parents: diff changeset	554	%-----------------------------------
596 b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	555	% SUB SECTION ROADMAP RREXP BOUND
532 cc54ce075db5 restructured Chengsong parents: diff changeset	556	%-----------------------------------
553 0f00d440f484 more changes Chengsong parents: 543 diff changeset	557
596 b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	558	%\subsection{Roadmap to a Bound for $\textit{Rrexp}$}
553 0f00d440f484 more changes Chengsong parents: 543 diff changeset	559
596 b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	560	%The way we obtain the bound for $\rrexp$s is by two steps:
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	561	%\begin{itemize}
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	562	% \item
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	563	% First, we rewrite $r\backslash s$ into something else that is easier
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	564	% to bound. This step is crucial for the inductive case
596 b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	565	% $r_1 \cdot r_2$ and $r^*$, where the derivative can grow and bloat in a wild way,
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	566	% but after simplification, they will always be equal or smaller to a form consisting of an alternative
596 b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	567	% list of regular expressions $f \; (g\; (\sum rs))$ with some functions applied to it, where each element will be distinct after the function application.
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	568	% \item
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	569	% Then, for such a sum list of regular expressions $f\; (g\; (\sum rs))$, we can control its size
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	570	% by estimation, since $\distinctBy$ and $\flts$ are well-behaved and working together would only
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	571	% reduce the size of a regular expression, not adding to it.
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	572	%\end{itemize}
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	573	%
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	574	%\section{Step One: Closed Forms}
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	575	%We transform the function application $\rderssimp{r}{s}$
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	576	%into an equivalent
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	577	%form $f\; (g \; (\sum rs))$.
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	578	%The functions $f$ and $g$ can be anything from $\flts$, $\distinctBy$ and other helper functions from $\bsimp{\_}$.
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	579	%This way we get a different but equivalent way of expressing : $r\backslash s = f \; (g\; (\sum rs))$, we call the
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	580	%right hand side the "closed form" of $r\backslash s$.
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	581	%
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	582	%\begin{quote}\it
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	583	% Claim: For regular expressions $r_1 \cdot r_2$, we claim that
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	584	%\end{quote}
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	585	%\noindent
b306628a0eab more chap 56 Chengsong parents: 595 diff changeset	586	%We explain in detail how we reached those claims.
601 ce4e5151a836 more Chengsong parents: 596 diff changeset	587	If we attempt to prove
ce4e5151a836 more Chengsong parents: 596 diff changeset	588	\begin{center}
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	589	$\forall r. \; \exists N_r.\;\; \text{s.t.} \;\; \forall s. \; \llbracket r\backslash_{rsimps} s \rrbracket_r \leq N_r$
601 ce4e5151a836 more Chengsong parents: 596 diff changeset	590	\end{center}
ce4e5151a836 more Chengsong parents: 596 diff changeset	591	using a naive induction on the structure of $r$,
ce4e5151a836 more Chengsong parents: 596 diff changeset	592	then we are stuck at the inductive cases such as
ce4e5151a836 more Chengsong parents: 596 diff changeset	593	$r_1\cdot r_2$.
ce4e5151a836 more Chengsong parents: 596 diff changeset	594	The inductive hypotheses are:
ce4e5151a836 more Chengsong parents: 596 diff changeset	595	\begin{center}
ce4e5151a836 more Chengsong parents: 596 diff changeset	596	1: $\text{for } r_1, \text{there exists } N_{r_1}.\;\; \text{s.t.}
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	597	\;\;\forall s. \llbracket r_1 \backslash_{rsimps} s \rrbracket_r \leq N_{r_1}. $\\
601 ce4e5151a836 more Chengsong parents: 596 diff changeset	598	2: $\text{for } r_2, \text{there exists } N_{r_2}.\;\; \text{s.t.}
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	599	\;\; \forall s. \llbracket r_2 \backslash_{rsimps} s \rrbracket_r \leq N_{r_2}. $
601 ce4e5151a836 more Chengsong parents: 596 diff changeset	600	\end{center}
ce4e5151a836 more Chengsong parents: 596 diff changeset	601	The inductive step to prove would be
ce4e5151a836 more Chengsong parents: 596 diff changeset	602	\begin{center}
ce4e5151a836 more Chengsong parents: 596 diff changeset	603	$\text{there exists } N_{r_1\cdot r_2}. \;\; \text{s.t.} \;\; \forall s.
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	604	\llbracket (r_1 \cdot r_2) \backslash_{rsimps} s \rrbracket_r \leq N_{r_1\cdot r_2}.$
601 ce4e5151a836 more Chengsong parents: 596 diff changeset	605	\end{center}
ce4e5151a836 more Chengsong parents: 596 diff changeset	606	The problem is that it is not clear what
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	607	$(r_1\cdot r_2) \backslash_{rsimps} s$ looks like,
601 ce4e5151a836 more Chengsong parents: 596 diff changeset	608	and therefore $N_{r_1}$ and $N_{r_2}$ in the
ce4e5151a836 more Chengsong parents: 596 diff changeset	609	inductive hypotheses cannot be directly used.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	610	%We have already seen that $(r_1 \cdot r_2)\backslash s$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	611	%and $(r^*)\backslash s$ can grow in a wild way.
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	612
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	613	The point, however, is that these derivatives are equivalent to
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	614	an alternative $\sum rs$ over a list of terms $rs$, where each term in $rs$ is
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	615	made of $r_1 \backslash s' $, $r_2\backslash s'$,
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	616	and $r \backslash s'$ with $s' \in \textit{SubString} \; s$ (which stands
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	617	for the set of substrings of $s$).
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	618	The list $rs$ will then be de-duplicated by $\textit{rdistinct}$
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	619	in the simplification, which prevents $rs$ from growing indefinitely.
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	620
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	621	Based on this idea, we develop a proof in two steps.
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	622	First, we show the equality below (where
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	623	$f$ and $g$ are functions that do not increase the size of the input)
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	624	\begin{center}
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	625	$r\backslash_{rsimps} s = f\; (\textit{rdistinct} \; (g\; \sum rs))$,
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	626	\end{center}
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	627	where $r = r_1 \cdot r_2$ or $r = r_0^*$ and so on.
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	628	For example, for $r_1 \cdot r_2$ we have the equality
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	629	\begin{center}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	630	$ \rderssimp{r_1 \cdot r_2}{s} =
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	631	\rsimp{(\sum ((r_1 \backslash_{rsimps} s) \cdot r_2 ) \; :: \;(\map \; \rderssimp{r_2}{\_} \;(\vsuf{s}{r_1})))}$
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	632	\end{center}
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	633	We call the right-hand side the
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	634	\emph{Closed Form} of $(r_1 \cdot r_2)\backslash_{rsimps} s$.
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	635	Second, we will bound the closed form of r-regular expressions
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	636	using some estimation techniques
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	637	and then apply
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	638	lemma \ref{sizeRelations} to show that the bitcoded regular expressions
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	639	in our $\blexersimp$ are finitely bounded.
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	640
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	641	We will describe in detail the first step of the proof
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	642	in the next section.
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	643
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	644	\section{Closed Forms}
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	645	In this section we introduce in detail
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	646	how to express the string derivatives
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	647	of regular expressions (i.e. $r \backslash_r s$ where $s$ is a string
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	648	rather than a single character) in a different way than
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	649	our previous definition.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	650	In previous chapters, the derivative of a
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	651	regular expression $r$ w.r.t.\ a string $s$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	652	was recursively defined on the string:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	653	\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	654	$r \backslash_s (c::s) \dn (r \backslash c) \backslash_s s$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	655	\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	656	The problem is that
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	657	this definition does not provide much information
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	658	on what $r \backslash_s s$ looks like.
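For instance, assuming the usual base case $r \backslash_s [] \dn r$ (which is not
restated in this section), unfolding the definition on a three-character string only
yields a nesting of character derivatives, which does not reveal the shape of the result:
\begin{center}
$r \backslash_s [c_1, c_2, c_3] = ((r \backslash c_1) \backslash c_2) \backslash c_3$.
\end{center}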
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	659	If we are interested in the size of a derivative like
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	660	$(r_1 \cdot r_2)\backslash s$,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	661	we have to somehow get a more concrete form to begin with.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	662	We call such more concrete representations the ``closed forms'' of
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	663	string derivatives as opposed to their original definitions.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	664	The terminology ``closed form'' is borrowed from mathematics,
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	665	where it usually describes expressions that are composed solely of finitely many
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	666	well-known and easy-to-compute operations such as
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	667	additions, multiplications, and exponential functions.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	668
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	669	We start by proving some basic identities
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	670	involving the simplification functions for r-regular expressions.
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	671	After that we introduce the rewrite relations
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	672	$\rightsquigarrow_h$, $\rightsquigarrow^*_{scf}$,
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	673	$\rightsquigarrow_f$ and $\rightsquigarrow_g$.
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	674	These relations involve techniques similar to those in chapter \ref{Bitcoded2}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	675	for annotated regular expressions.
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	676	Finally, we use these identities to establish the
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	677	closed forms of the alternative regular expression,
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	678	the sequence regular expression, and the star regular expression.
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	679	%$r_1\cdot r_2$, $r^*$ and $\sum rs$.
601 ce4e5151a836 more Chengsong parents: 596 diff changeset	680
ce4e5151a836 more Chengsong parents: 596 diff changeset	681
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	682
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	683	\subsection{Some Basic Identities}
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	684
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	685	In what follows we will often convert between lists
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	686	and sets.
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	687	We use Isabelle's $set$ to refer to the
611 bc1df466150a more Chengsong parents: 610 diff changeset	688	function that converts a list $rs$ to the set
bc1df466150a more Chengsong parents: 610 diff changeset	689	containing all the elements in $rs$.
bc1df466150a more Chengsong parents: 610 diff changeset	690	\subsubsection{$\textit{rdistinct}$ Does the Job of De-duplication}
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	691	The $\textit{rdistinct}$ function, as its name suggests, will
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	692	de-duplicate an r-regular expression list.
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	693	It will also remove any elements that
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	694	are already in the accumulator set.
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	695	\begin{lemma}\label{rdistinctDoesTheJob}
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	696	%The function $\textit{rdistinct}$ satisfies the following
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	697	%properties:
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	698	Assume we have the predicate $\textit{isDistinct}$\footnote{We omit its
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	699	recursive definition here. Its Isabelle counterpart would be $\textit{distinct}$.}
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	700	for testing
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	701	whether a list's elements are unique. Then the following
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	702	properties about $\textit{rdistinct}$ hold:
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	703	\begin{itemize}
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	704	\item
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	705	If $a \in acc$ then $a \notin (\rdistinct{rs}{acc})$.
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	706	\item
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	707	%If list $rs'$ is the result of $\rdistinct{rs}{acc}$,
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	708	$\textit{isDistinct} \;\;\; (\rdistinct{rs}{acc})$.
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	709	\item
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	710	$\textit{set} \; (\rdistinct{rs}{acc})
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	711	= (\textit{set} \; rs) - acc$
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	712	\end{itemize}
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	713	\end{lemma}
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	714	\noindent
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	715	\begin{proof}
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	716	The first part is by an induction on $rs$.
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	717	The second and third parts can be proven by using the
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	718	inductive cases of $\textit{rdistinct}$.
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	719
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	720	\end{proof}
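\noindent
As a small illustration (a sketch, assuming the definition of $\textit{rdistinct}$
from the earlier chapters, which traverses the list from left to right and keeps an
element only if it is not yet in the accumulator), let $r_1$, $r_2$ and $r_3$ be
pairwise different r-regular expressions. In the example below, $r_2$ is in the
accumulator and does not occur in the output, the output contains no duplicates,
and its set of elements is $\{r_1, r_2, r_3\} - \{r_2\}$:
\begin{center}
$\rdistinct{[r_1, r_2, r_1, r_3]}{\{r_2\}} = [r_1, r_3]$.
\end{center}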
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	721
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	722	\noindent
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	723	%$\textit{rdistinct}$ will out all regular expression terms
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	724	%that are in the accumulator, therefore
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	725	If we prepend a list $rs$, whose elements are all in the accumulator, to another
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	726	list $rs_a$, and then call $\textit{rdistinct}$
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	727	on the concatenated list, the output will be as if we had called $\textit{rdistinct}$
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	728	on $rs_a$ alone:
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	729	\begin{lemma}\label{rdistinctConcat}
554 15d182ffbc76 more Chengsong parents: 553 diff changeset	730	The elements appearing in the accumulator will always be removed.
15d182ffbc76 more Chengsong parents: 553 diff changeset	731	More precisely,
15d182ffbc76 more Chengsong parents: 553 diff changeset	732	\begin{itemize}
15d182ffbc76 more Chengsong parents: 553 diff changeset	733	\item
15d182ffbc76 more Chengsong parents: 553 diff changeset	734	If $\textit{set} \; rs \subseteq acc$, then
15d182ffbc76 more Chengsong parents: 553 diff changeset	735	$\rdistinct{(rs @ rs_a)}{acc} = \rdistinct{rs_a}{acc}$.
15d182ffbc76 more Chengsong parents: 553 diff changeset	736	\item
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	737	More generally, if $a \in rset$ and $\rdistinct{rs}{\{a\}} = []$,
554 15d182ffbc76 more Chengsong parents: 553 diff changeset	738	then $\rdistinct{(rs @ rs')}{rset} = \rdistinct{rs'}{rset}$
15d182ffbc76 more Chengsong parents: 553 diff changeset	739	\end{itemize}
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	740	\end{lemma}
554 15d182ffbc76 more Chengsong parents: 553 diff changeset	741
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	742	\begin{proof}
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	743	By induction on $rs$ and using lemma \ref{rdistinctDoesTheJob}.
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	744	\end{proof}
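\noindent
For example (again a sketch, assuming the left-to-right definition of
$\textit{rdistinct}$ and pairwise different $r_1$, $r_2$, $r_3$), take the
accumulator $\{r_1, r_2\}$, the prepended list $[r_1, r_2, r_1]$ and the
remaining list $[r_3, r_1]$. Every prepended element is filtered out, so both
calls return the same list:
\begin{center}
$\rdistinct{([r_1, r_2, r_1] @ [r_3, r_1])}{\{r_1, r_2\}} = [r_3] =
\rdistinct{[r_3, r_1]}{\{r_1, r_2\}}$.
\end{center}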
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	745	\noindent
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	746	On the other hand, if an element $r$ does not appear in the input list waiting to be deduplicated,
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	747	then expanding the accumulator to include that element will not cause the output list to change:
611 bc1df466150a more Chengsong parents: 610 diff changeset	748	\begin{lemma}\label{rdistinctOnDistinct}
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	749	The accumulator can be augmented to include elements not appearing in the input list,
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	750	and the output will not change.
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	751	\begin{itemize}
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	752	\item
611 bc1df466150a more Chengsong parents: 610 diff changeset	753	If $r \notin rs$, then $\rdistinct{rs}{acc} = \rdistinct{rs}{(\{r\} \cup acc)}$.
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	754	\item
611 bc1df466150a more Chengsong parents: 610 diff changeset	755	In particular, if $\;\;\textit{isDistinct} \; rs$, then we have
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	756	\[ \rdistinct{rs}{\varnothing} = rs \]
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	757	\end{itemize}
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	758	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	759	\begin{proof}
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	760	The first half is by induction on $rs$. The second half is a corollary of the first.
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	761	\end{proof}
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	762	\noindent
611 bc1df466150a more Chengsong parents: 610 diff changeset	763	The function $\textit{rdistinct}$ removes duplicates from anywhere in a list.
bc1df466150a more Chengsong parents: 610 diff changeset	764	Although this property seems obvious,
bc1df466150a more Chengsong parents: 610 diff changeset	765	proving it by induction is not entirely straightforward.
554 15d182ffbc76 more Chengsong parents: 553 diff changeset	766	\begin{lemma}\label{distinctRemovesMiddle}
15d182ffbc76 more Chengsong parents: 553 diff changeset	767	The following two properties hold if $r \in rs$:
15d182ffbc76 more Chengsong parents: 553 diff changeset	768	\begin{itemize}
15d182ffbc76 more Chengsong parents: 553 diff changeset	769	\item
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	770	$\rdistinct{rs}{rset} = \rdistinct{(rs @ [r])}{rset}$\\
aecf1ddf3541 more Chengsong parents: 554 diff changeset	771	and\\
554 15d182ffbc76 more Chengsong parents: 553 diff changeset	772	$\rdistinct{(ab :: rs @ [ab])}{rset'} = \rdistinct{(ab :: rs)}{rset'}$
15d182ffbc76 more Chengsong parents: 553 diff changeset	773	\item
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	774	$\rdistinct{ (rs @ rs') }{rset} = \rdistinct{rs @ [r] @ rs'}{rset}$\\
aecf1ddf3541 more Chengsong parents: 554 diff changeset	775	and\\
554 15d182ffbc76 more Chengsong parents: 553 diff changeset	776	$\rdistinct{(ab :: rs @ [ab] @ rs'')}{rset'} =
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	777	\rdistinct{(ab :: rs @ rs'')}{rset'}$
554 15d182ffbc76 more Chengsong parents: 553 diff changeset	778	\end{itemize}
15d182ffbc76 more Chengsong parents: 553 diff changeset	779	\end{lemma}
15d182ffbc76 more Chengsong parents: 553 diff changeset	780	\noindent
15d182ffbc76 more Chengsong parents: 553 diff changeset	781	\begin{proof}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	782	By induction on $rs$. All other variables are allowed to be arbitrary.
611 bc1df466150a more Chengsong parents: 610 diff changeset	783	The second part of the lemma requires the first.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	784	Note that for each part, the two sub-propositions need to be proven
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	785	at the same time,
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	786	so that the induction goes through.
554 15d182ffbc76 more Chengsong parents: 553 diff changeset	787	\end{proof}
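\noindent
To illustrate the second property (a sketch, assuming the left-to-right
definition of $\textit{rdistinct}$ and pairwise different $r_1$, $r_2$, $r_3$),
take $rs = [r_1, r_2]$, $r = r_1$, $rs' = [r_3]$ and an empty accumulator.
The extra copy of $r_1$ in the middle of the list has no effect on the result:
\begin{center}
$\rdistinct{([r_1, r_2] @ [r_3])}{\varnothing} = [r_1, r_2, r_3] =
\rdistinct{([r_1, r_2] @ [r_1] @ [r_3])}{\varnothing}$.
\end{center}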
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	788	\noindent
611 bc1df466150a more Chengsong parents: 610 diff changeset	789	Lemma \ref{distinctRemovesMiddle} allows us to prove a few more equalities involving
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	790	$\textit{rdistinct}$ (they will be useful later):
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	791	\begin{lemma}\label{rdistinctConcatGeneral}
611 bc1df466150a more Chengsong parents: 610 diff changeset	792	\mbox{}
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	793	\begin{itemize}
aecf1ddf3541 more Chengsong parents: 554 diff changeset	794	\item
aecf1ddf3541 more Chengsong parents: 554 diff changeset	795	$\rdistinct{(rs @ rs')}{\varnothing} = \rdistinct{((\rdistinct{rs}{\varnothing})@ rs')}{\varnothing}$
aecf1ddf3541 more Chengsong parents: 554 diff changeset	798	\item
aecf1ddf3541 more Chengsong parents: 554 diff changeset	799	If $rset' \subseteq rset$, then $\rdistinct{rs}{rset} =
aecf1ddf3541 more Chengsong parents: 554 diff changeset	800	\rdistinct{(\rdistinct{rs}{rset'})}{rset}$. As a corollary
aecf1ddf3541 more Chengsong parents: 554 diff changeset	801	of this, we have:
aecf1ddf3541 more Chengsong parents: 554 diff changeset	802	\item
aecf1ddf3541 more Chengsong parents: 554 diff changeset	803	$\rdistinct{(rs @ rs')}{rset} = \rdistinct{
aecf1ddf3541 more Chengsong parents: 554 diff changeset	804	((\rdistinct{rs}{\varnothing}) @ rs')}{rset}$. This
aecf1ddf3541 more Chengsong parents: 554 diff changeset	805	gives another corollary used later:
aecf1ddf3541 more Chengsong parents: 554 diff changeset	806	\item
aecf1ddf3541 more Chengsong parents: 554 diff changeset	807	If $a \in rset$, then $\rdistinct{(rs @ rs')}{rset} = \rdistinct{
aecf1ddf3541 more Chengsong parents: 554 diff changeset	808	(\rdistinct{(a :: rs)}{\varnothing} @ rs')}{rset} $.
aecf1ddf3541 more Chengsong parents: 554 diff changeset	809
aecf1ddf3541 more Chengsong parents: 554 diff changeset	810	\end{itemize}
aecf1ddf3541 more Chengsong parents: 554 diff changeset	811	\end{lemma}
aecf1ddf3541 more Chengsong parents: 554 diff changeset	812	\begin{proof}
aecf1ddf3541 more Chengsong parents: 554 diff changeset	813	By lemmas \ref{rdistinctDoesTheJob} and \ref{distinctRemovesMiddle}.
aecf1ddf3541 more Chengsong parents: 554 diff changeset	814	\end{proof}
611 bc1df466150a more Chengsong parents: 610 diff changeset	815	\noindent
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	816	The next lemma is a more general form of lemma \ref{rdistinctConcat};
80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	817	it says that
611 bc1df466150a more Chengsong parents: 610 diff changeset	818	$\textit{rdistinct}$ is composable w.r.t.\ list concatenation:
bc1df466150a more Chengsong parents: 610 diff changeset	819	\begin{lemma}\label{distinctRdistinctAppend}
If $\;\; \textit{isDistinct} \; rs_1$,
and $(set \; rs_1) \cap acc = \varnothing$,
then applying $\textit{rdistinct}$ to $rs_1 @ rs_a$ does not
have an effect on $rs_1$:
\[\textit{rdistinct}\; (rs_1 @ rs_a)\;\, acc
= rs_1@(\textit{rdistinct} \; rs_a \; (acc \cup set \; rs_1))\]
bc1df466150a more Chengsong parents: 610 diff changeset	826	\end{lemma}
bc1df466150a more Chengsong parents: 610 diff changeset	827	\begin{proof}
By an induction on
$rs_1$, where $rs_a$ and $acc$ are allowed to be arbitrary.
bc1df466150a more Chengsong parents: 610 diff changeset	830	\end{proof}
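\noindent
As a small illustration (the concrete lists are chosen here only as an example, with $\RCHAR{a}$, $\RCHAR{b}$ and $\RCHAR{c}$ assumed to be pairwise distinct characters), take $rs_1 = [\RCHAR{a}, \RCHAR{b}]$, $rs_a = [\RCHAR{b}, \RCHAR{c}, \RCHAR{c}]$ and $acc = \varnothing$. Both preconditions hold, and the prefix $rs_1$ is left untouched while its elements are added to the accumulator:
\begin{center}
\begin{tabular}{lcl}
$\textit{rdistinct} \; ([\RCHAR{a}, \RCHAR{b}] @ [\RCHAR{b}, \RCHAR{c}, \RCHAR{c}]) \; \varnothing$
 & $=$ & $[\RCHAR{a}, \RCHAR{b}] @ (\textit{rdistinct} \; [\RCHAR{b}, \RCHAR{c}, \RCHAR{c}] \; \{\RCHAR{a}, \RCHAR{b}\})$\\
 & $=$ & $[\RCHAR{a}, \RCHAR{b}] @ [\RCHAR{c}]$\\
 & $=$ & $[\RCHAR{a}, \RCHAR{b}, \RCHAR{c}]$
\end{tabular}
\end{center}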
bc1df466150a more Chengsong parents: 610 diff changeset	831	\noindent
bc1df466150a more Chengsong parents: 610 diff changeset	832	$\textit{rdistinct}$ needs to be applied only once, and
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	833	applying it multiple times does not make any difference:
611 bc1df466150a more Chengsong parents: 610 diff changeset	834	\begin{corollary}\label{distinctOnceEnough}
$\textit{rdistinct} \; (rs @ rs_a) \; \{\} = \textit{rdistinct} \; ( (\textit{rdistinct} \;
rs \; \{ \}) @ (\textit{rdistinct} \; rs_a \; (set \; rs))) \; \{\}$
611 bc1df466150a more Chengsong parents: 610 diff changeset	837	\end{corollary}
bc1df466150a more Chengsong parents: 610 diff changeset	838	\begin{proof}
bc1df466150a more Chengsong parents: 610 diff changeset	839	By lemma \ref{distinctRdistinctAppend}.
bc1df466150a more Chengsong parents: 610 diff changeset	840	\end{proof}
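\noindent
A quick sanity check of this corollary, on lists chosen here purely for illustration (with $\RCHAR{a} \neq \RCHAR{b}$): let $rs = [\RCHAR{a}, \RCHAR{a}]$ and $rs_a = [\RCHAR{a}, \RCHAR{b}]$. Both sides, each with an explicit empty accumulator, evaluate to the same list:
\begin{center}
\begin{tabular}{lcl}
$\textit{rdistinct} \; ([\RCHAR{a}, \RCHAR{a}] @ [\RCHAR{a}, \RCHAR{b}]) \; \{\}$ & $=$ & $[\RCHAR{a}, \RCHAR{b}]$\\
$\textit{rdistinct} \; ((\textit{rdistinct} \; [\RCHAR{a}, \RCHAR{a}] \; \{\}) @ (\textit{rdistinct} \; [\RCHAR{a}, \RCHAR{b}] \; \{\RCHAR{a}\})) \; \{\}$
 & $=$ & $\textit{rdistinct} \; ([\RCHAR{a}] @ [\RCHAR{b}]) \; \{\}$\\
 & $=$ & $[\RCHAR{a}, \RCHAR{b}]$
\end{tabular}
\end{center}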
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	841
611 bc1df466150a more Chengsong parents: 610 diff changeset	842	\subsubsection{The Properties of $\textit{Rflts}$}
bc1df466150a more Chengsong parents: 610 diff changeset	843	We give in this subsection some properties
620 ae6010c14e49 chap6 almost done Chengsong parents: 618 diff changeset	844	involving $\backslash_r$, $\backslash_{rsimps}$, $\textit{rflts}$ and
611 bc1df466150a more Chengsong parents: 610 diff changeset	845	$\textit{rsimp}_{ALTS} $, together with any non-trivial lemmas that lead to them.
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	846	These will be helpful in later closed-form proofs, when
611 bc1df466150a more Chengsong parents: 610 diff changeset	847	we want to transform derivative terms which have
bc1df466150a more Chengsong parents: 610 diff changeset	848	%the ways in which multiple functions involving
bc1df466150a more Chengsong parents: 610 diff changeset	849	%those are composed together
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	850	interleaving derivatives and simplifications applied to them.
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	851
611 bc1df466150a more Chengsong parents: 610 diff changeset	852	\noindent
bc1df466150a more Chengsong parents: 610 diff changeset	853	%When the function $\textit{Rflts}$
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	854	%is applied to the concatenation of two lists; the output can be calculated by first applying the
bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	855	%functions on two lists separately and then concatenating them together.
$\rflts$ is composable in terms of concatenation:
\begin{lemma}\label{rfltsProps}
The function $\rflts$ has the properties below:
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	859	\begin{itemize}
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	860	\item
554 15d182ffbc76 more Chengsong parents: 553 diff changeset	861	$\rflts \; (rs_1 @ rs_2) = \rflts \; rs_1 @ \rflts \; rs_2$
15d182ffbc76 more Chengsong parents: 553 diff changeset	862	\item
If $r \neq \RZERO$ and $\nexists rs_1. r = \RALTS{rs_1}$, then $\rflts \; (r::rs) = r :: \rflts \; rs$
15d182ffbc76 more Chengsong parents: 553 diff changeset	864	\item
15d182ffbc76 more Chengsong parents: 553 diff changeset	865	$\rflts \; (rs @ [\RZERO]) = \rflts \; rs$
15d182ffbc76 more Chengsong parents: 553 diff changeset	866	\item
15d182ffbc76 more Chengsong parents: 553 diff changeset	867	$\rflts \; (rs' @ [\RALTS{rs}]) = \rflts \; rs'@rs$
15d182ffbc76 more Chengsong parents: 553 diff changeset	868	\item
15d182ffbc76 more Chengsong parents: 553 diff changeset	869	$\rflts \; (rs @ [\RONE]) = \rflts \; rs @ [\RONE]$
15d182ffbc76 more Chengsong parents: 553 diff changeset	870	\item
15d182ffbc76 more Chengsong parents: 553 diff changeset	871	If $r \neq \RZERO$ and $\nexists rs'. r = \RALTS{rs'}$ then $\rflts \; (rs @ [r])
15d182ffbc76 more Chengsong parents: 553 diff changeset	872	= (\rflts \; rs) @ [r]$
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	873	\item
aecf1ddf3541 more Chengsong parents: 554 diff changeset	874	If $r = \RALTS{rs}$ and $r \in rs'$ then for all $r_1 \in rs.
aecf1ddf3541 more Chengsong parents: 554 diff changeset	875	r_1 \in \rflts \; rs'$.
aecf1ddf3541 more Chengsong parents: 554 diff changeset	876	\item
aecf1ddf3541 more Chengsong parents: 554 diff changeset	877	$\rflts \; (rs_a @ \RZERO :: rs_b) = \rflts \; (rs_a @ rs_b)$
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	878	\end{itemize}
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	879	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	880	\noindent
b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	881	\begin{proof}
By induction on $rs_1$ in the first sub-lemma, induction on $r$ in the second,
and induction on $rs$, $rs'$, $rs$, $rs'$ and $rs_a$ in the third, fourth, fifth, sixth and
last sub-lemmas, respectively.
543 b2bea5968b89 thesis_thys Chengsong parents: 532 diff changeset	885	\end{proof}
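\noindent
To see how these properties combine, consider the following chain. The input list is an example chosen here, and the chain additionally assumes $\rflts \; [] = []$ from the definition of $\rflts$:
\begin{center}
\begin{tabular}{lcll}
$\rflts \; [\RZERO, \sum [\RCHAR{a}, \RCHAR{b}], \RONE]$
 & $=$ & $\rflts \; [\sum [\RCHAR{a}, \RCHAR{b}], \RONE]$ & (last property, $rs_a = []$)\\
 & $=$ & $\rflts \; [\sum [\RCHAR{a}, \RCHAR{b}]] \; @ \; \rflts \; [\RONE]$ & (first property)\\
 & $=$ & $[\RCHAR{a}, \RCHAR{b}] \; @ \; [\RONE]$ & (fourth and fifth property, empty prefix)\\
 & $=$ & $[\RCHAR{a}, \RCHAR{b}, \RONE]$ &
\end{tabular}
\end{center}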
611 bc1df466150a more Chengsong parents: 610 diff changeset	886	\noindent
Now we introduce the property that the derivative operation
and $\rsimpalts$
commute. This will be used later on when deriving the closed form for
the alternative regular expression:
bc1df466150a more Chengsong parents: 610 diff changeset	891	\begin{lemma}\label{rderRsimpAltsCommute}
bc1df466150a more Chengsong parents: 610 diff changeset	892	$\rder{x}{(\rsimpalts \; rs)} = \rsimpalts \; (\map \; (\rder{x}{\_}) \; rs)$
bc1df466150a more Chengsong parents: 610 diff changeset	893	\end{lemma}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	894	\begin{proof}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	895	By induction on $rs$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	896	\end{proof}
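\noindent
One way to see why this holds is a case analysis on the shape of $rs$. The sketch below assumes the three defining clauses of $\rsimpalts$ (the empty list gives $\RZERO$, a singleton list gives its only element, and any longer list $rs$ gives $\sum rs$) and the usual derivative clauses for $\RZERO$ and $\sum rs$:
\begin{center}
\begin{tabular}{lll}
$rs$ & $\rder{x}{(\rsimpalts \; rs)}$ & $\rsimpalts \; (\map \; (\rder{x}{\_}) \; rs)$\\
\hline
$[]$ & $\rder{x}{\RZERO} = \RZERO$ & $\rsimpalts \; [] = \RZERO$\\
$[r]$ & $\rder{x}{r}$ & $\rsimpalts \; [\rder{x}{r}] = \rder{x}{r}$\\
$r_1 :: r_2 :: rs'$ & $\rder{x}{(\sum rs)} = \sum (\map \; (\rder{x}{\_}) \; rs)$ & $\sum (\map \; (\rder{x}{\_}) \; rs)$
\end{tabular}
\end{center}
In the last row the mapped list still has at least two elements, so $\rsimpalts$ again produces an alternative.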
611 bc1df466150a more Chengsong parents: 610 diff changeset	897	\noindent
614 d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	898
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	899	\subsubsection{The $RL$ Function: Language Interpretation for $\textit{Rrexp}$s}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	900	Much like the definition of $L$ on plain regular expressions, one can also
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	901	define the language interpretation for $\rrexp$s.
614 d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	902	\begin{center}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	903	\begin{tabular}{lcl}
$RL \; (\ZERO_r)$ & $\dn$ & $\varnothing$\\
80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	905	$RL \; (\ONE_r)$ & $\dn$ & $\{[]\}$\\
614 d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	906	$RL \; (c)$ & $\dn$ & $\{[c]\}$\\
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	907	$RL \; \sum rs$ & $\dn$ & $ \bigcup_{r \in rs} (RL \; r)$\\
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	908	$RL \; (r_1 \cdot r_2)$ & $\dn$ & $ RL \; (r_1) @ RL \; (r_2)$\\
$RL \; (r^*)$ & $\dn$ & $ (RL \; (r))^*$
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	910	\end{tabular}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	911	\end{center}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	912	\noindent
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	913	The main use of $RL$ is to establish some connections between $\rsimp{}$
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	914	and $\rnullable{}$:
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	915	\begin{lemma}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	916	The following properties hold:
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	917	\begin{itemize}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	918	\item
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	919	If $\rnullable{r}$, then $\rsimp{r} \neq \RZERO$.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	920	\item
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	921	$\rnullable{r \backslash s} \quad $ if and only if $\quad \rnullable{\rderssimp{r}{s}}$.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	922	\end{itemize}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	923	\end{lemma}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	924	\begin{proof}
The first part is by induction on $r$.
The second part is true because the property
\[ RL \; r = RL \; (\rsimp{r})\] holds.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	928	\end{proof}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	929
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	930	\subsubsection{Simplified $\textit{Rrexp}$s are Good}
We formalise the notion of ``good'' regular expressions,
which means regular expressions that
are fully simplified in terms of our $\textit{rsimp}$ function.
For alternative regular expressions that means they
do not contain any nested alternatives, un-eliminated $\RZERO$s
or duplicate elements (so, for example,
$r_1 + (r_2 + r_3)$, $\RZERO + r$ and $ \sum [r, r, \ldots]$ are not good).
The clauses for $\good$ are:
614 d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	939	\begin{center}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	940	\begin{tabular}{@{}lcl@{}}
$\good\; \RZERO$ & $\dn$ & $\bfalse$\\
$\good\; \RONE$ & $\dn$ & $\btrue$\\
$\good\; \RCHAR{c}$ & $\dn$ & $\btrue$\\
$\good\; \RALTS{[]}$ & $\dn$ & $\bfalse$\\
$\good\; \RALTS{[r]}$ & $\dn$ & $\bfalse$\\
$\good\; \RALTS{r_1 :: r_2 :: rs}$ & $\dn$ &
$\textit{isDistinct} \; (r_1 :: r_2 :: rs) \;$\\
& & $\land \; (\forall r' \in (r_1 :: r_2 :: rs).\; \good \; r'\; \, \land \; \, \textit{nonAlt}\; r')$\\
$\good \; \RSEQ{\RZERO}{r}$ & $\dn$ & $\bfalse$\\
$\good \; \RSEQ{\RONE}{r}$ & $\dn$ & $\bfalse$\\
$\good \; \RSEQ{r}{\RZERO}$ & $\dn$ & $\bfalse$\\
$\good \; \RSEQ{r_1}{r_2}$ & $\dn$ & $\good \; r_1 \;\, \land \;\, \good \; r_2$\\
$\good \; \RSTAR{r}$ & $\dn$ & $\btrue$
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	954	\end{tabular}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	955	\end{center}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	956	\noindent
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	957	We omit the recursive definition of the predicate $\textit{nonAlt}$,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	958	which evaluates to true when the regular expression is not an
614 d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	959	alternative, and false otherwise.
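For example, reading the clauses from top to bottom, and assuming $\RCHAR{a}$, $\RCHAR{b}$ and $\RCHAR{c}$ are pairwise distinct characters (an illustrative choice made here), we obtain:
\begin{center}
\begin{tabular}{lcl}
$\good \; (\sum [\RCHAR{a}, \RCHAR{b}])$ & $=$ & $\btrue$\\
$\good \; (\sum [\RCHAR{a}, \RCHAR{a}])$ & $=$ & $\bfalse$ \quad (the list is not distinct)\\
$\good \; (\sum [\RCHAR{a}, \sum [\RCHAR{b}, \RCHAR{c}]])$ & $=$ & $\bfalse$ \quad (the second element fails $\textit{nonAlt}$)\\
$\good \; (\sum [\RZERO, \RCHAR{a}])$ & $=$ & $\bfalse$ \quad ($\RZERO$ is not good)\\
$\good \; (\sum [\RCHAR{a}])$ & $=$ & $\bfalse$ \quad (singleton alternative)\\
$\good \; (\RSEQ{\RONE}{\RCHAR{a}})$ & $=$ & $\bfalse$\\
$\good \; (\RSEQ{\RCHAR{a}}{\RCHAR{b}})$ & $=$ & $\btrue$
\end{tabular}
\end{center}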
The $\good$ property is preserved under $\rsimp_{ALTS}$, provided that
its argument list is non-empty and its elements are all good, $\textit{nonAlt}$
and pairwise distinct:
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	963	\begin{lemma}\label{rsimpaltsGood}
If $rs \neq []$ and for all $r \in rs. \; \textit{nonAlt} \; r$ and $\textit{isDistinct} \; rs$,
then $\good \; (\rsimpalts \; rs)$ if and only if for all $r \in rs. \; \good \; r$.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	966	\end{lemma}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	967	\noindent
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	968	We also note that
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	969	if a regular expression $r$ is good, then $\rflts$ on the singleton
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	970	list $[r]$ will not break goodness:
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	971	\begin{lemma}\label{flts2}
If $\good \; r$, then for all $r' \in \rflts \; [r]. \; \good \; r'$ and $\textit{nonAlt} \; r'$.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	973	\end{lemma}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	974	\begin{proof}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	975	By an induction on $r$.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	976	\end{proof}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	977	\noindent
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	978	The other observation we make about $\rsimp{r}$ is that it never
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	979	comes with nested alternatives, which we describe as the $\nonnested$
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	980	property:
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	981	\begin{center}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	982	\begin{tabular}{lcl}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	983	$\nonnested \; \, \sum []$ & $\dn$ & $\btrue$\\
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	984	$\nonnested \; \, \sum ((\sum rs_1) :: rs_2)$ & $\dn$ & $\bfalse$\\
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	985	$\nonnested \; \, \sum (r :: rs)$ & $\dn$ & $\nonnested (\sum rs)$\\
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	986	$\nonnested \; \, r $ & $\dn$ & $\btrue$
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	987	\end{tabular}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	988	\end{center}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	989	\noindent
The $\rflts$ function
always opens up nested alternatives,
which ensures that the result of $\rsimp{}$ is non-nested:
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	993
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	994	\begin{lemma}\label{nonnestedRsimp}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	995	It is always the case that
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	996	\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	997	$\nonnested \; (\rsimp{r})$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	998	\end{center}
614 d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	999	\end{lemma}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1000	\begin{proof}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1001	By induction on $r$.
614 d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1002	\end{proof}
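\noindent
As an illustration, the nested alternative below is chosen as an example, with $\RCHAR{a}$, $\RCHAR{b}$ and $\RCHAR{c}$ pairwise distinct. Assuming the alternative clause $\rsimp{(\sum rs)} = \rsimpalts \; (\rdistinct{(\rflts \; (\map \; \rsimp{} \; rs))}{\varnothing})$ and that $\rsimp{}$ leaves characters unchanged, the inner alternative is spliced into the outer one:
\begin{center}
\begin{tabular}{lcl}
$\rsimp{(\sum [\RCHAR{a}, \sum [\RCHAR{b}, \RCHAR{c}]])}$
 & $=$ & $\rsimpalts \; (\rdistinct{(\rflts \; [\RCHAR{a}, \sum [\RCHAR{b}, \RCHAR{c}]])}{\varnothing})$\\
 & $=$ & $\rsimpalts \; (\rdistinct{[\RCHAR{a}, \RCHAR{b}, \RCHAR{c}]}{\varnothing})$\\
 & $=$ & $\rsimpalts \; [\RCHAR{a}, \RCHAR{b}, \RCHAR{c}]$\\
 & $=$ & $\sum [\RCHAR{a}, \RCHAR{b}, \RCHAR{c}]$
\end{tabular}
\end{center}
The result satisfies $\nonnested$, since none of its direct children is an alternative.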
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1003	\noindent
With this we can prove that the list obtained from
simplification, flattening and de-duplication
will not contain any alternative regular expression as a direct element:
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1007	\begin{lemma}\label{nonaltFltsRd}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1008	If $x \in \rdistinct{\rflts\; (\map \; \rsimp{} \; rs)}{\varnothing}$
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1009	then $\textit{nonAlt} \; x$.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1010	\end{lemma}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1011	\begin{proof}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1012	By \ref{nonnestedRsimp}.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1013	\end{proof}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1014	\noindent
The other fact we know is that once $\rsimp{}$ has finished
processing an alternative regular expression, the result will not
contain any $\RZERO$s. This is because the recursive
calls to the simplification on the children regular expressions
make each child either good or $\RZERO$, $\rflts$ removes
the $\RZERO$s while keeping the remaining elements good,
and $\rdistinct{}$ only deletes elements from the result.
614 d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1022	\begin{lemma}\label{flts3Obv}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1023	The following are true:
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1024	\begin{itemize}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1025	\item
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1026	If for all $r \in rs. \, \good \; r $ or $r = \RZERO$,
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1027	then for all $r \in \rflts\; rs. \, \good \; r$.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1028	\item
If $x \in \rdistinct{\rflts\; (\map \; \rsimp{}\; rs)}{\varnothing}$
and for all $y$ such that $\llbracket y \rrbracket_r$ is less than
$\llbracket rs \rrbracket_r + 1$, either
$\good \; (\rsimp{y})$ or $\rsimp{y} = \RZERO$,
then $\good \; x$.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1034	\end{itemize}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1035	\end{lemma}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1036	\begin{proof}
The first part is by induction, where the inductive cases
follow the recursive clauses of $\rflts$.
The second part is a corollary of the first part.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1040	\end{proof}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1041
This leads to a useful structural property of $\rsimp{}$,
namely that after simplification a regular expression is
either good or $\RZERO$:
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1045	\begin{lemma}\label{good1}
For any r-regular expression $r$, $\good \; (\rsimp{r})$ or $\rsimp{r} = \RZERO$.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1047	\end{lemma}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1048	\begin{proof}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1049	By an induction on $r$. The inductive measure is the size $\llbracket \rrbracket_r$.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1050	Lemma \ref{rsimpMono} says that
614 d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1051	$\llbracket \rsimp{r}\rrbracket_r$ is smaller than or equal to
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1052	$\llbracket r \rrbracket_r$.
Therefore, in the $r_1 \cdot r_2$ and $\sum rs$ cases,
the inductive hypothesis applies to the children regular expressions
$r_1$, $r_2$, etc. This also establishes the precondition of
Lemma \ref{flts3Obv}.
Lemmas \ref{nonnestedRsimp} and \ref{nonaltFltsRd} are used
to ensure that goodness is preserved at the topmost level.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1059	\end{proof}
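\noindent
Both outcomes of this lemma can occur. The two inputs below are examples chosen here, and the computation assumes the alternative clause of $\rsimp{}$ together with $\rsimpalts \; [] = \RZERO$ and $\rsimpalts \; [r] = r$:
\begin{center}
\begin{tabular}{lclcl}
$\rsimp{(\sum [\RZERO, \RZERO])}$ & $=$ & $\rsimpalts \; (\rdistinct{(\rflts \; [\RZERO, \RZERO])}{\varnothing})$
 & $=$ & $\rsimpalts \; [] \; = \; \RZERO$\\
$\rsimp{(\sum [\RZERO, \RCHAR{a}])}$ & $=$ & $\rsimpalts \; (\rdistinct{(\rflts \; [\RZERO, \RCHAR{a}])}{\varnothing})$
 & $=$ & $\rsimpalts \; [\RCHAR{a}] \; = \; \RCHAR{a}$
\end{tabular}
\end{center}
The first result is $\RZERO$, and the second result $\RCHAR{a}$ is good.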
\noindent
We shall prove that any good regular expression is
a fixed point of $\textit{rsimp}$.
First we prove an auxiliary lemma:
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1063	\begin{lemma}\label{goodaltsNonalt}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1064	If $\good \; \sum rs$, then $\rflts\; rs = rs$.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1065	\end{lemma}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1066	\begin{proof}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1067	By an induction on $\sum rs$. The inductive rules are the cases
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1068	for $\good$.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1069	\end{proof}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1070	\noindent
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1071	Now we are ready to prove that good regular expressions are invariant
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1072	with respect to $\rsimp{}$:
614 d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1073	\begin{lemma}\label{test}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1074	If $\good \;r$ then $\rsimp{r} = r$.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1075	\end{lemma}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1076	\begin{proof}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1077	By an induction on the inductive cases of $\good$, using lemmas
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1078	\ref{goodaltsNonalt} and \ref{rdistinctOnDistinct}.
Lemma \ref{goodaltsNonalt} is used in the alternative
case where two or more elements are present in the list.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1081	\end{proof}
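\noindent
The alternative case can be spelled out as the following chain, which is a sketch under two assumptions: $\good \; (\sum rs)$ holds with $rs$ containing at least two elements, and $\rsimp{}$ unfolds on alternatives as $\rsimpalts \; (\rdistinct{(\rflts \; (\map \; \rsimp{} \; rs))}{\varnothing})$:
\begin{center}
\begin{tabular}{lcll}
$\rsimp{(\sum rs)}$
 & $=$ & $\rsimpalts \; (\rdistinct{(\rflts \; (\map \; \rsimp{} \; rs))}{\varnothing})$ & \\
 & $=$ & $\rsimpalts \; (\rdistinct{(\rflts \; rs)}{\varnothing})$ & (inductive hypothesis on the good children)\\
 & $=$ & $\rsimpalts \; (\rdistinct{rs}{\varnothing})$ & (Lemma \ref{goodaltsNonalt})\\
 & $=$ & $\rsimpalts \; rs$ & (Lemma \ref{rdistinctOnDistinct}, as $\textit{isDistinct} \; rs$)\\
 & $=$ & $\sum rs$ & ($rs$ has at least two elements)
\end{tabular}
\end{center}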
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1082	\noindent
Below we show a property involving $\rflts$, $\textit{rdistinct}$,
$\rsimp{}$ and $\rsimp_{ALTS}$,
which requires Lemma \ref{good1} to go through smoothly:
614 d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1086	\begin{lemma}\label{flattenRsimpalts}
An application of $\rsimp_{ALTS}$ can be ``absorbed''
if its output is prepended to a list which is then passed to $\rflts$:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1089	\begin{center}
$\rflts \; ( (\rsimp_{ALTS} \;
(\rdistinct{(\rflts \; (\map \; \rsimp{}\; rs))}{\varnothing})) ::
\map \; \rsimp{} \; rs' ) =
\rflts \; ( (\rdistinct{(\rflts \; (\map \; \rsimp{}\; rs))}{\varnothing}) @ (
\map \; \rsimp{} \; rs'))$
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1095	\end{center}
614 d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1096
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1097
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1098	\end{lemma}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1099	\begin{proof}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1100	By \ref{good1}.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1101	\end{proof}
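\noindent
A possible way to unfold this short proof is a case analysis on the de-duplicated list. In the sketch below, $rs_d$ and $l$ are abbreviations introduced only for this illustration: $rs_d$ stands for $\rdistinct{(\rflts \; (\map \; \rsimp{}\; rs))}{\varnothing}$ and $l$ for $\map \; \rsimp{} \; rs'$. The defining clauses of $\rsimpalts$ and $\rflts$ are assumed:
\begin{center}
\begin{tabular}{lll}
$rs_d$ & $\rflts \; ((\rsimpalts \; rs_d) :: l)$ & $\rflts \; (rs_d @ l)$\\
\hline
$[]$ & $\rflts \; (\RZERO :: l) = \rflts \; l$ & $\rflts \; l$\\
$[r]$ & $\rflts \; (r :: l)$ & $\rflts \; (r :: l)$\\
$r_1 :: r_2 :: rs''$ & $\rflts \; ((\sum rs_d) :: l) = rs_d \; @ \; \rflts \; l$ & $\rflts \; rs_d \; @ \; \rflts \; l$
\end{tabular}
\end{center}
In the last row the two sides agree once $\rflts \; rs_d = rs_d$. This holds because every element of $rs_d$ is $\textit{nonAlt}$ by Lemma \ref{nonaltFltsRd}, and good (hence different from $\RZERO$) by Lemmas \ref{flts3Obv} and \ref{good1}.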
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1102	\noindent
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1103
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1104
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1105
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1106
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1107
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1108	We are also ready to prove that $\textit{rsimp}$ is idempotent.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1109	\subsubsection{$\rsimp$ is Idempotent}
The idempotency of $\rsimp$ is very useful for
rewriting regular expression terms into the forms
required by the key steps of the later
closed-form proofs.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1114	\begin{lemma}\label{rsimpIdem}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1115	$\rsimp{r} = \rsimp{(\rsimp{r})}$
614 d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1116	\end{lemma}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1117
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1118	\begin{proof}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1119	By \ref{test} and \ref{good1}.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1120	\end{proof}
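\noindent
For a concrete instance, take $r = \sum [\RCHAR{a}, \sum [\RCHAR{a}, \RCHAR{b}]]$ with $\RCHAR{a} \neq \RCHAR{b}$. This input is an example chosen here, and the computation assumes the alternative clause of $\rsimp{}$. The first application flattens and de-duplicates:
\begin{center}
\begin{tabular}{lcl}
$\rsimp{r}$
 & $=$ & $\rsimpalts \; (\rdistinct{(\rflts \; [\RCHAR{a}, \sum [\RCHAR{a}, \RCHAR{b}]])}{\varnothing})$\\
 & $=$ & $\rsimpalts \; (\rdistinct{[\RCHAR{a}, \RCHAR{a}, \RCHAR{b}]}{\varnothing})$\\
 & $=$ & $\rsimpalts \; [\RCHAR{a}, \RCHAR{b}]$\\
 & $=$ & $\sum [\RCHAR{a}, \RCHAR{b}]$
\end{tabular}
\end{center}
The result $\sum [\RCHAR{a}, \RCHAR{b}]$ is good, so by Lemma \ref{test} a second application leaves it unchanged:
\begin{center}
$\rsimp{(\rsimp{r})} = \rsimp{(\sum [\RCHAR{a}, \RCHAR{b}])} = \sum [\RCHAR{a}, \RCHAR{b}] = \rsimp{r}$
\end{center}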
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1121	\noindent
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1122	This property means we do not have to repeatedly
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1123	apply simplification in each step, which justifies
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1124	our definition of $\blexersimp$.
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1125	This is in contrast to the work of Sulzmann and Lu where
80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1126	the simplification is applied in a fixpoint manner.
614 d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1127
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1128
On the other hand, once $\rsimp{}$ has been applied to a regular
expression, we can insert further $\rsimp{}$ applications
wherever we need them without changing the result:
614 d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1132	\begin{corollary}\label{headOneMoreSimp}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1133	The following properties hold, directly from \ref{rsimpIdem}:
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1134
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1135	\begin{itemize}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1136	\item
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1137	$\map \; \rsimp{} \; (r :: rs) = \map \; \rsimp{} \; (\rsimp{r} :: rs)$
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1138	\item
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1139	$\rsimp{(\RALTS{rs})} = \rsimp{(\RALTS{\map \; \rsimp{} \; rs})}$
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1140	\end{itemize}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1141	\end{corollary}
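\noindent
As an illustration, the first property can be obtained by unfolding $\map$ once
and applying Lemma \ref{rsimpIdem} to the head of the list. The following is
only a sketch of the calculation, not the formalised proof:
\begin{center}
\begin{tabular}{lcl}
$\map \; \rsimp{} \; (\rsimp{r} :: rs)$ & $=$ & $\rsimp{(\rsimp{r})} :: (\map \; \rsimp{} \; rs)$\\
& $=$ & $\rsimp{r} :: (\map \; \rsimp{} \; rs)$\\
& $=$ & $\map \; \rsimp{} \; (r :: rs)$
\end{tabular}
\end{center}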
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1142	\noindent
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	1143	This corollary will be useful in the rewriting steps of the later closed-form proofs.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1144	Similarly, we state the following useful facts:
614 d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1145	\begin{lemma}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1146	The following equalities hold if $r = \rsimp{r'}$ for some $r'$:
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1147	\begin{itemize}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1148	\item
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1149	If $r = \sum rs$ then $\rsimpalts \; rs = \sum rs$.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1150	\item
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1151	If $r = \sum rs$ then $\rdistinct{rs}{\varnothing} = rs$.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1152	\item
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1153	$\rsimpalts \; (\rdistinct{\rflts \; [r]}{\varnothing}) = r$.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1154	\end{itemize}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1155	\end{lemma}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1156	\begin{proof}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1157	By application of lemmas \ref{rsimpIdem} and \ref{good1}.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1158	\end{proof}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1159
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1160	\noindent
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1161	With the idempotency of $\textit{rsimp}$ and its corollaries,
614 d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1162	we can start proving some key equalities leading to the
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1163	closed forms.
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1164	Next we present a few equivalent terms under $\textit{rsimp}$.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1165	To make the notation more concise,
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1166	we use $r_1 \sequal r_2$ to denote that $\rsimp{r_1} = \rsimp{r_2}$.
80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1167	%\begin{center}
80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1168	%\begin{tabular}{lcl}
80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1169	% $a \sequal b$ & $ \dn$ & $ \textit{rsimp} \; a = \textit{rsimp} \; b$
80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1170	%\end{tabular}
80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1171	%\end{center}
80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1172	%\noindent
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1173	%\vspace{0em}
614 d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1174	\begin{lemma}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1175	The following equivalences hold:
614 d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1176	\begin{itemize}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1177	\item
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1178	$\rsimpalts \; (\RZERO :: rs) \sequal \rsimpalts\; rs$
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1179	\item
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1180	$\rsimpalts \; rs \sequal \rsimpalts \; (\map \; \rsimp{} \; rs)$
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1181	\item
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1182	$\RALTS{\RALTS{rs}} \sequal \RALTS{rs}$
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1183	\item
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1184	$\sum ((\sum rs_a) :: rs_b) \sequal \sum (rs_a @ rs_b)$
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1185	\item
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1186	$\RALTS{rs} \sequal \RALTS{\map \; \rsimp{} \; rs}$
614 d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1187	\end{itemize}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1188	\end{lemma}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1189	\begin{proof}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1190	By induction on the lists involved.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1191	\end{proof}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1192	\noindent
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1193	The above allows us to prove
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1194	two similar equalities (which are a bit more involved).
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1195	They say that we can flatten the elements
614 d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1196	before simplification and still get the same result.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1197	\begin{lemma}\label{simpFlatten3}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1198	One can flatten the inside $\sum$ of a $\sum$ if it is being
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1199	simplified. Concretely,
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1200	\begin{itemize}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1201	\item
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1202	If for all $r \in rs, rs', rs''$, we have $\good \; r $
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1203	or $r = \RZERO$, then $\sum (rs' @ rs @ rs'') \sequal
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1204	\sum (rs' @ [\sum rs] @ rs'')$ holds.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1205	\item
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1206	As a corollary, $\sum (rs' @ [\sum rs] @ rs'') \sequal \sum (rs' @ rs @ rs'')$.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1207	\end{itemize}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1208	\end{lemma}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1209	\begin{proof}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1210	By rewriting steps involving the use of \ref{test} and \ref{rdistinctConcatGeneral}.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1211	The second sub-lemma is a corollary of the previous.
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1212	\end{proof}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1213	%Rewriting steps not put in--too long and complicated-------------------------------
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1214	\begin{comment}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1215	\begin{center}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1216	$\rsimp{\sum (rs' @ rs @ rs'')} \stackrel{def of bsimp}{=}$ \\
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1217	$\rsimpalts \; (\rdistinct{\rflts \; ((\map \; \rsimp{}\; rs') @ (\map \; \rsimp{} \; rs ) @ (\map \; \rsimp{} \; rs''))}{\varnothing})$ \\
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1218	$\stackrel{by \ref{test}}{=}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1219	\rsimpalts \; (\rdistinct{(\rflts \; rs' @ \rflts \; rs @ \rflts \; rs'')}{
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1220	\varnothing})$\\
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1221	$\stackrel{by \ref{rdistinctConcatGeneral}}{=}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1222	\rsimpalts \; (\rdistinct{\rflts \; rs'}{\varnothing} @ \rdistinct{(
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1223	\rflts\; rs @ \rflts \; rs'')}{\rflts \; rs'})$\\
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1224
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1225	\end{center}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1226	\end{comment}
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1227	%Rewriting steps not put in--too long and complicated-------------------------------
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1228	\noindent
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1229
d5e9bcb384ec reorder Chengsong parents: 613 diff changeset	1230
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1231	We need more equalities like the above to enable a closed form lemma,
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1232	for which we need to introduce a few rewrite relations
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1233	to help
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1234	us obtain them.
554 15d182ffbc76 more Chengsong parents: 553 diff changeset	1235
610 d028c662a3df data files Chengsong parents: 609 diff changeset	1236	\subsection{The Rewrite Relations $\hrewrite$, $\scfrewrites$, $\frewrite$ and $\grewrite$}
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1237	Inspired by the success we had in the correctness proof
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1238	in \ref{Bitcoded2},
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1239	we follow suit here, defining atomic simplification
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1240	steps as ``small-step'' rewriting steps. This allows capturing
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1241	similarities between terms that would be otherwise
aecf1ddf3541 more Chengsong parents: 554 diff changeset	1242	hard to express.
aecf1ddf3541 more Chengsong parents: 554 diff changeset	1243
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1244	We use $\hrewrite$ for a one-step atomic rewrite in the
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1245	simplification of a regular expression,
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1246	$\frewrite$ for one-step rewrites of a list of regular expressions that
aecf1ddf3541 more Chengsong parents: 554 diff changeset	1247	cover all operations carried out by $\rflts$, and $\grewrite$ for
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1248	one-step rewrites of a list of regular expressions that cover the operations of both $\rflts$ and $\textit{rdistinct}$.
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1249	Their reflexive transitive closures are used to denote zero or many steps,
aecf1ddf3541 more Chengsong parents: 554 diff changeset	1250	as was the case in the previous chapter.
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1251	As we have already
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1252	done something similar, the presentation about
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1253	these rewriting rules will be more concise than that in \ref{Bitcoded2}.
554 15d182ffbc76 more Chengsong parents: 553 diff changeset	1254	To differentiate between the rewriting steps for annotated regular expressions
15d182ffbc76 more Chengsong parents: 553 diff changeset	1255	and $\rrexp$s, we add characters $h$ and $g$ below the squiggly arrow symbol
15d182ffbc76 more Chengsong parents: 553 diff changeset	1256	to mean atomic simplification transitions
15d182ffbc76 more Chengsong parents: 553 diff changeset	1257	of $\rrexp$s and $\rrexp$ lists, respectively.
15d182ffbc76 more Chengsong parents: 553 diff changeset	1258
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1259
aecf1ddf3541 more Chengsong parents: 554 diff changeset	1260
aecf1ddf3541 more Chengsong parents: 554 diff changeset	1261
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1262	\begin{figure}[H]
554 15d182ffbc76 more Chengsong parents: 553 diff changeset	1263	\begin{center}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1264	\begin{mathpar}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1265	\inferrule[RSEQ0L]{}{\RZERO \cdot r_2 \hrewrite \RZERO\\}
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1266
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1267	\inferrule[RSEQ0R]{}{r_1 \cdot \RZERO \hrewrite \RZERO\\}
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1268
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1269	\inferrule[RSEQ1]{}{(\RONE \cdot r) \hrewrite r\\}\\
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1270
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1271	\inferrule[RSEQL]{ r_1 \hrewrite r_2}{r_1 \cdot r_3 \hrewrite r_2 \cdot r_3\\}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1272
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1273	\inferrule[RSEQR]{ r_3 \hrewrite r_4}{r_1 \cdot r_3 \hrewrite r_1 \cdot r_4\\}\\
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1274
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1275	\inferrule[RALTSChild]{r \hrewrite r'}{\sum (rs_1 @ [r] @ rs_2) \hrewrite \sum (rs_1 @ [r'] @ rs_2)\\}
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1276
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1277	\inferrule[RALTS0]{}{\sum (rs_a @ [\RZERO] @ rs_b) \hrewrite \sum (rs_a @ rs_b)}
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1278
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1279	\inferrule[RALTSNested]{}{\sum (rs_a @ [\sum rs_1] @ rs_b) \hrewrite \sum (rs_a @ rs_1 @ rs_b)}
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1280
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1281	\inferrule[RALTSNil]{}{ \sum [] \hrewrite \RZERO\\}
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1282
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1283	\inferrule[RALTSSingle]{}{ \sum [r] \hrewrite r\\}
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1284
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1285	\inferrule[RALTSDelete]{\\ r_1 = r_2}{\sum (rs_a @ [r_1] @ rs_b @ [r_2] @ rs_c) \hrewrite \sum (rs_a @ [r_1] @ rs_b @ rs_c)}
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1286
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1287	\end{mathpar}
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1288	\end{center}
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1289	\caption{List of one-step rewrite rules for r-regular expressions ($\hrewrite$)}\label{hRewrite}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1290	\end{figure}
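
\noindent
To illustrate how these rules compose, consider the following chain of
$\hrewrite$ steps. This is an illustrative sketch only: $r_1$ and $r_2$ stand for
arbitrary r-regular expressions, and $[r_1, r_2]$ abbreviates $r_1 :: r_2 :: []$.
\begin{center}
\begin{tabular}{lcll}
$\sum [\RZERO, \sum [r_1, r_2], r_1]$ & $\hrewrite$ & $\sum [\sum [r_1, r_2], r_1]$ & (by RALTS0)\\
& $\hrewrite$ & $\sum [r_1, r_2, r_1]$ & (by RALTSNested)\\
& $\hrewrite$ & $\sum [r_1, r_2]$ & (by RALTSDelete)
\end{tabular}
\end{center}
\noindent
Hence $\sum [\RZERO, \sum [r_1, r_2], r_1] \hrewrites \sum [r_1, r_2]$.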
554 15d182ffbc76 more Chengsong parents: 553 diff changeset	1291
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1292
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1293	Like $\rightsquigarrow_s$, it is
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1294	convenient to define rewrite rules for a list of regular expressions,
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1295	where each element can rewrite in many steps to the other (scf stands for
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1296	li\emph{s}t \emph{c}losed \emph{f}orm). This relation is similar to the
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1297	$\stackrel{s*}{\rightsquigarrow}$ for annotated regular expressions.
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1298
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1299	\begin{figure}[H]
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1300	\begin{center}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1301	\begin{mathpar}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1302	\inferrule{}{[] \scfrewrites [] }
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1303
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1304	\inferrule{r \hrewrites r' \\ rs \scfrewrites rs'}{r :: rs \scfrewrites r' :: rs'}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1305	\end{mathpar}
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1306	\end{center}
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1307	\caption{Rewrite rules for a list of r-regular expressions, where each element rewrites in many steps ($\scfrewrites$)}\label{scfRewrite}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1308	\end{figure}
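
\noindent
As a small illustrative instance (with $r_1$ and $r_2$ arbitrary r-regular
expressions, and $[r_1, r_2]$ abbreviating $r_1 :: r_2 :: []$), we have
$\RONE \cdot r_1 \hrewrite r_1$ by RSEQ1 and $\sum [r_2] \hrewrite r_2$
by RALTSSingle, so rewriting element by element gives
\begin{center}
$[\RONE \cdot r_1, \; \sum [r_2]] \scfrewrites [r_1, r_2]$
\end{center}
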
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1309	%frewrite
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1310	List of one-step rewrite rules for flattening
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1311	a list of regular expressions ($\frewrite$):
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1312	\begin{figure}[H]
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1313	\begin{center}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1314	\begin{mathpar}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1315	\inferrule{}{\RZERO :: rs \frewrite rs \\}
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1316
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1317	\inferrule{}{(\sum rs) :: rs_a \frewrite rs @ rs_a \\}
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1318
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1319	\inferrule{rs_1 \frewrite rs_2}{r :: rs_1 \frewrite r :: rs_2}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1320	\end{mathpar}
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1321	\end{center}
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1322	\caption{List of one-step rewrite rules characterising the $\rflts$ operation on a list}\label{fRewrites}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1323	\end{figure}
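
\noindent
For example, assuming $r_1$, $r_2$ and $r_3$ are arbitrary r-regular expressions
and writing $[r_1, r_2]$ for $r_1 :: r_2 :: []$, a list can be flattened in two
$\frewrite$ steps, each applied below the head $r_1$ by means of the third rule:
\begin{center}
\begin{tabular}{lcll}
$[r_1, \RZERO, \sum [r_2, r_3]]$ & $\frewrite$ & $[r_1, \sum [r_2, r_3]]$ & (removing $\RZERO$)\\
& $\frewrite$ & $[r_1, r_2, r_3]$ & (opening up the nested $\sum$)
\end{tabular}
\end{center}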
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1324
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1325	List of one-step rewrite rules for flattening and de-duplicating
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1326	a list of regular expressions ($\grewrite$):
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1327	\begin{figure}[H]
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1328	\begin{center}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1329	\begin{mathpar}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1330	\inferrule{}{\RZERO :: rs \grewrite rs \\}
532 cc54ce075db5 restructured Chengsong parents: diff changeset	1331
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1332	\inferrule{}{(\sum rs) :: rs_a \grewrite rs @ rs_a \\}
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1333
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1334	\inferrule{rs_1 \grewrite rs_2}{r :: rs_1 \grewrite r :: rs_2}
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1335
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1336	\inferrule[dB]{}{rs_a @ [a] @ rs_b @ [a] @ rs_c \grewrite rs_a @ [a] @ rs_b @ rs_c}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1337	\end{mathpar}
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1338	\end{center}
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1339	\caption{List of one-step rewrite rules characterising the $\rflts$ and $\textit{rdistinct}$
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1340	operations}\label{gRewrite}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1341	\end{figure}
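\noindent
As an illustrative sketch (with $r_1$ and $r_2$ arbitrary r-regular expressions
and $[r_1, r_2]$ abbreviating $r_1 :: r_2 :: []$), the following chain first
flattens and then de-duplicates:
\begin{center}
\begin{tabular}{lcll}
$[r_1, \sum [r_2, r_1]]$ & $\grewrite$ & $[r_1, r_2, r_1]$ & (opening up the nested $\sum$)\\
& $\grewrite$ & $[r_1, r_2]$ & (by dB)
\end{tabular}
\end{center}
\noindent
The first step is also a $\frewrite$ step, whereas the second one is not,
because $\frewrite$ has no counterpart of the dB rule.
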
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1342	\noindent
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1343	We define
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1344	two separate list rewriting relations $\frewrite$ and $\grewrite$.
611 bc1df466150a more Chengsong parents: 610 diff changeset	1345	The rewriting steps that take place during
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1346	flattening are characterised by $\frewrite$.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1347	The rewrite relation $\grewrite$ characterises both flattening and de-duplicating.
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1348	Sometimes $\grewrites$ is slightly too powerful
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1349	so we would rather use $\frewrites$ to prove
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1350	%because we only
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1351	equalities related to $\rflts$.
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1352	%certain equivalence under the rewriting steps of $\frewrites$.
556 c27f04bb2262 hello Chengsong parents: 555 diff changeset	1353	For example, when proving the closed-form for the alternative regular expression,
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1354	one of the equalities needed is:
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1355	\begin{center}
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1356	$\sum (\rDistinct \;\; (\map \; (\_ \backslash x) \; (\rflts \; rs)) \;\; \varnothing) \sequal
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1357	\sum (\rDistinct \;\; (\rflts \; (\map \; (\_ \backslash x) \; rs)) \;\; \varnothing)
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1358	$
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1359	\end{center}
556 c27f04bb2262 hello Chengsong parents: 555 diff changeset	1360	\noindent
c27f04bb2262 hello Chengsong parents: 555 diff changeset	1361	This is proved by first showing
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1362	\begin{lemma}\label{earlyLaterDerFrewrites}
556 c27f04bb2262 hello Chengsong parents: 555 diff changeset	1363	$\map \; (\_ \backslash x) \; (\rflts \; rs) \frewrites
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1364	\rflts \; (\map \; (\_ \backslash x) \; rs)$
556 c27f04bb2262 hello Chengsong parents: 555 diff changeset	1365	\end{lemma}
c27f04bb2262 hello Chengsong parents: 555 diff changeset	1366	\noindent
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1367	and then the equivalence between two terms
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1368	that can reduce in many steps to each other:
556 c27f04bb2262 hello Chengsong parents: 555 diff changeset	1369	\begin{lemma}\label{frewritesSimpeq}
c27f04bb2262 hello Chengsong parents: 555 diff changeset	1370	If $rs_1 \frewrites rs_2 $, then $\sum (\rDistinct \; rs_1 \; \varnothing) \sequal
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1371	\sum (\rDistinct \; rs_2 \; \varnothing)$.
556 c27f04bb2262 hello Chengsong parents: 555 diff changeset	1372	\end{lemma}
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1373	\noindent
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1374	These two lemmas can both be proven using a straightforward induction (and
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1375	the proofs for them are therefore omitted).
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1376
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1377	Now the above equality can be derived with ease:
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1378	\begin{corollary}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1379	$\sum (\rDistinct \;\; (\map \; (\_ \backslash x) \; (\rflts \; rs)) \;\; \varnothing) \sequal
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1380	\sum (\rDistinct \;\; (\rflts \; (\map \; (\_ \backslash x) \; rs)) \;\; \varnothing)
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1381	$
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1382	\end{corollary}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1383	\begin{proof}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1384	By lemmas \ref{earlyLaterDerFrewrites} and \ref{frewritesSimpeq}.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1385	\end{proof}
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1386	But this trick will not work for $\grewrites$.
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1387	For example, a rewriting step in proving
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1388	closed forms is:
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1389	\begin{center}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1390	$\rsimp{(\rsimpalts \; (\map \; (\_ \backslash x) \; (\rdistinct{(\rflts \; (\map \; (\rsimp{} \; \circ \; (\lambda r. \rderssimp{r}{xs}))))}{\varnothing})))}$\\
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1391	$=$ \\
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1392	$\rsimp{(\rsimpalts \; (\rdistinct{(\map \; (\_ \backslash x) \; (\rflts \; (\map \; (\rsimp{} \; \circ \; (\lambda r. \rderssimp{r}{xs})))) ) }{\varnothing}))} $
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1393	\noindent
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1394	\end{center}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1395	For this, one would hope to have a rewriting relation between the two lists involved,
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1396	similar to \ref{earlyLaterDerFrewrites}. However, it turns out that
556 c27f04bb2262 hello Chengsong parents: 555 diff changeset	1397	\begin{center}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1398	$\map \; (\_ \backslash x) \; (\rDistinct \; rs \; rset) \grewrites \rDistinct \; (\map \;
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1399	(\_ \backslash x) \; rs) \; ( rset \backslash x)$
556 c27f04bb2262 hello Chengsong parents: 555 diff changeset	1400	\end{center}
c27f04bb2262 hello Chengsong parents: 555 diff changeset	1401	\noindent
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1402	does \textbf{not} hold in general.
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1403	For this rewriting step we will introduce a slightly more cumbersome
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1404	proof technique later.
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1405	The point is that $\frewrite$
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1406	allows us to prove equivalence in a straightforward way that is
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1407	not possible for $\grewrite$.
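
\noindent
One possible counterexample can be sketched as follows, under the assumption
that derivatives and $\textit{rdistinct}$ on r-regular expressions are defined
in the usual way. The names $c$, $d$ and $e$ are introduced only for this
illustration: let $c$ be the r-regular expression for the character $x$, and
let $d$ and $e$ be the r-regular expressions for two further, distinct characters.
Take $rs = [\sum [c, d]]$ and $rset = \{\sum [c, e]\}$. Then
\begin{center}
\begin{tabular}{lcl}
$\map \; (\_ \backslash x) \; (\rDistinct \; rs \; rset)$ & $=$ & $[\sum [\RONE, \RZERO]]$\\
$\rDistinct \; (\map \; (\_ \backslash x) \; rs) \; (rset \backslash x)$ & $=$ & $[]$
\end{tabular}
\end{center}
\noindent
On the left, $\sum [c, d]$ is kept because it is not in $rset$. On the right,
its derivative $\sum [\RONE, \RZERO]$ coincides with the derivative of
$\sum [c, e]$ and is therefore removed. With the rules of Figure \ref{gRewrite},
the list $[\sum [\RONE, \RZERO]]$ can only be rewritten to $[\RONE, \RZERO]$ and
then to $[\RONE]$, but never to $[]$.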
555 aecf1ddf3541 more Chengsong parents: 554 diff changeset	1408
556 c27f04bb2262 hello Chengsong parents: 555 diff changeset	1409
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1410	\subsubsection{Terms That Can Be Rewritten Using $\hrewrites$, $\grewrites$, and $\frewrites$}
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1411	In this part, we present lemmas stating
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1412	pairs of r-regular expressions and r-regular expression lists
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1413	where one can be rewritten in many steps to the other.
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1414	Most of the proofs of these lemmas are straightforward, using
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1415	an induction on the corresponding rewriting relations.
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1416	These proofs will therefore be omitted when this is the case.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1417	We present in the following lemma a few pairs of terms that are rewritable via
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1418	$\grewrites$:
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1419	\begin{lemma}\label{gstarRdistinctGeneral}
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1420	\mbox{}
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1421	\begin{itemize}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1422	\item
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1423	$rs_1 @ rs \grewrites rs_1 @ (\rDistinct \; rs \; rs_1)$
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1424	\item
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1425	$rs \grewrites \rDistinct \; rs \; \varnothing$
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1426	\item
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1427	$rs_a @ (\rDistinct \; rs \; rs_a) \grewrites rs_a @ (\rDistinct \;
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1428	rs \; (\{\RZERO\} \cup rs_a))$
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1429	\item
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1430	$rs \;\; @ \;\; \rDistinct \; rs_a \; rset \grewrites rs @ \rDistinct \; rs_a \;
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1431	(rset \cup rs)$
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1432
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1433	\end{itemize}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1434	\end{lemma}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1435	\noindent
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1436	If a list $rs_1$ can be rewritten to a list $rs_2$ via $\grewrites$,
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1437	then the two are equivalent under $\rsimp{}$:
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1438	\begin{lemma}\label{grewritesSimpalts}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1439	\mbox{}
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1440	If $rs_1 \grewrites rs_2$, then
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1441	we have the following equivalences:
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1442	\begin{itemize}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1443	\item
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1444	$\sum rs_1 \sequal \sum rs_2$
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1445	\item
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1446	$\rsimpalts \; rs_1 \sequal \rsimpalts \; rs_2$
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1447	\end{itemize}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1448	\end{lemma}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1449	\noindent
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1450	Here are a few connecting lemmas showing that
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1451	if a list of regular expressions can be rewritten using $\grewrites$ or $\frewrites $ or
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1452	$\scfrewrites$,
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1453	then an alternative constructor taking the list can also be rewritten using $\hrewrites$:
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1454	\begin{lemma}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1455	\begin{itemize}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1456	\item
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1457	If $rs \grewrites rs'$ then $\sum rs \hrewrites \sum rs'$.
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1458	\item
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1459	If $rs \grewrites rs'$ then $\sum rs \hrewrites \rsimpalts \; rs'$
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1460	\item
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1461	If $rs_1 \scfrewrites rs_2$ then $\sum (rs @ rs_1) \hrewrites \sum (rs @ rs_2)$
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1462	\item
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1463	If $rs_1 \scfrewrites rs_2$ then $\sum rs_1 \hrewrites \sum rs_2$
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1464
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1465	\end{itemize}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1466	\end{lemma}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1467	\noindent
Now comes the core of the proof,
which says that once one regular expression can be rewritten to another via $\hrewrites$,
the two are equivalent under $\textit{rsimp}$:
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1471	\begin{lemma}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1472	If $r_1 \hrewrites r_2$ then $r_1 \sequal r_2$.
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1473	\end{lemma}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1474
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1475	\noindent
Similar to what we did in Chapter~\ref{Bitcoded2},
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1477	we prove that if one can rewrite from one r-regular expression ($r$)
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1478	to the other ($r'$), after taking derivatives one can still rewrite
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1479	the first ($r\backslash c$) to the other ($r'\backslash c$).
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1480	\begin{lemma}\label{interleave}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1481	If $r \hrewrites r' $ then $\rder{c}{r} \hrewrites \rder{c}{r'}$
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1482	\end{lemma}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1483	\noindent
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	1484	This allows us to prove more $\mathbf{rsimp}$-equivalent terms, involving $\backslash_r$.
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1485	\begin{lemma}\label{insideSimpRemoval}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1486	$\rsimp{(\rder{c}{(\rsimp{r})})} = \rsimp{(\rder{c}{r})} $
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1487	\end{lemma}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1488	\noindent
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1489	\begin{proof}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1490	By \ref{interleave} and \ref{rsimpIdem}.
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1491	\end{proof}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1492	\noindent
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1493	And this unlocks more equivalent terms:
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1494	\begin{lemma}\label{Simpders}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1495	As corollaries of \ref{insideSimpRemoval}, we have
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1496	\begin{itemize}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1497	\item
620 ae6010c14e49 chap6 almost done Chengsong parents: 618 diff changeset	1498	If $s \neq []$ then $\rderssimp{r}{s} = \rsimp{( r \backslash_{rs} s)}$.
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1499	\item
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1500	$\rsimpalts \; (\map \; (\_ \backslash_r x) \;
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1501	(\rdistinct{rs}{\varnothing})) \sequal
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1502	\rsimpalts \; (\rDistinct \;
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1503	(\map \; (\_ \backslash_r x) rs) \;\varnothing )$
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1504	\end{itemize}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1505	\end{lemma}
611 bc1df466150a more Chengsong parents: 610 diff changeset	1506	\begin{proof}
Both parts follow from lemma \ref{insideSimpRemoval}.%and \ref{distinctDer}.
611 bc1df466150a more Chengsong parents: 610 diff changeset	1509	\end{proof}
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1510	\noindent
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1511
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1512	\subsection{Closed Forms for $\sum rs$, $r_1\cdot r_2$ and $r^*$}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1513	Lemma \ref{Simpders} leads to our first closed form,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1514	which is for the alternative regular expression:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1515	\begin{theorem}\label{altsClosedForm}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1516	\mbox{}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1517	\begin{center}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1518	$\rderssimp{(\sum rs)}{s} \sequal
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1519	\sum \; (\map \; (\rderssimp{\_}{s}) \; rs)$
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1520	\end{center}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1521	\end{theorem}
556 c27f04bb2262 hello Chengsong parents: 555 diff changeset	1522	\noindent
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1523	\begin{proof}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1524	By a reverse induction on the string $s$.
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1525	One rewriting step, as we mentioned earlier,
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1526	involves
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1527	\begin{center}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1528	$\rsimpalts \; (\map \; (\_ \backslash x) \;
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1529	(\rdistinct{(\rflts \; (\map \; (\rsimp{} \; \circ \;
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1530	(\lambda r. \rderssimp{r}{xs}))))}{\varnothing}))
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1531	\sequal
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1532	\rsimpalts \; (\rdistinct{(\map \; (\_ \backslash x) \;
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1533	(\rflts \; (\map \; (\rsimp{} \; \circ \;
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1534	(\lambda r. \rderssimp{r}{xs})))) ) }{\varnothing}) $.
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1535	\end{center}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1536	This can be proven by a combination of
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1537	\ref{grewritesSimpalts}, \ref{gstarRdistinctGeneral}, \ref{rderRsimpAltsCommute}, and
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1538	\ref{insideSimpRemoval}.
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1539	\end{proof}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1540	\noindent
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1541	This closed form has a variant which can be more convenient in later proofs:
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1542	\begin{corollary}\label{altsClosedForm1}
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1543	If $s \neq []$ then
$\rderssimp{(\sum \; rs)}{s} =
\rsimp{(\sum \; (\map \; (\rderssimp{\_}{s}) \; rs))}$.
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1546	\end{corollary}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1547	\noindent
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1548	The harder closed forms are the sequence and star ones.
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	1549	Before we obtain them, some preliminary definitions
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1550	are needed to make proof statements concise.
556 c27f04bb2262 hello Chengsong parents: 555 diff changeset	1551
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	1552
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	1553	\subsubsection{Closed Form for Sequence Regular Expressions}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1554	For the sequence regular expression,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1555	let's first look at a series of derivative steps on it
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1556	(assuming that each time when a derivative is taken,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1557	the head of the sequence is always nullable):
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1558	\begin{center}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1559	\begin{tabular}{llll}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1560	$r_1 \cdot r_2$ &
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1561	$\longrightarrow_{\backslash c}$ &
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1562	$r_1\backslash c \cdot r_2 + r_2 \backslash c$ &
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1563	$ \longrightarrow_{\backslash c'} $ \\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1564	\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1565	$(r_1 \backslash cc' \cdot r_2 + r_2 \backslash c') + r_2 \backslash cc'$ &
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1566	$\longrightarrow_{\backslash c''} $ &
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1567	$((r_1 \backslash cc'c'' \cdot r_2 + r_2 \backslash c'') + r_2 \backslash c'c'')
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1568	+ r_2 \backslash cc'c''$ &
$ \longrightarrow_{\backslash c'''} \quad \ldots$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1570	\end{tabular}
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1571	\end{center}
Roughly speaking $(r_1 \cdot r_2) \backslash s$ can be expressed as
a giant alternative taking a list of terms
$[r_1 \backslash_r s \cdot r_2, r_2 \backslash_r s'', r_2 \backslash_r s_1'', \ldots]$,
where the head of the list is always the term
representing a match involving only $r_1$, and the tail of the list consists of
terms of the shape $r_2 \backslash_r s''$, $s''$ being a suffix of $s$.
This intuition is also echoed by Murugesan and Sundaram \cite{Murugesan2014},
who gave
a pencil-and-paper derivation of $(r_1 \cdot r_2)\backslash s$:
532 cc54ce075db5 restructured Chengsong parents: diff changeset	1581	\begin{center}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1582	\begin{tabular}{lc}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1583	$L \; [ (r_1 \cdot r_2) \backslash_r (c_1 :: c_2 :: \ldots c_n) ]$ & $ =$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1584	\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1585	\rule{0pt}{3ex} $L \; [ ((r_1 \backslash_r c_1) \cdot r_2 +
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1586	(\delta\; (\nullable \; r_1) \; (r_2 \backslash_r c_1) )) \backslash_r
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1587	(c_2 :: \ldots c_n) ]$ &
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1588	$=$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1589	\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1590	\rule{0pt}{3ex} $L \; [ ((r_1 \backslash_r c_1c_2 \cdot r_2 +
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1591	(\delta \; (\nullable \; r_1) \; (r_2 \backslash_r c_1c_2)))
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1592	$ & \\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1593	\\
$\quad + (\delta \ (\nullable \; r_1 \backslash_r c_1)\; (r_2 \backslash_r c_2) ))
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1595	\backslash_r (c_3 \ldots c_n) ]$ & $\ldots$ \\
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	1596	\end{tabular}
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1597	\end{center}
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1598	\noindent
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1599	The $\delta$ function
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1600	returns $r$ when the boolean condition
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1601	$b$ evaluates to true and
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1602	$\ZERO_r$ otherwise:
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1603	\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1604	\begin{tabular}{lcl}
$\delta \; b\; r$ & $\dn$ & $r \quad \textit{if} \; b \; \textit{is true}$\\
& $\dn$ & $\ZERO_r \quad \textit{otherwise}$
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1607	\end{tabular}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1608	\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1609	\noindent
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1610	Note that the term
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1611	\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1612	\begin{tabular}{lc}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1613	\rule{0pt}{3ex} $((r_1 \backslash_r c_1c_2 \cdot r_2 +
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1614	(\delta \; (\nullable \; r_1) \; (r_2 \backslash_r c_1c_2)))
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1615	$ & \\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1616	\\
$\quad + (\delta \ (\nullable \; r_1 \backslash_r c_1)\; (r_2 \backslash_r c_2) ))
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1618	\backslash_r (c_3 \ldots c_n)$ &\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1619	\end{tabular}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1620	\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1621	\noindent
does not faithfully
represent what the intermediate derivatives actually look like
when the head of the sequence in one or more intermediate results
$r_1 \backslash s' \cdot r_2$ is not nullable.
For example, when $r_1$ and $r_1 \backslash_r c_1$ are not nullable,
the regular expression would look like
\[
r_1 \backslash_r c_1c_2 \cdot r_2
\]
instead of
\[
(r_1 \backslash_r c_1c_2 \cdot r_2 + \ZERO_r ) + \ZERO_r.
\]
The redundant $\ZERO_r$s are not created in the
first place.
A closed form needs to take this into account (because
closed forms require exact equality rather than language equivalence)
and only generate the
$r_2 \backslash_r s''$ terms satisfying the property
\begin{center}
$\exists s'. \; s'@s'' = s \;\; \land \;\;
r_1 \backslash s' \; \textit{is nullable}$.
\end{center}
Given the arguments $s$ and $r_1$, we denote the list of strings
$s''$ satisfying the above property as $\vsuf{s}{r_1}$.
The function $\vsuf{\_}{\_}$ is defined recursively on the structure of the string:\footnote{
Perhaps a better name for it would be ``NullablePrefixSuffix'',
to distinguish it from the list of \emph{all} prefixes of $s$, but
that is a bit too long for a function name and we have yet to find
a more concise and easy-to-understand name.}
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	1652	\begin{center}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1653	\begin{tabular}{lcl}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1654	$\vsuf{[]}{\_} $ & $=$ & $[]$\\
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1655	$\vsuf{c::cs}{r_1}$ & $ =$ & $ \textit{if} \; (\rnullable{r_1}) \; \textit{then} \; (\vsuf{cs}{(\rder{c}{r_1})}) @ [c :: cs]$\\
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1656	&& $\textit{else} \; (\vsuf{cs}{(\rder{c}{r_1}) }) $
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1657	\end{tabular}
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	1658	\end{center}
671a83abccf3 haha Chengsong parents: 557 diff changeset	1659	\noindent
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1660	The list starts with shorter suffixes
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1661	and ends with longer ones (in other words, the string elements $s''$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1662	in the list $\vsuf{s}{r_1}$ are sorted
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1663	in the same order as that of the terms $r_2\backslash s''$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1664	appearing in $(r_1\cdot r_2)\backslash s$).
In essence, $\vsuf{\_}{\_}$ performs a
``virtual derivative'' of $r_1 \cdot r_2$: instead of producing
the entire result $(r_1 \cdot r_2) \backslash s$,
it only stores strings,
with each string $s''$ standing for a term $r_2 \backslash s''$
that occurs in $(r_1\cdot r_2)\backslash s$.
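As a small worked example (ours, for illustration only; the choice $r_1 = a^*$ and
$s = ab$ is an assumption and strings are written by juxtaposition),
note that both $a^*$ and $\rder{a}{a^*}$ are nullable, so unfolding the definition gives
\begin{center}
\begin{tabular}{lcl}
$\vsuf{ab}{a^*}$ & $=$ & $(\vsuf{b}{(\rder{a}{a^*})}) \; @ \; [ab]$\\
& $=$ & $((\vsuf{[]}{(\rder{b}{(\rder{a}{a^*})})}) \; @ \; [b]) \; @ \; [ab]$\\
& $=$ & $([] \; @ \; [b]) \; @ \; [ab]$\\
& $=$ & $[b, \; ab]$
\end{tabular}
\end{center}
\noindent
This agrees with the order of the $r_2$-terms in the unsimplified derivative
$(a^* \cdot r_2) \backslash ab =
(a^* \backslash ab \cdot r_2 + r_2 \backslash b) + r_2 \backslash ab$,
where the shorter suffix $b$ appears before the longer suffix $ab$.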
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	1671
With $\textit{Suffix}$ we are almost ready to express the
sequence regular expression's closed form,
but a few more definitions are needed first.
The first is the flattening function $\sflat{\_}$,
which takes an alternative regular expression and produces a flattened version
of it.
It is needed to convert
a left-associative nesting of alternatives into
a flattened list:
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	1682	\[
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1683	\sum(\ldots ((r_1 + r_2) + r_3) + \ldots)
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1684	\stackrel{\sflat{\_}}{\rightarrow}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1685	\sum[r_1, r_2, r_3, \ldots]
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	1686	\]
671a83abccf3 haha Chengsong parents: 557 diff changeset	1687	\noindent
The definitions of $\sflat{\_}$ and the helper functions
$\sflataux{\_}$ and $\sflataux{\_}'$ are given below.
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1690	\begin{center}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1691	\begin{tabular}{lcl}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1692	$\sflataux{\sum r :: rs}$ & $\dn$ & $\sflataux{r} @ rs$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1693	$\sflataux{\sum []}$ & $ \dn $ & $ []$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1694	$\sflataux r$ & $\dn$ & $ [r]$
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1695	\end{tabular}
532 cc54ce075db5 restructured Chengsong parents: diff changeset	1696	\end{center}
cc54ce075db5 restructured Chengsong parents: diff changeset	1697
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1698	\begin{center}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1699	\begin{tabular}{lcl}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1700	$\sflat{(\sum r :: rs)}$ & $\dn$ & $\sum (\sflataux{r} @ rs)$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1701	$\sflat{\sum []}$ & $ \dn $ & $ \sum []$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1702	$\sflat r$ & $\dn$ & $ r$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1703	\end{tabular}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1704	\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1705
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1706	\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1707	\begin{tabular}{lcl}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1708	$\sflataux{[]}'$ & $ \dn $ & $ []$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1709	$\sflataux{ (r_1 + r_2) :: rs }'$ & $\dn$ & $r_1 :: r_2 :: rs$\\
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1710	$\sflataux{r :: rs}'$ & $\dn$ & $ r::rs$
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1711	\end{tabular}
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1712	\end{center}
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	1713	\noindent
$\sflataux{\_}$ breaks up nested alternative regular expressions
of the left-associated shape $(\ldots((r_1 + r_2) + r_3) + \ldots )$
into a ``balanced'' list $[r_1,\, r_2 ,\, r_3, \ldots]$.
It returns the singleton list $[r]$ otherwise.
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1718	$\sflat{\_}$ works the same as $\sflataux{\_}$, except that it keeps
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1719	the output type a regular expression, not a list.
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	1720	$\sflataux{\_}$ and $\sflat{\_}$ are only recursive on the
671a83abccf3 haha Chengsong parents: 557 diff changeset	1721	first element of the list.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1722	$\sflataux{\_}'$ takes a list of regular expressions as input, and outputs
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1723	a list of regular expressions.
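For illustration, here is a worked instance of the definitions above (our example,
assuming that $r_1$ is not itself an alternative and that $r + r'$ abbreviates
$\sum [r, r']$):
\begin{center}
\begin{tabular}{lcl}
$\sflataux{((r_1 + r_2) + r_3)}$ & $=$ & $\sflataux{(r_1 + r_2)} \; @ \; [r_3]$\\
& $=$ & $(\sflataux{r_1} \; @ \; [r_2]) \; @ \; [r_3]$\\
& $=$ & $[r_1, \; r_2, \; r_3]$\\
$\sflataux{[(r_1 + r_2), \; r_3]}'$ & $=$ & $[r_1, \; r_2, \; r_3]$
\end{tabular}
\end{center}
\noindent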
The use of $\sflataux{\_}$ and $\sflataux{\_}'$ becomes clear once we have
$\textit{createdBySequence}$ defined:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1726	\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1727	\begin{mathpar}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1728	\inferrule{\mbox{}}{\textit{createdBySequence}\; (r_1 \cdot r_2)}
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	1729
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1730	\inferrule{\textit{createdBySequence} \; r_1}{\textit{createdBySequence} \;
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1731	(r_1 + r_2)}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1732	\end{mathpar}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1733	\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1734	\noindent
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1735	The predicate $\textit{createdBySequence}$ is used to describe the shape of
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1736	the derivative regular expressions $(r_1\cdot r_2) \backslash s$:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1737	\begin{lemma}\label{recursivelyDerseq}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1738	It is always the case that
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1739	\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1740	$\textit{createdBySequence} \; ( (r_1\cdot r_2) \backslash_r s) $
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1741	\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1742	holds.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1743	\end{lemma}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1744	\begin{proof}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1745	By a reverse induction on the string $s$, where the inductive cases are $[]$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1746	and $xs @ [x]$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1747	\end{proof}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1748	\noindent
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1749	If we have a regular expression $r$ whose shape
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1750	fits into those described by $\textit{createdBySequence}$,
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1751	then we can convert between
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1752	$r \backslash_r c$ and $(\sflataux{r}) \backslash_r c$ with
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1753	$\sflataux{\_}'$:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1754	\begin{lemma}\label{sfauIdemDer}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1755	If $\textit{createdBySequence} \; r$, then
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1756	\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1757	$\sflataux{ r \backslash_r c} =
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1758	\llparenthesis (\map \; (\_ \backslash_r c) \; (\sflataux{r}) ) \rrparenthesis''$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1759	\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1760	holds.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1761	\end{lemma}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1762	\begin{proof}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1763	By a simple induction on the inductive cases of $\textit{createdBySequence}$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1765	\end{proof}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1766
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1767	Now we are ready to express
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1768	the shape of $(r_1 \cdot r_2) \backslash_r s$:
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	1769	\begin{lemma}\label{seqSfau0}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1770	$\sflataux{(r_1 \cdot r_2) \backslash_r s} = (r_1 \backslash_r s) \cdot r_2
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1771	:: (\map \; (r_2 \backslash_r \_) \; (\textit{Suffix} \; s \; r_1))$
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	1772	\end{lemma}
671a83abccf3 haha Chengsong parents: 557 diff changeset	1773	\begin{proof}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1774	By a reverse induction on the string $s$, where the inductive cases
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1775	are $[]$ and $xs @ [x]$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1776	For the inductive case, we know that $\textit{createdBySequence} \; ((r_1 \cdot r_2)
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1777	\backslash_r xs)$ holds from lemma \ref{recursivelyDerseq},
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1778	which can be used to prove
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	1779	\[
671a83abccf3 haha Chengsong parents: 557 diff changeset	1780	\map \; (r_2 \backslash_r \_) \; (\vsuf{[x]}{(r_1 \backslash_r xs)}) \;\; @ \;\;
671a83abccf3 haha Chengsong parents: 557 diff changeset	1781	\map \; (\_ \backslash_r x) \; (\map \; (r_2 \backslash_r \_) \; (\vsuf{xs}{r_1}))
671a83abccf3 haha Chengsong parents: 557 diff changeset	1782	\]
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1783	=
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	1784	\[
671a83abccf3 haha Chengsong parents: 557 diff changeset	1785	\map \; (r_2 \backslash_r \_) \; (\vsuf{xs @ [x]}{r_1})
671a83abccf3 haha Chengsong parents: 557 diff changeset	1786	\]
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1787	using lemma \ref{sfauIdemDer}.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1788	This equality enables the inductive case to go through.
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	1789	\end{proof}
671a83abccf3 haha Chengsong parents: 557 diff changeset	1790	\noindent
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1791	This lemma says that $(r_1\cdot r_2)\backslash s$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1792	can be flattened into a list whose head and tail meet the description
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1793	we gave earlier.
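
\noindent
As a quick sanity check of lemma \ref{seqSfau0}, consider the single-character
string $s = [c]$ and suppose that $r_1$ is nullable.
This is only an illustrative instance; it assumes, in line with the earlier
definition, that $\textit{Suffix} \; [c] \; r_1$ evaluates to $[[c]]$ when $r_1$ is nullable.
\begin{center}
\begin{tabular}{lcl}
$(r_1 \cdot r_2) \backslash_r c$ & $=$ & $(r_1 \backslash_r c) \cdot r_2 + r_2 \backslash_r c$\\
$\sflataux{(r_1 \cdot r_2) \backslash_r c}$ & $=$ & $[(r_1 \backslash_r c) \cdot r_2, \; r_2 \backslash_r c]$\\
$\map \; (r_2 \backslash_r \_) \; (\textit{Suffix} \; [c] \; r_1)$ & $=$ & $[r_2 \backslash_r c]$
\end{tabular}
\end{center}
\noindent
The head of the flattened list is $(r_1 \backslash_r c) \cdot r_2$ and its tail is
the list of derivatives of $r_2$ obtained from $\textit{Suffix}$, as the lemma predicts.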
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1794	%Note that this lemma does $\mathbf{not}$ depend on any
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1795	%specific definitions we used,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1796	%allowing people investigating derivatives to get an alternative
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1797	%view of what $r_1 \cdot r_2$ is.
532 cc54ce075db5 restructured Chengsong parents: diff changeset	1798
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1799	We now use $\textit{createdBySequence}$ and
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1800	$\sflataux{\_}$ to describe an intuition
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1801	behind the sequence closed form.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1802	If two regular expressions only differ in the way their
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1803	alternatives are nested, then we should be able to get the same result
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1804	once we apply simplification to both of them:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1805	\begin{lemma}\label{sflatRsimpeq}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1806	If $r$ is created from a sequence through
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1807	a series of derivatives
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1808	(i.e. if $\textit{createdBySequence} \; r$ holds),
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1809	and $\sflataux{r} = rs$,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1810	then
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1812	\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1813	$\textit{rsimp} \; r = \textit{rsimp} \; (\sum \; rs)$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1814	\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1815	holds.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1816	\end{lemma}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1817	\begin{proof}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1818	By an induction on the inductive cases of $\textit{createdBySequence}$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1819	\end{proof}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1820
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1821	Now we are ready for the closed form
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1822	for the sequence regular expressions (without the inner applications
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1823	of simplifications):
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1824	\begin{lemma}\label{seqClosedFormGeneral}
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	1825	$\rsimp{\sflat{(r_1 \cdot r_2) \backslash s} }
671a83abccf3 haha Chengsong parents: 557 diff changeset	1826	=\rsimp{(\sum ( (r_1 \backslash s) \cdot r_2 ::
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1827	\map\; (r_2 \backslash \_) \; (\vsuf{s}{r_1})))}$
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	1828	\end{lemma}
671a83abccf3 haha Chengsong parents: 557 diff changeset	1829	\begin{proof}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1830	We know that
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1831	$\sflataux{(r_1 \cdot r_2) \backslash_r s} = (r_1 \backslash_r s) \cdot r_2
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1832	:: (\map \; (r_2 \backslash_r \_) \; (\textit{Suffix} \; s \; r_1))$
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1833	holds
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1834	by lemma \ref{seqSfau0}.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1835	The claim then follows from lemma \ref{sflatRsimpeq}.
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	1836	\end{proof}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1837	Together with the idempotency property of $\rsimp{}$ (lemma \ref{rsimpIdem}),
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1838	it is possible to convert the above lemma to obtain the
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1839	proper closed form for $\backslash_{rsimps}$ rather than $\backslash_r$,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1840	that is, for derivatives nested with simplification:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1841	\begin{theorem}\label{seqClosedForm}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1842	$\rderssimp{(r_1 \cdot r_2)}{s} = \rsimp{(\sum ((r_1 \backslash s) \cdot r_2
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1843	:: (\map \; (r_2 \backslash \_) \; (\vsuf{s}{r_1}))))}$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1844	\end{theorem}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1845	\begin{proof}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1846	By a case analysis of the string $s$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1847	When $s$ is an empty list, the rewrite is straightforward.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1848	When $s$ is a non-empty list, the
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1849	lemmas \ref{seqClosedFormGeneral} and \ref{Simpders} apply,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1850	making the proof go through.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1851	\end{proof}
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	1852	\subsubsection{Closed Forms for Star Regular Expressions}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1853	The closed form for the star regular expression involves tricks similar to
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1854	those used for the sequence regular expression.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1855	The $\textit{Suffix}$ function is now replaced by something
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1856	slightly more complex, because the growth pattern of star
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1857	regular expressions' derivatives is a bit different:
564 3cbcd7cda0a9 more Chengsong parents: 562 diff changeset	1858	\begin{center}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1859	\begin{tabular}{lclc}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1860	$r^* $ & $\longrightarrow_{\backslash c}$ &
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1861	$(r\backslash c) \cdot r^*$ & $\longrightarrow_{\backslash c'}$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1862	\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1863	$r \backslash cc' \cdot r^* + r \backslash c' \cdot r^*$ &
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1864	$\longrightarrow_{\backslash c''}$ &
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1865	$(r \backslash cc'c'' \cdot r^* + r \backslash c'' \cdot r^*) +
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1866	(r \backslash c'c'' \cdot r^* + r \backslash c'' \cdot r^*)$ &
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1867	$\longrightarrow_{\backslash c'''}$ \\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1868	\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1869	$\ldots$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1870	\end{tabular}
564 3cbcd7cda0a9 more Chengsong parents: 562 diff changeset	1871	\end{center}
3cbcd7cda0a9 more Chengsong parents: 562 diff changeset	1872	When we have a string $s = c :: c' :: c'' \ldots$ such that $r \backslash c$, $r \backslash cc'$, $r \backslash c'$,
3cbcd7cda0a9 more Chengsong parents: 562 diff changeset	1873	$r \backslash cc'c''$, $r \backslash c'c''$, $r\backslash c''$ etc. are all nullable,
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1874	the number of terms in $r^* \backslash s$ will grow exponentially, rather than linearly
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1875	as in the sequence case.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1876	The good news is that the function $\textit{rsimp}$ will again
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	1877	ignore the difference between different nesting patterns of alternatives,
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1878	and an exponentially growing star derivative such as
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1879	\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1880	$(r \backslash cc'c'' \cdot r^* + r \backslash c'' \cdot r^*) +
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1881	(r \backslash c'c'' \cdot r^* + r \backslash c'' \cdot r^*) $
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1882	\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1883	can be treated as
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1884	\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1885	$\RALTS{[r \backslash cc'c'' \cdot r^*, r \backslash c'' \cdot r^*,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1886	r \backslash c'c'' \cdot r^*, r \backslash c'' \cdot r^*]}$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1887	\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1888	which can be de-duplicated by $\rDistinct$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1889	and therefore bounded finitely.
564 3cbcd7cda0a9 more Chengsong parents: 562 diff changeset	1890
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1891	%and then de-duplicate terms of the form ($s'$ being a substring of $s$).
80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	1892	%This allows us to use a similar technique as $r_1 \cdot r_2$ case,
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1893
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1894	Now the crux of this section is finding a suitable description
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1895	for $rs$ where
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1896	\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1897	$\rderssimp{r^*}{s} = \rsimp{\sum rs}$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1898	\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1899	holds.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1900	In addition, the list $rs$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1901	shall be in the form of
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1902	$\map \; (\lambda s'. r\backslash s' \cdot r^*) \; Ss$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1903	Here $Ss$ is a list of strings; in the sequence
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1904	closed form, for example, it is specified as $\textit{Suffix} \; s \; r_1$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1905	To get $Ss$ for the star regular expression,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1906	we need to introduce $\starupdate$ and $\starupdates$:
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	1907	\begin{center}
671a83abccf3 haha Chengsong parents: 557 diff changeset	1908	\begin{tabular}{lcl}
671a83abccf3 haha Chengsong parents: 557 diff changeset	1909	$\starupdate \; c \; r \; [] $ & $\dn$ & $[]$\\
671a83abccf3 haha Chengsong parents: 557 diff changeset	1910	$\starupdate \; c \; r \; (s :: Ss)$ & $\dn$ & \\
671a83abccf3 haha Chengsong parents: 557 diff changeset	1911	& & $\textit{if} \;
620 ae6010c14e49 chap6 almost done Chengsong parents: 618 diff changeset	1912	(\rnullable \; (r \backslash_{rs} s))$ \\
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	1913	& & $\textit{then} \;\; (s @ [c]) :: [c] :: (
671a83abccf3 haha Chengsong parents: 557 diff changeset	1914	\starupdate \; c \; r \; Ss)$ \\
671a83abccf3 haha Chengsong parents: 557 diff changeset	1915	& & $\textit{else} \;\; (s @ [c]) :: (
671a83abccf3 haha Chengsong parents: 557 diff changeset	1916	\starupdate \; c \; r \; Ss)$
671a83abccf3 haha Chengsong parents: 557 diff changeset	1917	\end{tabular}
671a83abccf3 haha Chengsong parents: 557 diff changeset	1918	\end{center}
671a83abccf3 haha Chengsong parents: 557 diff changeset	1919	\begin{center}
671a83abccf3 haha Chengsong parents: 557 diff changeset	1920	\begin{tabular}{lcl}
671a83abccf3 haha Chengsong parents: 557 diff changeset	1921	$\starupdates \; [] \; r \; Ss$ & $\dn$ & $Ss$\\
671a83abccf3 haha Chengsong parents: 557 diff changeset	1922	$\starupdates \; (c :: cs) \; r \; Ss$ & $\dn$ & $\starupdates \; cs \; r \; (
671a83abccf3 haha Chengsong parents: 557 diff changeset	1923	\starupdate \; c \; r \; Ss)$
671a83abccf3 haha Chengsong parents: 557 diff changeset	1924	\end{tabular}
671a83abccf3 haha Chengsong parents: 557 diff changeset	1925	\end{center}
671a83abccf3 haha Chengsong parents: 557 diff changeset	1926	\noindent
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1927	Assume that
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1928	\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1929	$\rderssimp{r^*}{s} = \rsimp{(\sum \map \; (\lambda s'. r\backslash s' \cdot r^*) \; Ss)}$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1930	\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1931	holds.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1932	The idea of $\starupdate$ and $\starupdates$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1933	is to update $Ss$ when another
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1934	derivative is taken on $\rderssimp{r^*}{s}$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1935	w.r.t.\ a character $c$ and a string $s'$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1936	respectively.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1937	Both $\starupdate$ and $\starupdates$ take three arguments as input:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1938	the new character $c$ or string $s$ to take the derivative with respect to,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1939	the regular expression
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1940	$r$ under the star $r^*$, and the
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1941	list of strings $Ss$ for the derivative $r^* \backslash s$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1942	up until this point
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1943	such that
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1944	\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1945	$(r^*) \backslash s = \sum_{s' \in Ss} (r\backslash s') \cdot r^*$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1946	\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1947	is satisfied.
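
\noindent
To illustrate how $\starupdate$ and $\starupdates$ evolve the list of strings,
here is a small worked example of our own (it is not part of the formalisation).
Take $r = a + a \cdot b$ and start from $Ss = [[a]]$, the list that corresponds to
$r^* \backslash a = (r \backslash a) \cdot r^*$.
Among the derivatives involved, $r \backslash a$ and $r \backslash ab$ are nullable,
whereas $r \backslash b$ is not.
\begin{center}
\begin{tabular}{lcl}
$\starupdate \; b \; r \; [[a]]$ & $=$ & $[[a, b], \; [b]]$\\
$\starupdate \; a \; r \; [[a, b], \; [b]]$ & $=$ & $[[a, b, a], \; [a], \; [b, a]]$\\
$\starupdates \; [b, a] \; r \; [[a]]$ & $=$ & $[[a, b, a], \; [a], \; [b, a]]$
\end{tabular}
\end{center}
\noindent
Every entry $s$ is extended to $s @ [c]$, and each entry with a nullable
$r \backslash s$ additionally contributes the fresh one-character string $[c]$,
which records that a new iteration of the star has started.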
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1948
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1949	Functions $\starupdate$ and $\starupdates$ characterise what the
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1950	star derivatives will look like once ``straightened out'' into lists.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1951	The helper functions for such operations will be similar to
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1952	$\sflat{\_}$, $\sflataux{\_}$ and $\sflataux{\_}'$, which we defined for sequences.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1953	We use similar symbols to denote them, with a $*$ subscript to mark the difference.
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	1954	\begin{center}
671a83abccf3 haha Chengsong parents: 557 diff changeset	1955	\begin{tabular}{lcl}
671a83abccf3 haha Chengsong parents: 557 diff changeset	1956	$\hflataux{r_1 + r_2}$ & $\dn$ & $\hflataux{r_1} @ \hflataux{r_2}$\\
671a83abccf3 haha Chengsong parents: 557 diff changeset	1957	$\hflataux{r}$ & $\dn$ & $[r]$
671a83abccf3 haha Chengsong parents: 557 diff changeset	1958	\end{tabular}
671a83abccf3 haha Chengsong parents: 557 diff changeset	1959	\end{center}
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	1960
812e5d112f49 more changes Chengsong parents: 556 diff changeset	1961	\begin{center}
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	1962	\begin{tabular}{lcl}
671a83abccf3 haha Chengsong parents: 557 diff changeset	1963	$\hflat{r_1 + r_2}$ & $\dn$ & $\sum (\hflataux {r_1} @ \hflataux {r_2}) $\\
671a83abccf3 haha Chengsong parents: 557 diff changeset	1964	$\hflat{r}$ & $\dn$ & $r$
671a83abccf3 haha Chengsong parents: 557 diff changeset	1965	\end{tabular}
671a83abccf3 haha Chengsong parents: 557 diff changeset	1966	\end{center}
671a83abccf3 haha Chengsong parents: 557 diff changeset	1967	\noindent
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1968	These definitions are tailor-made for dealing with alternatives that have
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1969	originated from a star's derivatives.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1970	A typical star derivative always has the structure of a balanced binary tree:
564 3cbcd7cda0a9 more Chengsong parents: 562 diff changeset	1971	\begin{center}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1972	$(r \backslash cc'c'' \cdot r^* + r \backslash c'' \cdot r^*) +
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1973	(r \backslash c'c'' \cdot r^* + r \backslash c'' \cdot r^*) $
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	1974	\end{center}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1975	All of the nested structures of alternatives
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1976	generated from derivatives are binary, and therefore
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1977	$\hflat{\_}$ and $\hflataux{\_}$ only deal with binary alternatives.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1978	$\hflat{\_}$ ``untangles'' such nested alternatives as follows:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1979	\[
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1980	\sum ((r_1 + r_2) + (r_3 + r_4)) + \ldots \;
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1981	\stackrel{\hflat{\_}}{\longrightarrow} \;
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1982	\RALTS{[r_1, r_2, \ldots, r_n]}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1983	\]
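As a small illustrative unfolding (assuming that none of $r_1, \ldots, r_4$ is
itself an alternative), the definition of $\hflataux{\_}$ gives
\begin{center}
\begin{tabular}{lcl}
$\hflataux{(r_1 + r_2) + (r_3 + r_4)}$ & $=$ & $\hflataux{r_1 + r_2} \; @ \; \hflataux{r_3 + r_4}$\\
& $=$ & $([r_1] \; @ \; [r_2]) \; @ \; ([r_3] \; @ \; [r_4])$\\
& $=$ & $[r_1, r_2, r_3, r_4]$
\end{tabular}
\end{center}
\noindent
so that $\hflat{(r_1 + r_2) + (r_3 + r_4)} = \sum [r_1, r_2, r_3, r_4]$.
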
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1984	Here is a lemma stating the recursive property of $\starupdate$ and $\starupdates$,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1985	with the helpers $\hflat{\_}$ and $\hflataux{\_}$\footnote{The function $\textit{concat}$ takes a list of lists
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1986	and merges each of the element lists to form a flattened list.}:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1987	\begin{lemma}\label{stupdateInduct1}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1988	\mbox{}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1989	For a list of strings $Ss$, the following hold.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1990	\begin{itemize}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1991	\item
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1992	If we do a derivative on the terms
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1993	$r\backslash_r s \cdot r^*$ (where $s$ is taken from the list $Ss$),
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1994	the result will be the same as if we apply $\starupdate$ to $Ss$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1995	\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1996	\begin{tabular}{c}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1997	$\textit{concat} \; (\map \; (\hflataux{\_} \circ ( (\_\backslash_r x)
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1998	\circ (\lambda s.\;\; (r \backslash_r s) \cdot r^*)))\; Ss )\;
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	1999	$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2000	\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2001	$=$ \\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2002	\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2003	$\map \; (\lambda s. (r \backslash_r s) \cdot (r^*)) \;
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2004	(\starupdate \; x \; r \; Ss)$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2005	\end{tabular}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2006	\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2007	\item
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2008	$\starupdates$ is ``composable'' w.r.t.\ a derivative.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2009	It piggybacks the character $x$ to the tail of the string
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2010	$xs$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2011	\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2012	\begin{tabular}{c}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2013	$\textit{concat} \; (\map \; \hflataux{\_} \;
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2014	(\map \; (\_\backslash_r x) \;
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2015	(\map \; (\lambda s.\;\; (r \backslash_r s) \cdot
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2016	(r^*) ) \; (\starupdates \; xs \; r \; Ss))))$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2017	\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2018	$=$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2019	\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2020	$\map \; (\lambda s.\;\; (r\backslash_r s) \cdot (r^*)) \;
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2021	(\starupdates \; (xs @ [x]) \; r \; Ss)$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2022	\end{tabular}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2023	\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2024	\end{itemize}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2025	\end{lemma}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2026
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2027	\begin{proof}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2028	Part 1 is by induction on $Ss$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2029	Part 2 is by induction on $xs$, where $Ss$ is left to take arbitrary values.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2030	\end{proof}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2031
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2032
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2033	Like $\textit{createdBySequence}$, we need
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2034	a predicate for ``star-created'' regular expressions:
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2035	\begin{center}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2036	\begin{mathpar}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2037	\inferrule{\mbox{}}{ \textit{createdByStar}\; \RSEQ{ra}{\RSTAR{rb}} }
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2038
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2039	\inferrule{ \textit{createdByStar} \; r_1\; \land \; \textit{createdByStar} \; r_2 }{\textit{createdByStar} \; (r_1 + r_2) }
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2040	\end{mathpar}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2041	\end{center}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2042	\noindent
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2043	All regular expressions created by taking derivatives of
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2044	$r_1 \cdot (r_2)^*$ satisfy the $\textit{createdByStar}$ predicate:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2045	\begin{lemma}\label{starDersCbs}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2046	$\textit{createdByStar} \; ((r_1 \cdot r_2^*) \backslash_r s) $ holds.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2047	\end{lemma}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2048	\begin{proof}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2049	By a reverse induction on $s$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2050	\end{proof}
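\noindent
To see why Lemma \ref{starDersCbs} is plausible, consider a single derivative step.
This is only an illustration; it assumes the usual derivative clauses for sequences
and stars from the earlier chapters. If $r_1$ is nullable, then
\begin{center}
$(r_1 \cdot r_2^*) \backslash_r c \;=\;
(r_1 \backslash_r c) \cdot r_2^* \;+\; (r_2 \backslash_r c) \cdot r_2^*$
\end{center}
and otherwise the result is just $(r_1 \backslash_r c) \cdot r_2^*$.
In both cases every summand is a sequence ending in a star, which matches the first rule of
$\textit{createdByStar}$, and the alternative is covered by the second rule.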
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2051	If a regular expression conforms to the shape of a star's derivative,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2052	then we can push an application of $\hflataux{\_}$ inside a derivative of it:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2053	\begin{lemma}\label{hfauPushin}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2054	If $\textit{createdByStar} \; r$ holds, then
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2055	\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2056	$\hflataux{r \backslash_r c} = \textit{concat} \; (
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2057	\map \; \hflataux{\_} (\map \; (\_\backslash_r c) \;(\hflataux{r})))$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2058	\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2059	holds.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2060	\end{lemma}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2061	\begin{proof}
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	2062	By an induction on the inductive cases of $\textit{createdByStar}$.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2063	\end{proof}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2064	%This is not entirely true for annotated regular expressions:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2065	%%TODO: bsimp bders \neq bderssimp
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2066	%\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2067	% $(1+ (c\cdot \ASEQ{bs}{c^*}{c} ))$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2068	%\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2069	%For bit-codes, the order in which simplification is applied
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2070	%might cause a difference in the location they are placed.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2071	%If we want something like
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2072	%\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2073	% $\bderssimp{r}{s} \myequiv \bsimp{\bders{r}{s}}$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2074	%\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2075	%Some "canonicalization" procedure is required,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2076	%which either pushes all the common bitcodes to nodes
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2077	%as senior as possible:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2078	%\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2079	% $_{bs}(_{bs_1 @ bs'}r_1 + _{bs_1 @ bs''}r_2) \rightarrow _{bs @ bs_1}(_{bs'}r_1 + _{bs''}r_2) $
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2080	%\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2081	%or does the reverse. However bitcodes are not of interest if we are talking about
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2082	%the $\llbracket r \rrbracket$ size of a regex.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2083	%Therefore for the ease and simplicity of producing a
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2084	%proof for a size bound, we are happy to restrict ourselves to
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2085	%unannotated regular expressions, and obtain such equalities as
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2086	%TODO: rsimp sflat
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2087	% The simplification of a flattened out regular expression, provided it comes
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2088	%from the derivative of a star, is the same as the one nested.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2089
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2090
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2091
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2092	Now we introduce an inductive property
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2093	for $\starupdate$ and $\hflataux{\_}$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2094	\begin{lemma}\label{starHfauInduct}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2095	If we take the derivative of $r_0^*$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2096	with respect to a string $c::s$,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2097	and then flatten the result,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2098	we obtain a list
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2099	whose elements have the shape $(r_0\backslash_r s') \cdot r_0^*$ with $s' \in sS$,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2100	where $sS = \starupdates \; s \; r_0 \; [[c]]$. Namely,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2101	\begin{center}
620 ae6010c14e49 chap6 almost done Chengsong parents: 618 diff changeset	2102	$\hflataux{(( (\rder{c}{r_0})\cdot(r_0^*))\backslash_{rs} s)} =
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2103	\map \; (\lambda s_1. (r_0 \backslash_r s_1) \cdot (r_0^*)) \;
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2104	(\starupdates \; s \; r_0 \; [[c]])$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2105	\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2106	holds.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2107	\end{lemma}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2108	\begin{proof}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2109	By reverse induction on $s$, the cases
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2110	being $[]$ and $s@[c]$. Lemmas \ref{hfauPushin} and \ref{starDersCbs} are used.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2111	\end{proof}
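\noindent
As a sanity check of Lemma \ref{starHfauInduct}, consider the base case $s = []$.
This sketch assumes the defining clauses $\starupdates \; [] \; r \; Ss = Ss$ and
$\hflataux{r} = [r]$ for any $r$ that is not a binary alternative, as introduced earlier in this chapter:
\begin{center}
\begin{tabular}{l}
$\hflataux{((\rder{c}{r_0}) \cdot (r_0^*)) \backslash_{rs} []} \;=\; [(\rder{c}{r_0}) \cdot (r_0^*)]$\\
$\map \; (\lambda s_1. (r_0 \backslash_r s_1) \cdot (r_0^*)) \; (\starupdates \; [] \; r_0 \; [[c]])
\;=\; [(r_0 \backslash_r [c]) \cdot (r_0^*)]$
\end{tabular}
\end{center}
Both sides agree because $r_0 \backslash_r [c]$ is the same as $\rder{c}{r_0}$.
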
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2112	\noindent
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2113
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	2114	The function $\hflataux{\_}$ has a similar effect to $\textit{flatten}$:
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2115	\begin{lemma}\label{hflatauxGrewrites}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2116	$a :: rs \grewrites \hflataux{a} @ rs$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2117	\end{lemma}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2118	\begin{proof}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2119	By induction on $a$. $rs$ is set to take arbitrary values.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2120	\end{proof}
638 dd9dde2d902b comments till chap4 Chengsong parents: 625 diff changeset	2121	It is also not surprising that
dd9dde2d902b comments till chap4 Chengsong parents: 625 diff changeset	2122	two regular expressions differing only in terms
dd9dde2d902b comments till chap4 Chengsong parents: 625 diff changeset	2123	of the
dd9dde2d902b comments till chap4 Chengsong parents: 625 diff changeset	2124	nesting of parentheses are equivalent w.r.t. $\textit{rsimp}$:
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2125	\begin{lemma}\label{cbsHfauRsimpeq1}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2126	$\rsimp{(r_1 + r_2)} = \rsimp{(\RALTS{\hflataux{r_1} @ \hflataux{r_2}})}$
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2127	\end{lemma}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2128
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2129	\begin{proof}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2130	By using the rewriting relation $\rightsquigarrow$.
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2131	\end{proof}
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	2132	From this we obtain the following fact: a
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	2133	regular expression created by star
80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	2134	is the same as its flattened version, up to equivalence under $\textit{rsimp}$.
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2135	Formally,
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2136	\begin{lemma}\label{hfauRsimpeq2}
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	2137	$\textit{createdByStar} \; r \implies \rsimp{r} = \rsimp{\RALTS{\hflataux{r}}}$
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2138	\end{lemma}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2139	\begin{proof}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2140	By structural induction on $r$, where the induction rules
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2141	are those of $\createdByStar{\_}$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2142	Lemma \ref{cbsHfauRsimpeq1} is used in the inductive case.
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2143	\end{proof}
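\noindent
For instance, take three hypothetical regular expressions $r_a$, $r_b$ and $r_c$ and let
$r' = ((r_a \cdot r^*) + (r_b \cdot r^*)) + (r_c \cdot r^*)$, which satisfies $\textit{createdByStar}$.
Assuming that $\hflataux{\_}$ flattens nested binary alternatives and leaves sequences untouched
(as defined earlier in this chapter), Lemma \ref{hfauRsimpeq2} specialises to
\begin{center}
$\rsimp{r'} = \rsimp{\RALTS{[r_a \cdot r^*,\; r_b \cdot r^*,\; r_c \cdot r^*]}}$
\end{center}
so the nesting of the alternatives plays no role once $\textit{rsimp}$ is applied.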
564 3cbcd7cda0a9 more Chengsong parents: 562 diff changeset	2144
3cbcd7cda0a9 more Chengsong parents: 562 diff changeset	2145
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2146	%Here is a corollary that states the lemma in
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2147	%a more intuitive way:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2148	%\begin{corollary}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2149	% $\hflataux{r^* \backslash_r (c::xs)} = \map \; (\lambda s. (r \backslash_r s) \cdot
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2150	% (r^*))\; (\starupdates \; c\; r\; [[c]])$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2151	%\end{corollary}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2152	%\noindent
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2153	%Note that this is also agnostic of the simplification
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2154	%function we defined, and is therefore of more general interest.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2155
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2156	We combine these lemmas with the rewriting relation in the next lemma.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2157	\begin{lemma}\label{starClosedForm6Hrewrites}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2158	We have the following set of rewriting relations or equalities:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2159	\begin{itemize}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2160	\item
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2161	$\textit{rsimp} \; (r^* \backslash_r (c::s))
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2162	\sequal
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2163	\sum \; ( \map \; (\lambda s_1. (r\backslash_r s_1) \cdot r^*) \; (
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2164	\starupdates \; s \; r \; [ c::[]] ) )$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2165	\item
620 ae6010c14e49 chap6 almost done Chengsong parents: 618 diff changeset	2166	$r^* \backslash_{rsimps} (c::s) = \textit{rsimp} \; ( (
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2167	\sum ( (\map \; (\lambda s_1. (r\backslash_r s_1) \cdot r^*) \;
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2168	(\starupdates \;s \; r \; [ c::[] ])))))$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2169	\item
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2170	$\sum ( (\map \; (\lambda s. (r\backslash_r s) \cdot r^*) \; Ss))
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2171	\sequal
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2172	\sum ( (\map \; (\lambda s. (\textit{rsimp} \; (r\backslash_r s)) \cdot
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2173	r^*) \; Ss) )$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2174	\item
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2175	$\map \; (\lambda s. (r \backslash_r s) \cdot (r^*)) \; Ss
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2176	\scfrewrites
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2177	\map \; (\lambda s. (\rsimp{r \backslash_r s}) \cdot (r^*)) \; Ss$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2178	\item
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2179	$\sum ( \map \; (\lambda s_1. \;\;
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2180	(\textit{rsimp} \; (r \backslash_r s_1)) \cdot r^*) \; (\starupdates \;
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2181	s \; r \; [ c::[] ]))$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2182	$\sequal$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2183	$\sum ( \map \; (\lambda s_1. \;\;
620 ae6010c14e49 chap6 almost done Chengsong parents: 618 diff changeset	2184	( r \backslash_{rsimps} s_1) \cdot r^*) \; (\starupdates \;
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2185	s \; r \; [ c::[] ]))$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2186	\end{itemize}
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2187	\end{lemma}
671a83abccf3 haha Chengsong parents: 557 diff changeset	2188	\begin{proof}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2189	Part 1 leads to part 2.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2190	The rest of them are routine.
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2191	\end{proof}
671a83abccf3 haha Chengsong parents: 557 diff changeset	2192	\noindent
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2193	Next the closed form for star regular expressions can be derived:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2194	\begin{theorem}\label{starClosedForm}
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2195	$\rderssimp{r^*}{c::s} =
671a83abccf3 haha Chengsong parents: 557 diff changeset	2196	\rsimp{
671a83abccf3 haha Chengsong parents: 557 diff changeset	2197	(\sum (\map \; (\lambda s_1. (\rderssimp{r}{s_1})\cdot r^*) \;
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2198	(\starupdates \; s\; r \; [[c]])
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2199	)
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2200	)
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2201	}
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2202	$
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2203	\end{theorem}
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2204	\begin{proof}
671a83abccf3 haha Chengsong parents: 557 diff changeset	2205	By an induction on $s$.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2206	Lemmas \ref{rsimpIdem}, \ref{starHfauInduct}, \ref{starClosedForm6Hrewrites}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2207	and \ref{hfauRsimpeq2}
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2208	are used.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2209	In Lemma \ref{starClosedForm6Hrewrites}, the equalities are
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2210	used to link the LHS and RHS.
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2211	\end{proof}
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2212
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2213
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2214
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2215
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2216
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2217
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2218	%----------------------------------------------------------------------------------------
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2219	% SECTION ??
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2220	%----------------------------------------------------------------------------------------
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2221
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2222	%-----------------------------------
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2223	% SECTION syntactic equivalence under simp
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2224	%-----------------------------------
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2225
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2226
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2227	%----------------------------------------------------------------------------------------
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2228	% SECTION ALTS CLOSED FORM
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2229	%----------------------------------------------------------------------------------------
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2230	%\section{A Closed Form for \textit{ALTS}}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2231	%Now we prove that $rsimp (rders\_simp (RALTS rs) s) = rsimp (RALTS (map (\lambda r. rders\_simp r s) rs))$.
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2232	%
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2233	%
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2234	%There are a few key steps, one of these steps is
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2235	%
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2236	%
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2237	%
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2238	%One might want to prove this by something a simple statement like:
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2239	%
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2240	%For this to hold we want the $\textit{distinct}$ function to pick up
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2241	%the elements before and after derivatives correctly:
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2242	%$r \in rset \equiv (rder x r) \in (rder x rset)$.
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2243	%which essentially requires that the function $\backslash$ is an injective mapping.
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2244	%
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2245	%Unfortunately the function $\backslash c$ is not an injective mapping.
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2246	%
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2247	%\subsection{function $\backslash c$ is not injective (1-to-1)}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2248	%\begin{center}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2249	% The derivative $w.r.t$ character $c$ is not one-to-one.
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2250	% Formally,
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2251	% $\exists r_1 \;r_2. r_1 \neq r_2 \mathit{and} r_1 \backslash c = r_2 \backslash c$
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2252	%\end{center}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2253	%This property is trivially true for the
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2254	%character regex example:
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2255	%\begin{center}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2256	% $r_1 = e; \; r_2 = d;\; r_1 \backslash c = \ZERO = r_2 \backslash c$
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2257	%\end{center}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2258	%But apart from the cases where the derivative
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2259	%output is $\ZERO$, are there non-trivial results
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2260	%of derivatives which contain strings?
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2261	%The answer is yes.
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2262	%For example,
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2263	%\begin{center}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2264	% Let $r_1 = a^b\;\quad r_2 = (a\cdot a^)\cdot b + b$.\\
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2265	% where $a$ is not nullable.\\
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2266	% $r_1 \backslash c = ((a \backslash c)\cdot a^*)\cdot c + b \backslash c$\\
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2267	% $r_2 \backslash c = ((a \backslash c)\cdot a^*)\cdot c + b \backslash c$
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2268	%\end{center}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2269	%We start with two syntactically different regular expressions,
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2270	%and end up with the same derivative result.
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2271	%This is not surprising as we have such
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2272	%equality as below in the style of Arden's lemma:\\
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2273	%\begin{center}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2274	% $L(A^B) = L(A\cdot A^ \cdot B + B)$
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2275	%\end{center}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2276	\section{Bounding Closed Forms}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2277
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2278	In this section, we describe how we formalised the bound
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2279	on closed forms.
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	2280	We first show that in general the number of regular expressions up to a certain
bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	2281	size is finite.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2282	Then we prove that functions such as $\rflts$
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2283	will not cause the size of r-regular expressions to grow.
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2284	Putting this together with the finiteness
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2285	of the set of distinct regular expressions
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	2286	up to a specific size, we obtain a bound on
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2287	the closed forms.
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2288
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2289	\subsection{Finiteness of Distinct Regular Expressions}
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	2290	We define the set of regular expressions whose size is no more than
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2291	a certain size $N$ as $\textit{sizeNregex} \; N$:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2292	\[
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2293	\textit{sizeNregex} \; N \dn \{r\; \mid \; \llbracket r \rrbracket_r \leq N \}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2294	\]
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2295	We have that $\textit{sizeNregex} \; N$ is always a finite set:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2296	\begin{lemma}\label{finiteSizeN}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2297	$\textit{finite} \; (\textit{sizeNregex} \; N)$ holds.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2298	\end{lemma}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2299	\begin{proof}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2300	By induction on $N$. We split the set $\textit{sizeNregex} \; (N + 1)$ into
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2301	subsets according to the outermost constructor:
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	2302	$\{\ZERO_r, \ONE_r, c\}$, $\{r^* \mid r \in \textit{sizeNregex} \; N\}$,
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	2303	and so on. Each of these subsets is finite.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2304	\end{proof}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2305	\noindent
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2306	From this we get a corollary:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2307	if $\rsize{r} \leq N$ for all $r \in rs$, then the output of
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2308	$\rdistinct{rs}{\varnothing}$ is a list of regular
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2309	expressions whose total size is bounded by a constant depending only on $N$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2310	\begin{corollary}\label{finiteSizeNCorollary}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2311	$\rsize{\rdistinct{rs}{\varnothing}} \leq c_N * N$ holds,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2312	where the constant $c_N$ is equal to $\textit{card} \; (\textit{sizeNregex} \;
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2313	N)$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2314	\end{corollary}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2315	\begin{proof}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2316	For all $r$ in
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2317	$\textit{set} \; (\rdistinct{rs}{\varnothing})$,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2318	it is always the case that $\rsize{r} \leq N$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2319	In addition, the list length is bounded by
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2320	$c_N$, yielding the desired bound.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2321	\end{proof}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2322	\noindent
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2323	This fact will be handy in estimating the closed form sizes.
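
The counting argument behind Corollary \ref{finiteSizeNCorollary} can be spelled out as follows.
This is a sketch: it assumes that the size of a list is the sum of the sizes of its elements,
and it writes $|\_|$ for list length (a notation introduced only for this illustration):
\begin{center}
$\rsize{\rdistinct{rs}{\varnothing}}
\;\leq\; |\rdistinct{rs}{\varnothing}| * N
\;\leq\; c_N * N$
\end{center}
The first step uses $\rsize{r} \leq N$ for every element $r$. The second step holds because the elements of
$\rdistinct{rs}{\varnothing}$ are pairwise distinct members of $\textit{sizeNregex} \; N$,
so there can be at most $c_N$ of them.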
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2324	%We have proven that the size of the
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2325	%output of $\textit{rdistinct} \; rs' \; \varnothing$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2326	%is bounded by a constant $N * c_N$ depending only on $N$,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2327	%provided that each of $rs'$'s element
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2328	%is bounded by $N$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2329
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	2330	\subsection{$\textit{rsimp}$ Does Not Increase the Size}
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2331	Although it seems evident, we need a series
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2332	of non-trivial lemmas to establish that functions such as $\rflts$
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2333	do not cause the regular expressions to grow.
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2334	\begin{lemma}\label{rsimpMonoLemmas}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2335	\mbox{}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2336	\begin{itemize}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2337	\item
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2338	\[
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2339	\llbracket \rsimpalts \; rs \rrbracket_r \leq
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2340	\llbracket \sum \; rs \rrbracket_r
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2341	\]
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2342	\item
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2343	\[
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2344	\llbracket \rsimpseq \; r_1 \; r_2 \rrbracket_r \leq
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2345	\llbracket r_1 \cdot r_2 \rrbracket_r
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2346	\]
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2347	\item
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2348	\[
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2349	\llbracket \rflts \; rs \rrbracket_r \leq
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2350	\llbracket rs \rrbracket_r
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2351	\]
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2352	\item
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2353	\[
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2354	\llbracket \rDistinct \; rs \; ss \rrbracket_r \leq
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2355	\llbracket rs \rrbracket_r
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2356	\]
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2357	\item
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2358	If all elements $a$ in the set $as$ satisfy the property
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2359	that $\llbracket \textit{rsimp} \; a \rrbracket_r \leq
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2360	\llbracket a \rrbracket_r$, then we have
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2361	\[
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2362	\llbracket \; \rsimpalts \; (\textit{rdistinct} \;
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2363	(\textit{rflts} \; (\textit{map}\;\textit{rsimp} \; as)) \; \{\})
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2364	\rrbracket_r \leq
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2365	\llbracket \; \sum \; (\rDistinct \; (\rflts \;(\map \;
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2366	\textit{rsimp} \; as))\; \{ \} ) \rrbracket_r
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2367	\]
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2368	\end{itemize}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2369	\end{lemma}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2370	\begin{proof}
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	2371	Points 1, 3, and 4 can be proven by an induction on $rs$.
613 b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2372	Point 2 is by case analysis on $r_1$ and $r_2$.
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2373	The last part is a corollary of the previous ones.
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2374	\end{proof}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2375	\noindent
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2376	With the lemmas for each inductive case in place, we are ready to get
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2377	the non-increasing property as a corollary:
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2378	\begin{corollary}\label{rsimpMono}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2379	$\llbracket \textit{rsimp} \; r \rrbracket_r \leq \llbracket r \rrbracket_r$
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2380	\end{corollary}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2381	\begin{proof}
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2382	By induction on $r$, using lemma \ref{rsimpMonoLemmas}.
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2383	\end{proof}
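\noindent
As a small illustration (a sketch under assumptions not restated in this section:
the size function counts one for each constructor, $\rflts$ drops $\RZERO$ and
splices nested alternatives into the list, and $\rsimpalts \; [r] = r$), consider
$r = \sum [\sum [\RZERO, \RCHAR{c}], \RCHAR{c}]$:
\begin{center}
\begin{tabular}{lcl}
$\textit{rsimp} \; (\sum [\RZERO, \RCHAR{c}])$ & $=$ &
$\rsimpalts \; (\rdistinct{(\rflts \; [\RZERO, \RCHAR{c}])}{\varnothing}) = \RCHAR{c}$\\
$\textit{rsimp} \; r$ & $=$ &
$\rsimpalts \; (\rdistinct{(\rflts \; [\RCHAR{c}, \RCHAR{c}])}{\varnothing}) = \RCHAR{c}$\\
$\llbracket \textit{rsimp} \; r \rrbracket_r$ & $=$ & $1 \leq 5 = \llbracket r \rrbracket_r$\\
\end{tabular}
\end{center}
\noindent
Under these assumptions the simplified expression is strictly smaller here; the corollary only claims that it is never larger.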
b0f0d884a547 chap5 Chengsong parents: 611 diff changeset	2384
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2385	\subsection{Estimating the Closed Forms' Sizes}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2386	We recap the closed forms we obtained
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	2387	earlier:
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2388	\begin{itemize}
671a83abccf3 haha Chengsong parents: 557 diff changeset	2389	\item
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2390	$\rderssimp{(\sum rs)}{s} \sequal
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2391	\sum \; (\map \; (\rderssimp{\_}{s}) \; rs)$
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2392	\item
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2393	$\rderssimp{(r_1 \cdot r_2)}{s} \sequal \sum ((r_1 \backslash s) \cdot r_2
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2394	:: (\map \; (r_2 \backslash \_) \; (\vsuf{s}{r_1})))$
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2395	\item
671a83abccf3 haha Chengsong parents: 557 diff changeset	2396
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2397	$\rderssimp{r^*}{c::s} =
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2398	\rsimp{
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2399	(\sum (\map \; (\lambda s. (\rderssimp{r}{s})\cdot r^*) \;
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2400	(\starupdates \; s\; r \; [[c]])
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2401	)
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2402	)
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2403	}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2404	$
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2405	\end{itemize}
671a83abccf3 haha Chengsong parents: 557 diff changeset	2406	\noindent
671a83abccf3 haha Chengsong parents: 557 diff changeset	2407	The closed forms on the right-hand side
671a83abccf3 haha Chengsong parents: 557 diff changeset	2408	are all of the same shape: $\rsimp{ (\sum rs)} $.
671a83abccf3 haha Chengsong parents: 557 diff changeset	2409	Such a regular expression is bounded by the size of $\sum rs'$,
671a83abccf3 haha Chengsong parents: 557 diff changeset	2410	where every element in $rs'$ is distinct, and each element
671a83abccf3 haha Chengsong parents: 557 diff changeset	2411	can be described by some inductive sub-structures
671a83abccf3 haha Chengsong parents: 557 diff changeset	2412	(for example when $r = r_1 \cdot r_2$ then $rs'$
671a83abccf3 haha Chengsong parents: 557 diff changeset	2413	will be solely comprised of $r_1 \backslash s'$
671a83abccf3 haha Chengsong parents: 557 diff changeset	2414	and $r_2 \backslash s''$, $s'$ and $s''$ being
671a83abccf3 haha Chengsong parents: 557 diff changeset	2415	sub-strings of $s$),
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	2416	each of which has a size upper bound
bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	2417	according to the inductive hypothesis, which controls $r \backslash s$.
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	2418
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2419	We elaborate the above reasoning by a series of lemmas
671a83abccf3 haha Chengsong parents: 557 diff changeset	2420	below, where straightforward proofs are omitted.
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	2421	%We want to apply it to our setting $\rsize{\rsimp{\sum rs}}$.
80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	2422	We show that $\textit{rdistinct}$ and $\rflts$
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2423	working together is at least as
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	2424	good as $\textit{rdistinct}$ alone, which can be written as
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2425	\begin{center}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2426	$\llbracket \rdistinct{(\rflts \; \textit{rs})}{\varnothing} \rrbracket_r
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2427	\leq
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2428	\llbracket \rdistinct{rs}{\varnothing} \rrbracket_r $.
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2429	\end{center}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2430	We need this so that we know the outcome of our real
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2431	simplification is better than or equal to a rough estimate,
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2432	and therefore can be bounded by that estimate.
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	2433	This is a bit harder to establish than proving that
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2434	$\textit{rflts}$ does not make a list larger (which can
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2435	be proven using routine induction):
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2436	\begin{center}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2437	$\llbracket \textit{rflts}\; rs \rrbracket_r \leq
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2438	\llbracket \textit{rs} \rrbracket_r$
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2439	\end{center}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2440	We cannot simply prove how each helper function
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2441	reduces the size and then put them together:
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2442	From
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2443	\begin{center}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2444	$\llbracket \textit{rflts}\; rs \rrbracket_r \leq
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2445	\llbracket \textit{rs} \rrbracket_r$
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2446	\end{center}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2447	and
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2448	\begin{center}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2449	$\llbracket \textit{rdistinct} \; rs \; \varnothing \rrbracket_r \leq
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2450	\llbracket rs \rrbracket_r$
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2451	\end{center}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2452	one cannot infer
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2453	\begin{center}
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2454	$\llbracket \rdistinct{(\rflts \; \textit{rs})}{\varnothing} \rrbracket_r
671a83abccf3 haha Chengsong parents: 557 diff changeset	2455	\leq
671a83abccf3 haha Chengsong parents: 557 diff changeset	2456	\llbracket \rdistinct{rs}{\varnothing} \rrbracket_r $.
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2457	\end{center}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2458	What we can infer is that
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2459	\begin{center}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2460	$\llbracket \rdistinct{(\rflts \; \textit{rs})}{\varnothing} \rrbracket_r
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2461	\leq
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2462	\llbracket rs \rrbracket_r$
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2463	\end{center}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2464	but this estimate is too rough and $\llbracket rs \rrbracket_r$ is unbounded.
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2465	The way we
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2466	get around this is by first proving a more general lemma
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2467	(so that the inductive case goes through):
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2468	\begin{lemma}\label{fltsSizeReductionAlts}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2469	If we have three accumulator sets:
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2470	$noalts\_set$, $alts\_set$ and $corr\_set$,
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2471	satisfying:
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2472	\begin{itemize}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2473	\item
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2474	$\forall r \in noalts\_set. \; \nexists xs.\; r = \sum xs$
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2475	\item
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2476	$\forall r \in alts\_set. \; \exists xs. \; r = \sum xs
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2477	\; \textit{and} \; set \; xs \subseteq corr\_set$
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2478	\end{itemize}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2479	then we have that
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2480	\begin{center}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2481	\begin{tabular}{lcl}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2482	$\llbracket (\textit{rdistinct} \; (\textit{rflts} \; as) \;
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2483	(noalts\_set \cup corr\_set)) \rrbracket_r$ & $\leq$ &\\
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2484	$\llbracket (\textit{rdistinct} \; as \; (noalts\_set \cup alts\_set \cup
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	2485	\{ \ZERO_r \} )) \rrbracket_r$ & & \\
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2486	\end{tabular}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2487	\end{center}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2488	holds.
532 cc54ce075db5 restructured Chengsong parents: diff changeset	2489	\end{lemma}
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2490	\noindent
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2491	We split the accumulator into two parts: the part
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2492	which contains alternative regular expressions ($alts\_set$), and
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2493	the part without any of them ($noalts\_set$).
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2494	This is because $\rflts$ opens up the alternatives in $as$,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2495	causing the accumulators on both sides of the inequality
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2496	to diverge slightly.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2497	If we want to compare the accumulators that are not
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2498	perfectly in sync, we need to consider the alternatives and non-alternatives
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2499	separately.
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2500	The set $corr\_set$ is the corresponding set
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2501	of $alts\_set$ with all elements under the alternative constructor
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2502	spilled out.
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2503	\begin{proof}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2504	By induction on the list $as$. We make use of lemma \ref{rdistinctConcat}.
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2505	\end{proof}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2506	By setting all three sets to the empty set, one gets the desired size estimate:
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2507	\begin{corollary}\label{interactionFltsDB}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2508	$\llbracket \rdistinct{(\rflts \; \textit{rs})}{\varnothing} \rrbracket_r
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2509	\leq
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2510	\llbracket \rdistinct{rs}{\varnothing} \rrbracket_r $.
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2511	\end{corollary}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2512	\begin{proof}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2513	By using the lemma \ref{fltsSizeReductionAlts}.
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2514	\end{proof}
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2515	\noindent
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2516	The intuition for why this is true
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2517	is that if we remove duplicates from the $\textit{LHS}$, at least the same number of
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2518	duplicates will be removed from the list $\textit{rs}$ in the $\textit{RHS}$.
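\noindent
For a concrete instance (a sketch, assuming that the size function counts one for
each constructor, that the size of a list is the sum of the sizes of its elements,
and that $\rflts$ splices nested alternatives into the list), take
$rs = [\sum [\RCHAR{c}, \RCHAR{d}], \; \sum [\RCHAR{c}, \RCHAR{d}], \; \RCHAR{c}]$:
\begin{center}
\begin{tabular}{lcl}
$\rflts \; rs$ & $=$ & $[\RCHAR{c}, \RCHAR{d}, \RCHAR{c}, \RCHAR{d}, \RCHAR{c}]$\\
$\rdistinct{(\rflts \; rs)}{\varnothing}$ & $=$ & $[\RCHAR{c}, \RCHAR{d}]$, of size $1 + 1 = 2$\\
$\rdistinct{rs}{\varnothing}$ & $=$ & $[\sum [\RCHAR{c}, \RCHAR{d}], \; \RCHAR{c}]$, of size $3 + 1 = 4$\\
\end{tabular}
\end{center}
\noindent
so that $2 \leq 4$, in line with corollary \ref{interactionFltsDB}.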
671a83abccf3 haha Chengsong parents: 557 diff changeset	2519
671a83abccf3 haha Chengsong parents: 557 diff changeset	2520	Now this $\rsimp{\sum rs}$ can be estimated using $\rdistinct{rs}{\varnothing}$:
671a83abccf3 haha Chengsong parents: 557 diff changeset	2521	\begin{lemma}\label{altsSimpControl}
671a83abccf3 haha Chengsong parents: 557 diff changeset	2522	$\rsize{\rsimp{\sum rs}} \leq \rsize{\rdistinct{rs}{\varnothing}}+ 1$
532 cc54ce075db5 restructured Chengsong parents: diff changeset	2523	\end{lemma}
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2524	\begin{proof}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2525	By using corollary \ref{interactionFltsDB}.
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2526	\end{proof}
671a83abccf3 haha Chengsong parents: 557 diff changeset	2527	\noindent
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	2528	This is a key lemma in establishing the bounds of all the
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2529	closed forms.
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2530	With this we are now ready to control the sizes of
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2531	$(r_1 \cdot r_2 )\backslash s$ and $r^* \backslash s$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2532	\begin{theorem}\label{rBound}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2533	For any regex $r$, $\exists N_r. \forall s. \; \rsize{\rderssimp{r}{s}} \leq N_r$
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2534	\end{theorem}
671a83abccf3 haha Chengsong parents: 557 diff changeset	2535	\noindent
671a83abccf3 haha Chengsong parents: 557 diff changeset	2536	\begin{proof}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2537	We prove this by induction on $r$. The base cases for $\RZERO$,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2538	$\RONE $ and $\RCHAR{c}$ are straightforward.
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2539	In the sequence $r_1 \cdot r_2$ case,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2540	the inductive hypotheses state
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2541	$\exists N_1. \forall s. \; \llbracket \rderssimp{r_1}{s} \rrbracket_r \leq N_1$ and
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2542	$\exists N_2. \forall s. \; \llbracket \rderssimp{r_2}{s} \rrbracket_r \leq N_2$.
562 57e33978e55d more Chengsong parents: 561 diff changeset	2543
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2544	When the string $s$ is not empty, we can reason as follows
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2545	%
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2546	\begin{center}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2547	\begin{tabular}{lcll}
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2548	& & $ \llbracket \rderssimp{r_1\cdot r_2 }{s} \rrbracket_r $\\
620 ae6010c14e49 chap6 almost done Chengsong parents: 618 diff changeset	2549	& $ = $ & $\llbracket \rsimp{(\sum(r_1 \backslash_{rsimps} s \cdot r_2 \; \; :: \; \;
ae6010c14e49 chap6 almost done Chengsong parents: 618 diff changeset	2550	\map \; (r_2\backslash_{rsimps} \_)\; (\vsuf{s}{r_1})))} \rrbracket_r $ & (1) \\
ae6010c14e49 chap6 almost done Chengsong parents: 618 diff changeset	2551	& $\leq$ & $\llbracket \rdistinct{(r_1 \backslash_{rsimps} s \cdot r_2 \; \; :: \; \;
ae6010c14e49 chap6 almost done Chengsong parents: 618 diff changeset	2552	\map \; (r_2\backslash_{rsimps} \_)\; (\vsuf{s}{r_1}))}{\varnothing} \rrbracket_r + 1$ & (2) \\
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2553	& $\leq$ & $2 + N_1 + \rsize{r_2} + (N_2 * (card\;(\sizeNregex \; N_2)))$ & (3)\\
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2554	\end{tabular}
671a83abccf3 haha Chengsong parents: 557 diff changeset	2555	\end{center}
561 486fb297ac7c more done Chengsong parents: 559 diff changeset	2556	\noindent
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2557	(1) is by theorem \ref{seqClosedForm}.
561 486fb297ac7c more done Chengsong parents: 559 diff changeset	2558	(2) is by lemma \ref{altsSimpControl}.
486fb297ac7c more done Chengsong parents: 559 diff changeset	2559	(3) is by corollary \ref{finiteSizeNCorollary} and the inductive hypotheses.
562 57e33978e55d more Chengsong parents: 561 diff changeset	2560
57e33978e55d more Chengsong parents: 561 diff changeset	2561
57e33978e55d more Chengsong parents: 561 diff changeset	2562	Combining the cases when $s = []$ and $s \neq []$, we get (4):
57e33978e55d more Chengsong parents: 561 diff changeset	2563	\begin{center}
57e33978e55d more Chengsong parents: 561 diff changeset	2564	\begin{tabular}{lcll}
57e33978e55d more Chengsong parents: 561 diff changeset	2565	$\rsize{\rderssimp{(r_1 \cdot r_2)}{s}}$ & $\leq$ &
57e33978e55d more Chengsong parents: 561 diff changeset	2566	$max \; (2 + N_1 +
57e33978e55d more Chengsong parents: 561 diff changeset	2567	\llbracket r_2 \rrbracket_r +
57e33978e55d more Chengsong parents: 561 diff changeset	2568	N_2 * (card\; (\sizeNregex \; N_2))) \; \rsize{r_1\cdot r_2}$ & (4)
57e33978e55d more Chengsong parents: 561 diff changeset	2569	\end{tabular}
57e33978e55d more Chengsong parents: 561 diff changeset	2570	\end{center}
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2571
562 57e33978e55d more Chengsong parents: 561 diff changeset	2572	We reason similarly for $\STAR$.
57e33978e55d more Chengsong parents: 561 diff changeset	2573	The inductive hypothesis is
57e33978e55d more Chengsong parents: 561 diff changeset	2574	$\exists N. \forall s. \; \llbracket \rderssimp{r}{s} \rrbracket_r \leq N$.
564 3cbcd7cda0a9 more Chengsong parents: 562 diff changeset	2575	Let $n_r = \llbracket r^* \rrbracket_r$.
562 57e33978e55d more Chengsong parents: 561 diff changeset	2576	When $s = c :: cs$ is not empty,
57e33978e55d more Chengsong parents: 561 diff changeset	2577	\begin{center}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2578	\begin{tabular}{lcll}
562 57e33978e55d more Chengsong parents: 561 diff changeset	2579	& & $ \llbracket \rderssimp{r^* }{c::cs} \rrbracket_r $\\
620 ae6010c14e49 chap6 almost done Chengsong parents: 618 diff changeset	2580	& $ = $ & $\llbracket \rsimp{(\sum (\map \; (\lambda s. (r \backslash_{rsimps} s) \cdot r^*) \; (\starupdates\;
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2581	cs \; r \; [[c]] )) )} \rrbracket_r $ & (5) \\
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2582	& $\leq$ & $\llbracket
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2583	\rdistinct{
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2584	(\map \;
620 ae6010c14e49 chap6 almost done Chengsong parents: 618 diff changeset	2585	(\lambda s. (r \backslash_{rsimps} s) \cdot r^*) \;
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2586	(\starupdates\; cs \; r \; [[c]] )
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2587	)}
562 57e33978e55d more Chengsong parents: 561 diff changeset	2588	{\varnothing} \rrbracket_r + 1$ & (6) \\
57e33978e55d more Chengsong parents: 561 diff changeset	2589	& $\leq$ & $1 + (\textit{card} (\sizeNregex \; (N + n_r)))
57e33978e55d more Chengsong parents: 561 diff changeset	2590	* (1 + (N + n_r)) $ & (7)\\
57e33978e55d more Chengsong parents: 561 diff changeset	2591	\end{tabular}
57e33978e55d more Chengsong parents: 561 diff changeset	2592	\end{center}
57e33978e55d more Chengsong parents: 561 diff changeset	2593	\noindent
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2594	(5) is by theorem \ref{starClosedForm}.
562 57e33978e55d more Chengsong parents: 561 diff changeset	2595	(6) is by lemma \ref{altsSimpControl}.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2596	(7) is by corollary \ref{finiteSizeNCorollary}.
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	2597	Combining with the case when $s = []$, one obtains
562 57e33978e55d more Chengsong parents: 561 diff changeset	2598	\begin{center}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2599	\begin{tabular}{lcll}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2600	$\rsize{\rderssimp{r^*}{s}}$ & $\leq$ & $max \; n_r \; (1 + (\textit{card} (\sizeNregex \; (N + n_r)))
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2601	* (1 + (N + n_r))) $ & (8)\\
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2602	\end{tabular}
562 57e33978e55d more Chengsong parents: 561 diff changeset	2603	\end{center}
57e33978e55d more Chengsong parents: 561 diff changeset	2604	\noindent
57e33978e55d more Chengsong parents: 561 diff changeset	2605
57e33978e55d more Chengsong parents: 561 diff changeset	2606	The alternative case is slightly less involved.
57e33978e55d more Chengsong parents: 561 diff changeset	2607	The inductive hypothesis
57e33978e55d more Chengsong parents: 561 diff changeset	2608	is equivalent to $\exists N. \forall s. \forall r \in (\map \; (\_ \backslash_{rsimps} s) \; rs). \rsize{r} \leq N$.
57e33978e55d more Chengsong parents: 561 diff changeset	2609	In the case when $s = c::cs$, we have
57e33978e55d more Chengsong parents: 561 diff changeset	2610	\begin{center}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2611	\begin{tabular}{lcll}
562 57e33978e55d more Chengsong parents: 561 diff changeset	2612	& & $ \llbracket \rderssimp{\sum rs }{c::cs} \rrbracket_r $\\
620 ae6010c14e49 chap6 almost done Chengsong parents: 618 diff changeset	2613	& $ = $ & $\llbracket \rsimp{(\sum (\map \; (\_ \backslash_{rsimps} s) \; rs) )} \rrbracket_r $ & (9) \\
ae6010c14e49 chap6 almost done Chengsong parents: 618 diff changeset	2614	& $\leq$ & $\llbracket (\sum (\map \; (\_ \backslash_{rsimps} s) \; rs) ) \rrbracket_r $ & (10) \\
562 57e33978e55d more Chengsong parents: 561 diff changeset	2615	& $\leq$ & $1 + N * (length \; rs) $ & (11)\\
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2616	\end{tabular}
562 57e33978e55d more Chengsong parents: 561 diff changeset	2617	\end{center}
57e33978e55d more Chengsong parents: 561 diff changeset	2618	\noindent
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2619	(9) is by theorem \ref{altsClosedForm}, (10) by corollary \ref{rsimpMono} and (11) by the inductive hypothesis.
562 57e33978e55d more Chengsong parents: 561 diff changeset	2620
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	2621	Combining with the case when $s = []$, we obtain
562 57e33978e55d more Chengsong parents: 561 diff changeset	2622	\begin{center}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2623	\begin{tabular}{lcll}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2624	$\rsize{\rderssimp{(\sum rs)}{s}}$ & $\leq$ & $max \; \rsize{\sum rs} \; (1+N*(length \; rs))$
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2625	& (12)\\
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	2626	\end{tabular}
562 57e33978e55d more Chengsong parents: 561 diff changeset	2627	\end{center}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2628	We have all the inductive cases proven.
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2629	\end{proof}
671a83abccf3 haha Chengsong parents: 557 diff changeset	2630
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2631	This leads to our main result on the size bound:
564 3cbcd7cda0a9 more Chengsong parents: 562 diff changeset	2632	\begin{corollary}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2633	For any annotated regular expression $a$, $\exists N_r. \forall s. \; \rsize{\bderssimp{a}{s}} \leq N_r$
564 3cbcd7cda0a9 more Chengsong parents: 562 diff changeset	2634	\end{corollary}
3cbcd7cda0a9 more Chengsong parents: 562 diff changeset	2635	\begin{proof}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	2636	By lemma \ref{sizeRelations} and theorem \ref{rBound}.
564 3cbcd7cda0a9 more Chengsong parents: 562 diff changeset	2637	\end{proof}
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2638	\noindent
671a83abccf3 haha Chengsong parents: 557 diff changeset	2639
609 61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2640
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2641
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2642
61139fdddae0 chap1 totally done Chengsong parents: 601 diff changeset	2643
558 671a83abccf3 haha Chengsong parents: 557 diff changeset	2644	%-----------------------------------
671a83abccf3 haha Chengsong parents: 557 diff changeset	2645	% SECTION 2
671a83abccf3 haha Chengsong parents: 557 diff changeset	2646	%-----------------------------------
671a83abccf3 haha Chengsong parents: 557 diff changeset	2647
625 b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2648	\section{Bounded Repetitions}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2649	We have promised in chapter \ref{Introduction}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2650	that our lexing algorithm can potentially be extended
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2651	to handle bounded repetitions
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2652	in natural and elegant ways.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2653	Now we fulfill our promise by adding support for
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2654	the ``exactly-$n$-times'' bounded regular expression $r^{\{n\}}$.
We add clauses to our derivative-based lexing algorithms (with simplifications)
introduced in chapter \ref{Bitcoded2}.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2657
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2658	\subsection{Augmented Definitions}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2659	There are a number of definitions that need to be augmented.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2660	The most notable one would be the POSIX rules for $r^{\{n\}}$:
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2661	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2662	\begin{mathpar}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2663	\inferrule{\forall v \in vs_1. \vdash v:r \land
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2664	\|v\| \neq []\\ \forall v \in vs_2. \vdash v:r \land \|v\| = []\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2665	\textit{length} \; (vs_1 @ vs_2) = n}{\textit{Stars} \;
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2666	(vs_1 @ vs_2) : r^{\{n\}} }
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2667	\end{mathpar}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2668	\end{center}
As Ausaf pointed out~\cite{Ausaf},
empty iterations sometimes have to be taken to obtain
a match with exactly $n$ repetitions,
hence the $vs_2$ part.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2673
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2674	Another important definition would be the size:
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2675	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2676	\begin{tabular}{lcl}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2677	$\llbracket r^{\{n\}} \rrbracket_r$ & $\dn$ &
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2678	$\llbracket r \rrbracket_r + n$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2679	\end{tabular}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2680	\end{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2681	\noindent
Arguably we should use $\log \; n$ for the size because
the number of digits needed to write the counter grows logarithmically in $n$.
For simplicity we choose to add the counter directly to the size.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2685
The derivative of a bounded regular expression
with respect to a character $c$ is given as
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2688	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2689	\begin{tabular}{lcl}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2690	$r^{\{n\}} \backslash_r c$ & $\dn$ &
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2691	$r\backslash_r c \cdot r^{\{n-1\}} \;\; \textit{if} \; n \geq 1$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2692	& & $\RZERO \;\quad \quad\quad \quad
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2693	\textit{otherwise}$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2694	\end{tabular}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2695	\end{center}
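As a small illustration (an instance of our own, with $a$ assumed to be a character),
unfolding this definition once gives
\begin{center}
\begin{tabular}{lcl}
$a^{\{2\}} \backslash_r a$ & $=$ & $(a \backslash_r a) \cdot a^{\{1\}}$\\
$a^{\{0\}} \backslash_r a$ & $=$ & $\RZERO$\\
\end{tabular}
\end{center}
\noindent
so a derivative step either decreases the counter of the tail by one or,
once the counter is $0$, yields $\RZERO$.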
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2696	\noindent
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2697	For brevity, we sometimes use NTIMES to refer to bounded
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2698	regular expressions.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2699	The $\mkeps$ function clause for NTIMES would be
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2700	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2701	\begin{tabular}{lcl}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2702	$\mkeps \; r^{\{n\}} $ & $\dn$ & $\Stars \;
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2703	(\textit{replicate} \; n\; (\mkeps \; r))$
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2704	\end{tabular}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2705	\end{center}
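For instance (an illustrative instance of ours, assuming $r$ is nullable so that
$\mkeps \; r$ is defined), taking $n = 3$ and unfolding $\textit{replicate}$ gives
\begin{center}
$\mkeps \; r^{\{3\}} = \Stars \; [\mkeps \; r, \; \mkeps \; r, \; \mkeps \; r]$
\end{center}
\noindent
that is, a list of three empty iterations, which corresponds to the $vs_2$ part
of the POSIX rule with $vs_1 = []$.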
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2706	\noindent
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2707	The injection looks like
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2708	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2709	\begin{tabular}{lcl}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2710	$\inj \; r^{\{n\}} \; c\; (\Seq \;v \; (\Stars \; vs)) $ &
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2711	$\dn$ & $\Stars \;
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2712	((\inj \; r \;c \;v ) :: vs)$
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2713	\end{tabular}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2714	\end{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2715	\noindent
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2716
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2717
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2718	\subsection{Proofs for the Augmented Lexing Algorithm}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2719	We need to maintain two proofs with the additional $r^{\{n\}}$
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2720	construct: the
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2721	correctness proof in chapter \ref{Bitcoded2},
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2722	and the finiteness proof in chapter \ref{Finite}.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2723
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2724	\subsubsection{Correctness Proof Augmentation}
The correctness of $\textit{lexer}$ and $\textit{blexer}$ with bounded repetitions
has been proven by Ausaf and Urban~\cite{AusafDyckhoffUrban2016}.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2727	As they have commented, once the definitions are in place,
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2728	the proofs given for the basic regular expressions will extend to
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2729	bounded regular expressions, and there are no ``surprises''.
We confirm this point: the correctness theorem also
extends without surprise to $\blexersimp$.
The rewrite rules such as $\rightsquigarrow$, $\stackrel{s}{\rightsquigarrow}$ and so on
do not need to be changed,
and only a few lemmas such as lemma \ref{fltsPreserves} need
one more line for the $r^{\{n\}}$ inductive case,
which can be solved by the Sledgehammer tool.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2737
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2738
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2739	\subsubsection{Finiteness Proof Augmentation}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2740	The bounded repetitions are
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2741	very similar to stars, and therefore the treatment
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2742	is similar, with minor changes to handle some slight complications
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2743	when the counter reaches 0.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2744	The exponential growth is similar:
\begin{center}
\begin{tabular}{ll}
$r^{\{n\}} $ & $\longrightarrow_{\backslash c}$\\
$(r\backslash c) \cdot
r^{\{n - 1\}}$ & $\longrightarrow_{\backslash c'}$\\
\\
$r \backslash cc' \cdot r^{\{n - 1\}} +
r \backslash c' \cdot r^{\{n - 2\}}$ &
$\longrightarrow_{\backslash c''}$\\
\\
$(r \backslash cc'c'' \cdot r^{\{n-1\}} +
r \backslash c''\cdot r^{\{n-2\}}) +
(r \backslash c'c'' \cdot r^{\{n-2\}} +
r \backslash c'' \cdot r^{\{n-3\}})$ &
$\longrightarrow_{\backslash c'''}$ \\
\\
$\ldots$\\
\end{tabular}
\end{center}
Again, we assume that $r\backslash c$, $r \backslash cc'$ and so on
are all nullable, and that $n$ is large enough for the counters shown
to be non-negative.
The flattened list of terms for $r^{\{n\}} \backslash_{rs} s$
\begin{center}
$[r \backslash cc'c'' \cdot r^{\{n-1\}},\;
r \backslash c''\cdot r^{\{n-2\}}, \;
r \backslash c'c'' \cdot r^{\{n-2\}}, \;
r \backslash c'' \cdot r^{\{n-3\}},\; \ldots ]$
\end{center}
that comes from
\begin{center}
$(r \backslash cc'c'' \cdot r^{\{n-1\}} +
r \backslash c''\cdot r^{\{n-2\}}) +
(r \backslash c'c'' \cdot r^{\{n-2\}} +
r \backslash c'' \cdot r^{\{n-3\}})+ \ldots$
\end{center}
is made of sequences with different tails, where the counters
might differ.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2782	The observation for maintaining the bound is that
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2783	these counters never exceed $n$, the original
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2784	counter. With the number of counters staying finite,
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2785	$\rDistinct$ will deduplicate and keep the list finite.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2786	We introduce this idea as a lemma once we describe all
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2787	the necessary helper functions.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2788
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2789	Similar to the star case, we want
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2790	\begin{center}
$\rderssimp{r^{\{n\}}}{s} = \rsimp{\sum rs}$,
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2792	\end{center}
where $rs$
has the form
$\map \; f \; Ss$, with $f$ a function and
$Ss$ a list of objects to act on.
For star, the objects are strings:
the list of strings $Ss$
is generated by the functions
$\starupdate$ and $\starupdates$,
and the function that takes a string and returns a regular expression
is the anonymous function $
(\lambda s'. \; r\backslash s' \cdot r^{*})$.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2804	In the NTIMES setting,
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2805	the $\starupdate$ and $\starupdates$ functions are replaced by
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2806	$\textit{nupdate}$ and $\textit{nupdates}$:
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2807	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2808	\begin{tabular}{lcl}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2809	$\nupdate \; c \; r \; [] $ & $\dn$ & $[]$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2810	$\nupdate \; c \; r \;
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2811	(\Some \; (s, \; n + 1) \; :: \; Ss)$ & $\dn$ & %\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2812	$\textit{if} \;
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2813	(\rnullable \; (r \backslash_{rs} s))$ \\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2814	& & $\;\;\textit{then}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2815	\;\; \Some \; (s @ [c], n + 1) :: \Some \; ([c], n) :: (
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2816	\nupdate \; c \; r \; Ss)$ \\
& & $\textit{else} \;\; \Some \; (s @ [c], n+1) :: (
\nupdate \; c \; r \; Ss)$\\
$\nupdate \; c \; r \;
(\Some \; (s, \; 0) \; :: \; Ss)$ & $\dn$ &
$\textit{if} \;
(\rnullable \; (r \backslash_{rs} s))$ \\
& & $\;\;\textit{then}
\;\; \Some \; (s @ [c], 0) :: \None :: (
\nupdate \; c \; r \; Ss)$ \\
& & $\textit{else} \;\; \Some \; (s @ [c], 0) :: (
\nupdate \; c \; r \; Ss)$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2819	$\nupdate \; c \; r \; (\textit{None} :: Ss)$ & $\dn$ &
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2820	$(\None :: \nupdate \; c \; r \; Ss)$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2821	& & \\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2822	%\end{tabular}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2823	%\end{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2824	%\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2825	%\begin{tabular}{lcl}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2826	$\nupdates \; [] \; r \; Ss$ & $\dn$ & $Ss$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2827	$\nupdates \; (c :: cs) \; r \; Ss$ & $\dn$ & $\nupdates \; cs \; r \; (
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2828	\nupdate \; c \; r \; Ss)$
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2829	\end{tabular}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2830	\end{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2831	\noindent
which take into account the case where the counter $n$ of a subterm
\begin{center}
$r \backslash_{rs} s \cdot r^{\{n\}}$
\end{center}
is 0, so that the subterm (when $r \backslash_{rs} s$ is nullable) expands to
\begin{center}
$r \backslash_{rs} (s@[c]) \cdot r^{\{n\}} \;+
\; \ZERO$
\end{center}
after taking a derivative.
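
To illustrate the clause for a non-zero counter, consider the following unfolding
(a sketch of ours, assuming that $r \backslash_{rs} [c]$, $r \backslash_{rs} [c, c']$
and $r \backslash_{rs} [c']$ are all nullable):
\begin{center}
\begin{tabular}{lcl}
$\nupdate \; c' \; r \; [\Some \; ([c], n + 2)]$ & $=$ &
$[\Some \; ([c, c'], n + 2), \; \Some \; ([c'], n + 1)]$\\
$\nupdates \; [c', c''] \; r \; [\Some \; ([c], n + 2)]$ & $=$ &
$[\Some \; ([c, c', c''], n + 2), \; \Some \; ([c''], n + 1),$\\
& & $\;\Some \; ([c', c''], n + 1), \; \Some \; ([c''], n)]$\\
\end{tabular}
\end{center}
\noindent
Every counter in the resulting lists is at most $n + 2$, the counter we started with.
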
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2843	The object now has type
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2844	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2845	$\textit{option} \;(\textit{string}, \textit{nat})$
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2846	\end{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2847	and therefore the function for converting such an option into
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2848	a regular expression term is called $\opterm$:
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2849
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2850	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2851	\begin{tabular}{lcl}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2852	$\opterm \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2853	& & $\;\;\Some \; (s, n) \Rightarrow
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2854	(r\backslash_{rs} s)\cdot r^{\{n\}}$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2855	& & $\;\;\None \Rightarrow
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2856	\ZERO$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2857	\end{tabular}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2858	\end{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2859	\noindent
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2860	Put together, the list $\map \; f \; Ss$ is instantiated as
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2861	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2862	$\map \; (\opterm \; r) \; (\nupdates \; s \; r \;
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2863	[\Some \; ([c], n)])$.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2864	\end{center}
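As a concrete instance (the three-element list below is a hypothetical input of ours,
chosen only to exercise both cases of $\opterm$), mapping $\opterm \; r$ over a list of options gives
\begin{center}
$\map \; (\opterm \; r) \; [\Some \; (s_1, 2), \; \None, \; \Some \; (s_2, 0)] \;=\;
[(r\backslash_{rs} s_1)\cdot r^{\{2\}}, \; \ZERO, \; (r\backslash_{rs} s_2)\cdot r^{\{0\}}]$
\end{center}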
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2865	For the closed form to be bounded, we would like
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2866	simplification to be applied to each term in the list.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2867	Therefore we introduce some variants of $\opterm$,
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2868	which help conveniently express the rewriting steps
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2869	needed in the closed form proof.
We have $\optermOsimp$, $\optermosimp$ and $\optermsimp$,
which differ only in where simplification is applied; each is convenient at a different rewriting step of the proof:
625 b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2872	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2873	\begin{tabular}{lcl}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2874	$\optermOsimp \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2875	& & $\;\;\Some \; (s, n) \Rightarrow
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2876	\textit{rsimp} \; ((r\backslash_{rs} s)\cdot r^{\{n\}})$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2877	& & $\;\;\None \Rightarrow
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2878	\ZERO$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2879	\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2880	$\optermosimp \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2881	& & $\;\;\Some \; (s, n) \Rightarrow
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2882	(\textit{rsimp} \; (r\backslash_{rs} s))
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2883	\cdot r^{\{n\}}$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2884	& & $\;\;\None \Rightarrow
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2885	\ZERO$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2886	\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2887	$\optermsimp \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2888	& & $\;\;\Some \; (s, n) \Rightarrow
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2889	(r\backslash_{rsimps} s)\cdot r^{\{n\}}$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2890	& & $\;\;\None \Rightarrow
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2891	\ZERO$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2892	\end{tabular}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2893	\end{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2894
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2895
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2896	For a list of
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2897	$\textit{option} \;(\textit{string}, \textit{nat})$ elements,
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2898	we define the highest power for it recursively:
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2899	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2900	\begin{tabular}{lcl}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2901	$\hpa \; [] \; n $ & $\dn$ & $n$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2902	$\hpa \; (\None :: os) \; n $ & $\dn$ & $\hpa \; os \; n$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2903	$\hpa \; (\Some \; (s, n) :: os) \; m$ & $\dn$ &
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2904	$\hpa \;os \; (\textit{max} \; n\; m)$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2905	\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2906	$\hpower \; rs $ & $\dn$ & $\hpa \; rs \; 0$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2907	\end{tabular}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2908	\end{center}
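\noindent
As a small illustration, unfolding the definition step by step on a hypothetical list
(the strings $s_1$, $s_2$ and the counters $2$ and $5$ are chosen only for this example)
gives:
\begin{center}
\begin{tabular}{lcl}
$\hpower \; [\Some \; (s_1, 2), \None, \Some \; (s_2, 5)]$ & $=$ &
$\hpa \; [\Some \; (s_1, 2), \None, \Some \; (s_2, 5)] \; 0$\\
& $=$ & $\hpa \; [\None, \Some \; (s_2, 5)] \; (\textit{max} \; 2 \; 0)$\\
& $=$ & $\hpa \; [\Some \; (s_2, 5)] \; 2$\\
& $=$ & $\hpa \; [] \; (\textit{max} \; 5 \; 2)$\\
& $=$ & $5$\\
\end{tabular}
\end{center}
\noindent
The $\None$ entry is skipped, and the accumulator only ever keeps the largest counter seen so far.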
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2909
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2910	Now the intuition that an NTIMES regular expression's power
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2911	does not increase can be easily expressed as
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2912	\begin{lemma}\label{nupdatesMono2}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2913	$\hpower \; (\nupdates \;s \; r \; [\Some \; ([c], n)]) \leq n$
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2914	\end{lemma}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2915	\begin{proof}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2916	Note that the power is non-increasing after a $\nupdate$ application:
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2917	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2918	$\hpa \;\; (\nupdate \; c \; r \; Ss)\;\; m \leq
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2919	\hpa\; \; Ss \; m$.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2920	\end{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2921	This is also the case for $\nupdates$:
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2922	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2923	$\hpa \;\; (\nupdates \; s \; r \; Ss)\;\; m \leq
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2924	\hpa\; \; Ss \; m$.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2925	\end{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2926	Therefore we have that
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2927	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2928	$\hpower \;\; (\nupdates \; s \; r \; Ss) \leq
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2929	\hpower \;\; Ss$
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2930	\end{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2931	from which the lemma follows by taking $Ss = [\Some \; ([c], n)]$, since $\hpower \; [\Some \; ([c], n)] = n$.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2932
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2933	\end{proof}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2934
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2935
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2936	We also define the inductive rules for
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2937	the shape of derivatives of the NTIMES regular expressions:\\[-3em]
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2938	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2939	\begin{mathpar}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2940	\inferrule{\mbox{}}{\cbn \;\ZERO}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2941
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2942	\inferrule{\mbox{}}{\cbn \; \; r_a \cdot (r^{\{n\}})}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2943
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2944	\inferrule{\cbn \; r_1 \;\; \; \cbn \; r_2}{\cbn \; r_1 + r_2}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2945
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2946	\inferrule{\cbn \; r}{\cbn \; r + \ZERO}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2947	\end{mathpar}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2948	\end{center}
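\noindent
As an illustrative sketch, for hypothetical regular expressions $r_a$ and $r_b$
(names introduced only for this example), the rules above derive
\begin{center}
$\cbn \; ((r_a \cdot r^{\{2\}} + \ZERO) + r_b \cdot r^{\{1\}})$
\end{center}
as follows: the second rule gives $\cbn \; (r_a \cdot r^{\{2\}})$ and
$\cbn \; (r_b \cdot r^{\{1\}})$, the fourth rule then gives
$\cbn \; (r_a \cdot r^{\{2\}} + \ZERO)$, and the third rule combines the two summands.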
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2949	\noindent
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2950	A derivative of NTIMES fits into the shape described by $\cbn$:
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2951	\begin{lemma}\label{ntimesDersCbn}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2952	$\cbn \; ((r' \cdot r^{\{n\}}) \backslash_{rs} s)$ holds.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2953	\end{lemma}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2954	\begin{proof}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2955	By a reverse induction on $s$.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2956	For the inductive case, note that if $\cbn \; r$ holds,
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2957	then $\cbn \; (r\backslash_r c)$ holds.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2958	\end{proof}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2959	\noindent
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	2960	In addition, for $\cbn$-shaped regular expressions, one can flatten
625 b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2961	them:
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2962	\begin{lemma}\label{ntimesHfauPushin}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2963	If $\cbn \; r$ holds, then $\hflataux{r \backslash_r c} =
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2964	\textit{concat} \; (\map \; \hflataux{\map \; (\_\backslash_r c) \;
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2965	(\hflataux{r})})$
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2966	\end{lemma}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2967	\begin{proof}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2968	By an induction on the inductive cases of $\cbn$.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2969	\end{proof}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2970	\noindent
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2971	This time we do not need to define the flattening functions for NTIMES only,
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2972	because $\hflat{\_}$ and $\hflataux{\_}$ work on NTIMES already.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2973	\begin{lemma}\label{ntimesHfauInduct}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2974	$\hflataux{( (r\backslash_r c) \cdot r^{\{n\}}) \backslash_{rsimps} s} =
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2975	\map \; (\opterm \; r) \; (\nupdates \; s \; r \; [\Some \; ([c], n)])$
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2976	\end{lemma}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2977	\begin{proof}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2978	By a reverse induction on $s$.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2979	The lemmas \ref{ntimesHfauPushin} and \ref{ntimesDersCbn} are used.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2980	\end{proof}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2981	\noindent
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2982	We have a recursive property for NTIMES with $\nupdate$
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2983	similar to that for STAR,
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2984	and one for $\nupdates $ as well:
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2985	\begin{lemma}\label{nupdateInduct1}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2986	\mbox{}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2987	\begin{itemize}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2988	\item
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2989	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2990	$\textit{concat} \; (\map \; (\hflataux{\_} \circ (
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2991	\opterm \; r)) \; Ss) = \map \; (\opterm \; r) \; (\nupdate \;
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2992	c \; r \; Ss)$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2993	\end{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2994	holds.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2995	\item
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2996	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2997	$\textit{concat} \; (\map \; \hflataux{\_}\;
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2998	\map \; (\_\backslash_r x) \;
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	2999	(\map \; (\opterm \; r) \; (\nupdates \; xs \; r \; Ss)))$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3000	$=$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3001	$\map \; (\opterm \; r) \; (\nupdates \;(xs@[x]) \; r\;Ss)$
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3002	\end{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3003	holds.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3004	\end{itemize}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3005	\end{lemma}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3006	\begin{proof}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3007	(i) is by an induction on $Ss$.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3008	(ii) is by an induction on $xs$.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3009	\end{proof}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3010	\noindent
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3011	The $\nString$ predicate is defined for conveniently
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3012	expressing that there are no empty strings in the
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3013	$\Some \;(s, n)$ elements generated by $\nupdate$:
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3014	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3015	\begin{tabular}{lcl}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3016	$\nString \; \None$ & $\dn$ & $ \textit{true}$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3017	$\nString \; (\Some \; ([], n))$ & $\dn$ & $ \textit{false}$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3018	$\nString \; (\Some \; (c::s, n))$ & $\dn$ & $ \textit{true}$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3019	\end{tabular}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3020	\end{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3021	\begin{lemma}\label{nupdatesNonempty}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3022	If for all elements $o \in \textit{set} \; Ss$,
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	3023	$\nString \; o$ holds, then we have that
625 b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3024	for all elements $o' \in \textit{set} \; (\nupdates \; s \; r \; Ss)$,
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3025	$\nString \; o'$ holds.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3026	\end{lemma}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3027	\begin{proof}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3028	By an induction on $s$, where $Ss$ is set to vary over all possible values.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3029	\end{proof}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3030
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3031	\noindent
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3032
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3033	\begin{lemma}\label{ntimesClosedFormsSteps}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3034	The following equalities and rewriting relations hold:\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3035	(i) $r^{\{n+1\}} \backslash_{rsimps} (c::s) =
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3036	\textit{rsimp} \; (\sum (\map \; (\opterm \;r \;\_) \; (\nupdates \;
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3037	s \; r \; [\Some \; ([c], n)])))$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3038	(ii)
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3039	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3040	$\sum (\map \; (\opterm \; r) \; (\nupdates \; s \; r \; [
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3041	\Some \; ([c], n)]))$ \\ $ \sequal$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3042	$\sum (\map \; (\textit{rsimp} \circ (\opterm \; r))\; (\nupdates \;
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3043	s\;r \; [\Some \; ([c], n)]))$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3044	\end{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3045	(iii)
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3046	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3047	$\sum \;(\map \; (\optermosimp \; r) \; (\nupdates \; s \; r\; [\Some \;
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3048	([c], n)]))$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3049	$\sequal$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3050	$\sum \;(\map \; (\optermsimp r) \; (\nupdates \; s \; r \; [\Some \;
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3051	([c], n)])) $\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3052	\end{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3053	(iv)
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3054	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3055	$\sum \;(\map \; (\optermosimp \; r) \; (\nupdates \; s \; r\; [\Some \;
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3056	([c], n)])) $ \\ $\sequal$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3057	$\sum \;(\map \; (\optermOsimp r) \; (\nupdates \; s \; r \; [\Some \;
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3058	([c], n)])) $\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3059	\end{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3060	(v)
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3061	\begin{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3062	$\sum \;(\map \; (\optermOsimp r) \; (\nupdates \; s \; r \; [\Some \;
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3063	([c], n)])) $ \\ $\sequal$\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3064	$\sum \; (\map \; (\textit{rsimp} \circ (\opterm \; r)) \;
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3065	(\nupdates \; s \; r \; [\Some \; ([c], n)]))$
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3066	\end{center}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3067	\end{lemma}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3068	\begin{proof}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3069	Routine.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3070	(iii) and (iv) make use of the fact that all the strings $s'$
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3071	inside $\Some \; (s', m)$ which are elements of the list
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3072	$\nupdates \; s\;r\;[\Some\; ([c], n)]$ are non-empty,
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3073	which is from lemma \ref{nupdatesNonempty}.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3074	Once the string in $o = \Some \; (s, n)$ is
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3075	nonempty, $\optermsimp \; r \;o$,
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3076	$\optermosimp \; r \; o$ and $\optermOsimp \; r \; o$ are guaranteed
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3077	to be equal.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3078	(v) uses lemma \ref{nupdateInduct1}.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3079	\end{proof}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3080	\noindent
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3081	Now we are ready to present the closed form for NTIMES:
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3082	\begin{theorem}\label{ntimesClosedForm}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3083	The derivative of $r^{\{n+1\}}$ can be described as an alternative
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3084	containing a list
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3085	of terms:\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3086	$r^{\{n+1\}} \backslash_{rsimps} (c::s) = \textit{rsimp} \; (
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3087	\sum (\map \; (\optermsimp \; r) \; (\nupdates \; s \; r \;
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3088	[\Some \; ([c], n)])))$
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3089	\end{theorem}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3090	\begin{proof}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3091	By the rewriting steps described in lemma \ref{ntimesClosedFormsSteps}.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3092	\end{proof}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3093	\noindent
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3094	The key observation for bounding this closed form
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3095	is that the counter on $r^{\{n\}}$ will
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3096	only decrement during derivatives:
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3097	\begin{lemma}\label{nupdatesNLeqN}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3098	For an element $o$ in $\textit{set} \; (\nupdates \; s \; r \;
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3099	[\Some \; ([c], n)])$, either $o = \None$, or $o = \Some
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3100	\; (s', m)$ for some string $s'$ and number $m \leq n$.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3101	\end{lemma}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3102	\noindent
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3103	The proof is routine and therefore omitted.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3104	This allows us to say what kind of terms
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3105	are in the set $\textit{set} \; (\map \; (\optermsimp \; r) \; (
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3106	\nupdates \; s \; r \; [\Some \; ([c], n)]))$:
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3107	each is either $\ZERO_r$ or a sequence whose tail is $r^{\{m\}}$
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3108	for some $m \leq n$:
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3109	\begin{lemma}\label{ntimesClosedFormListElemShape}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3110	For any element $r'$ in $\textit{set} \; (\map \; (\optermsimp \; r) \; (
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3111	\nupdates \; s \; r \; [\Some \; ([c], n)]))$,
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3112	we have that $r'$ is either $\ZERO$ or $r \backslash_{rsimps} s' \cdot
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3113	r^{\{m\}}$ for some string $s'$ and number $m \leq n$.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3114	\end{lemma}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3115	\begin{proof}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3116	Using lemma \ref{nupdatesNLeqN}.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3117	\end{proof}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3118
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3119	\begin{theorem}\label{ntimesClosedFormBounded}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3120	Assuming that for any string $s$, $\llbracket r \backslash_{rsimps} s
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3121	\rrbracket_r \leq N$ holds, then we have that\\
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3122	$\llbracket r^{\{n+1\}} \backslash_{rsimps} s \rrbracket_r \leq
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3123	\textit{max} \; (c_N+1)* (N + \llbracket r^{\{n\}} \rrbracket_r+1)$,
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3124	where $c_N = \textit{card} \; (\textit{sizeNregex} \; (
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3125	N + \llbracket r^{\{n\}} \rrbracket_r+1))$.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3126	\end{theorem}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3127	\begin{proof}
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	3128	We have that for all regular expressions $r'$ in
80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	3129	\begin{center}
80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	3130	$\textit{set} \; (\map \; (\optermsimp \; r) \; (
625 b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3131	\nupdates \; s \; r \; [\Some \; ([c], n)]))$,
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	3132	\end{center}
625 b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3133	$r'$'s size is less than or equal to $N + \llbracket r^{\{n\}}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3134	\rrbracket_r + 1$
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3135	because $r'$ can only be either a $\ZERO$ or $r \backslash_{rsimps} s' \cdot
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3136	r^{\{m\}}$ for some string $s'$ and number
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3137	$m \leq n$ (lemma \ref{ntimesClosedFormListElemShape}).
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3138	In addition, we know that the list
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3139	$\map \; (\optermsimp \; r) \; (
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3140	\nupdates \; s \; r \; [\Some \; ([c], n)])$'s size is at most
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3141	$c_N = \textit{card} \;
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3142	(\sizeNregex \; ((N + \llbracket r^{\{n\}} \rrbracket_r) + 1))$.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3143	This gives us $\llbracket r^{\{n+1\}} \backslash_{rsimps} \;s \rrbracket_r
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3144	\leq (c_N+1) * (N + \llbracket r^{\{n\}} \rrbracket_r + 1)$.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3145	\end{proof}
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3146
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3147	We aim to formalise the correctness and size bound
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3148	for constructs like $r^{\{\ldots n\}}$, $r^{\{n \ldots\}}$
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3149	and so on; this is still work in progress.
b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3150	These constructs should largely follow the same recipe described in this section.
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	3151	Once we know how to deal with them recursively using suitable auxiliary
bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	3152	definitions, we can routinely establish the proofs.
625 b797c9a709d9 section reorganising, related work Chengsong parents: 624 diff changeset	3153
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3154
557 812e5d112f49 more changes Chengsong parents: 556 diff changeset	3155	%----------------------------------------------------------------------------------------
812e5d112f49 more changes Chengsong parents: 556 diff changeset	3156	% SECTION 3
812e5d112f49 more changes Chengsong parents: 556 diff changeset	3157	%----------------------------------------------------------------------------------------
812e5d112f49 more changes Chengsong parents: 556 diff changeset	3158
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3159
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3160	\section{Comments and Future Improvements}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3161	\subsection{Some Experimental Results}
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3162	What guarantee does this bound give us?
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	3163	It states that, whatever the regex is, its derivatives will not grow indefinitely.
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3164	Take our earlier example $(a + aa)^*$:
cc54ce075db5 restructured Chengsong parents: diff changeset	3165	\begin{center}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3166	\begin{tabular}{@{}c@{\hspace{0mm}}c@{\hspace{0mm}}c@{}}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3167	\begin{tikzpicture}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3168	\begin{axis}[
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3169	xlabel={number of $a$'s},
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3170	x label style={at={(1.05,-0.05)}},
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3171	ylabel={regex size},
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3172	enlargelimits=false,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3173	xtick={0,5,...,30},
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3174	xmax=33,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3175	ymax= 40,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3176	ytick={0,10,...,40},
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3177	scaled ticks=false,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3178	axis lines=left,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3179	width=5cm,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3180	height=4cm,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3181	legend entries={$(a + aa)^*$},
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3182	legend pos=south east,
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3183	legend cell align=left]
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3184	\addplot[red,mark=*, mark options={fill=white}] table {a_aa_star.data};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3185	\end{axis}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3186	\end{tikzpicture}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3187	\end{tabular}
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3188	\end{center}
cc54ce075db5 restructured Chengsong parents: diff changeset	3189	Our simplification rules limit the size of
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3190	the derivatives of the regex $(a + aa)^*$
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3191	very effectively.
cc54ce075db5 restructured Chengsong parents: diff changeset	3192
cc54ce075db5 restructured Chengsong parents: diff changeset	3193
cc54ce075db5 restructured Chengsong parents: diff changeset	3194	In our proof for the inductive case $r_1 \cdot r_2$, the dominant term in the bound
cc54ce075db5 restructured Chengsong parents: diff changeset	3195	is $l_{N_2} * N_2$, where $N_2$ is the bound we have for $\llbracket \bderssimp{r_2}{s} \rrbracket$.
cc54ce075db5 restructured Chengsong parents: diff changeset	3196	Given that $l_{N_2}$ is roughly of size $4^{N_2}$, the size bound for $\llbracket \bderssimp{r_1 \cdot r_2}{s} \rrbracket$
cc54ce075db5 restructured Chengsong parents: diff changeset	3197	inflates the size bound of $\llbracket \bderssimp{r_2}{s} \rrbracket$ with the function
cc54ce075db5 restructured Chengsong parents: diff changeset	3198	$f(x) = x * 4^x$.
cc54ce075db5 restructured Chengsong parents: diff changeset	3199	This means the bound we have will grow at least
cc54ce075db5 restructured Chengsong parents: diff changeset	3200	tower-exponentially with a linear increase of the nesting depth.
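
To see where the tower comes from, here is a minimal sketch under a simplifying assumption: each additional level of sequence nesting applies the inflation function $f$ once more to the bound of the level below.
The notation $N^{(k)}$ for the bound at nesting depth $k$, starting from a base bound $N \geq 1$, is introduced only for this illustration.
Since $f(x) \geq 2^x$ for all $x \geq 1$ and $f$ is monotone, we get
\begin{center}
\begin{tabular}{lcl}
$N^{(0)}$ & $=$ & $N$\\
$N^{(1)}$ & $=$ & $f(N^{(0)}) \geq 2^{N}$\\
$N^{(2)}$ & $=$ & $f(N^{(1)}) \geq 2^{2^{N}}$\\
 & $\vdots$ & \\
$N^{(k)}$ & $=$ & $f(N^{(k-1)}) \geq 2^{2^{\cdots^{2^{N}}}}$ \quad ($k$ twos)\\
\end{tabular}
\end{center}
\noindent
so every extra level of nesting adds one more exponent to the tower.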
cc54ce075db5 restructured Chengsong parents: diff changeset	3201
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	3202	One might be rather sceptical about what this non-elementary
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3203	bound can bring us.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3204	It turns out that the giant bounds are far from being hit.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3205	Here we have some test data from randomly generated regular expressions:
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3206	\begin{figure}[H]
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3207	\begin{tabular}{@{}c@{\hspace{2mm}}c@{\hspace{0mm}}c@{}}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3208	\begin{tikzpicture}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3209	\begin{axis}[
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3210	xlabel={$n$},
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3211	x label style={at={(1.05,-0.05)}},
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3212	ylabel={regex size},
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3213	enlargelimits=false,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3214	xtick={0,5,...,30},
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3215	xmax=33,
611 bc1df466150a more Chengsong parents: 610 diff changeset	3216	%ymax=1000,
bc1df466150a more Chengsong parents: 610 diff changeset	3217	%ytick={0,100,...,1000},
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3218	scaled ticks=false,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3219	axis lines=left,
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3220	width=4.75cm,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3221	height=3.8cm,
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3222	legend entries={regex1},
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3223	legend pos=north east,
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3224	legend cell align=left]
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3225	\addplot[red,mark=*, mark options={fill=white}] table {regex1_size_change.data};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3226	\end{axis}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3227	\end{tikzpicture}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3228	&
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3229	\begin{tikzpicture}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3230	\begin{axis}[
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3231	xlabel={$n$},
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3232	x label style={at={(1.05,-0.05)}},
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3233	%ylabel={time in secs},
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3234	enlargelimits=false,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3235	xtick={0,5,...,30},
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3236	xmax=33,
611 bc1df466150a more Chengsong parents: 610 diff changeset	3237	%ymax=1000,
bc1df466150a more Chengsong parents: 610 diff changeset	3238	%ytick={0,100,...,1000},
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3239	scaled ticks=false,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3240	axis lines=left,
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3241	width=4.75cm,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3242	height=3.8cm,
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3243	legend entries={regex2},
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3244	legend pos=south east,
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3245	legend cell align=left]
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3246	\addplot[blue,mark=*, mark options={fill=white}] table {regex2_size_change.data};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3247	\end{axis}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3248	\end{tikzpicture}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3249	&
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3250	\begin{tikzpicture}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3251	\begin{axis}[
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3252	xlabel={$n$},
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3253	x label style={at={(1.05,-0.05)}},
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3254	%ylabel={time in secs},
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3255	enlargelimits=false,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3256	xtick={0,5,...,30},
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3257	xmax=33,
611 bc1df466150a more Chengsong parents: 610 diff changeset	3258	%ymax=1000,
bc1df466150a more Chengsong parents: 610 diff changeset	3259	%ytick={0,100,...,1000},
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3260	scaled ticks=false,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3261	axis lines=left,
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3262	width=4.75cm,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3263	height=3.8cm,
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3264	legend entries={regex3},
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3265	legend pos=south east,
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3266	legend cell align=left]
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3267	\addplot[cyan,mark=*, mark options={fill=white}] table {regex3_size_change.data};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3268	\end{axis}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3269	\end{tikzpicture}\\
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3270	\multicolumn{3}{c}{}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3271	\end{tabular}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3272	\caption{Size change of three randomly generated
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3273	regular expressions w.r.t.\ the input string length.
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	3274	The x-axis represents the length of the input.}
611 bc1df466150a more Chengsong parents: 610 diff changeset	3275	\end{figure}
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3276	\noindent
cc54ce075db5 restructured Chengsong parents: diff changeset	3277	The sizes of most regexes seem to stay within a polynomial bound w.r.t.\ the
cc54ce075db5 restructured Chengsong parents: diff changeset	3278	original size.
591 b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	3279	We will discuss improvements to this bound in the next chapter.
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3280
cc54ce075db5 restructured Chengsong parents: diff changeset	3281
cc54ce075db5 restructured Chengsong parents: diff changeset	3282
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3283	\subsection{Possible Further Improvements}
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	3284	There are two problems with this finiteness result, though:\\
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3285	(i)
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3286	First, it is not yet a direct formalisation of our lexer's complexity,
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3287	as a complexity proof would require looking into
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3288	the time it takes to execute {\bf all} the operations
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3289	involved in the lexer (simp, collect, decode), not just the derivative.\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3290	(ii)
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3291	Second, the bound is not yet tight, and we seek to improve $N_a$ so that
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3292	it is polynomial in $\llbracket a \rrbracket$.\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3293	Still, we believe this contribution is useful,
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	3294	because
988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	3295	\begin{itemize}
988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	3296	\item
988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	3297
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3298	The size proof can serve as a starting point for a complexity
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	3299	formalisation.
988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	3300	Taking derivatives is the most important phase of our lexer algorithm.
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	3301	Size properties of derivatives cover the majority of the algorithm
bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	3302	and are therefore a good indication of the complexity of the entire program.
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	3303	\item
988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	3304	The bound is already a strong indication that catastrophic
988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	3305	backtracking is much less likely to occur in our $\blexersimp$
988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	3306	algorithm.
988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	3307	In the next chapter we refine $\blexersimp$ to $\blexerStrong$,
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	3308	for which we conjecture that the bound becomes polynomial.
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 577 diff changeset	3309	\end{itemize}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3310
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3311	%----------------------------------------------------------------------------------------
cc54ce075db5 restructured Chengsong parents: diff changeset	3312	% SECTION 4
cc54ce075db5 restructured Chengsong parents: diff changeset	3313	%----------------------------------------------------------------------------------------
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3314
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3315
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3316
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3317
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3318
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3319
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3320
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3321
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	3322	One might wonder about the actual bound rather than the loose bound we gave
bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	3323	for the convenience of a more straightforward proof.
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3324	How much can the regex $r^* \backslash s$ grow?
cc54ce075db5 restructured Chengsong parents: diff changeset	3325	As earlier graphs have shown,
cc54ce075db5 restructured Chengsong parents: diff changeset	3326	%TODO: reference that graph where size grows quickly
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3327	its size can grow at a rate
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3328	exponential w.r.t.\ the number of characters,
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3329	but will eventually level off when the string $s$ is long enough.
cc54ce075db5 restructured Chengsong parents: diff changeset	3330	If the derivatives grow to a size exponential w.r.t.\ the original regex, our algorithm
cc54ce075db5 restructured Chengsong parents: diff changeset	3331	would still be slow.
cc54ce075db5 restructured Chengsong parents: diff changeset	3332	Unfortunately, we have concrete examples
576 3e1b699696b6 thesis chap5 Chengsong parents: 564 diff changeset	3333	where such regular expressions grow exponentially large before levelling off. For instance, the derivatives of
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3334	\begin{center}
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3335	$(a ^ * + (aa) ^ * + (aaa) ^ * + \ldots +
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3336	(\underbrace{a \ldots a}_{\text{n a's}})^*)^*$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3337	\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3338	already reach a maximum
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3339	size that is exponential in the number $n$
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3340	under our current simplification rules:
cc54ce075db5 restructured Chengsong parents: diff changeset	3341	%TODO: graph of a regex whose size increases exponentially.
cc54ce075db5 restructured Chengsong parents: diff changeset	3342	\begin{center}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3343	\begin{tikzpicture}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3344	\begin{axis}[
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3345	height=0.5\textwidth,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3346	width=\textwidth,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3347	xlabel=number of a's,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3348	xtick={0,...,9},
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3349	ylabel=maximum size,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3350	ymode=log,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3351	log basis y={2}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3352	]
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3353	\addplot[mark=*,blue] table {re-chengsong.data};
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3354	\end{axis}
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3355	\end{tikzpicture}
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3356	\end{center}
cc54ce075db5 restructured Chengsong parents: diff changeset	3357
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3358	For convenience we use $(\sum_{i=1}^{n} (\underbrace{a \ldots a}_{\text{i a's}})^*)^*$
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3359	to express $(a ^ * + (aa) ^ * + (aaa) ^ * + \ldots +
cc54ce075db5 restructured Chengsong parents: diff changeset	3360	(\underbrace{a \ldots a}_{\text{n a's}})^*)^*$ in the discussion below.
cc54ce075db5 restructured Chengsong parents: diff changeset	3361	The exponential size is caused by the regex
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3362	$\sum_{i=1}^{n} (\underbrace{a \ldots a}_{\text{i a's}})^*$
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3363	inside the $(\ldots) ^*$ having exponentially many
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	3364	different derivatives, even though the differences between them are minor.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3365	$(\sum_{i=1}^{n} (\underbrace{a \ldots a}_{\text{i a's}})^*)^* \backslash \underbrace{a \ldots a}_{\text{m a's}}$
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3366	will therefore contain the following terms (after flattening out all nested
cc54ce075db5 restructured Chengsong parents: diff changeset	3367	alternatives):
cc54ce075db5 restructured Chengsong parents: diff changeset	3368	\begin{center}
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3369	$(\sum_{i = 1}^{n} (\underbrace{a \ldots a}_{\text{((i - (m' \% i))\%i) a's}})\cdot (\underbrace{a \ldots a}_{\text{i a's}})^* )\cdot (\sum_{i=1}^{n} (\underbrace{a \ldots a}_{\text{i a's}})^*)^*$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3370	[1mm]
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3371	$(1 \leq m' \leq m )$
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3372	\end{center}
639 80cc6dc4c98b until chap 7 Chengsong parents: 638 diff changeset	3373	There are at least exponentially
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3374	many such terms.\footnote{To be exact, these terms are
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3375	distinct for $m' \leq \textit{L.C.M.}(1, \ldots, n)$; the details are omitted,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3376	but the point is that this number is exponential in $n$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3377	}
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3378	Each time we take the derivative of the intermediate result with respect to a new input character, more such distinct
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3379	terms accumulate.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3380	The function $\textit{distinctBy}$ will not be able to de-duplicate any two of these terms
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3381	\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3382	$(\sum_{i = 1}^{n}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3383	(\underbrace{a \ldots a}_{\text{((i - (m' \% i))\%i) a's}})\cdot
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3384	(\underbrace{a \ldots a}_{\text{i a's}})^* )\cdot
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3385	(\sum_{i=1}^{n} (\underbrace{a \ldots a}_{\text{i a's}})^*)^*$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3386	$(\sum_{i = 1}^{n} (\underbrace{a \ldots a}_{\text{((i - (m'' \% i))\%i) a's}})\cdot
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3387	(\underbrace{a \ldots a}_{\text{i a's}})^* )\cdot
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3388	(\sum_{i=1}^{n} (\underbrace{a \ldots a}_{\text{i a's}})^*)^*$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3389	\end{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3390	\noindent
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3391	where $m' \neq m''$
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3392	as they are slightly different.
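
As a concrete illustration (a hand-computed sketch for $n = 3$, not taken from the experimental data), the table below lists the number of $a$'s in the head of each summand, that is $(i - (m' \,\%\, i)) \,\%\, i$, for $i = 1, 2, 3$:
\begin{center}
\begin{tabular}{cccc}
$m'$ & $i = 1$ & $i = 2$ & $i = 3$\\
$1$ & $0$ & $1$ & $2$\\
$2$ & $0$ & $0$ & $1$\\
$3$ & $0$ & $1$ & $0$\\
$4$ & $0$ & $0$ & $2$\\
$5$ & $0$ & $1$ & $1$\\
$6$ & $0$ & $0$ & $0$\\
\end{tabular}
\end{center}
\noindent
All six rows are different, so the corresponding six terms are syntactically distinct.
From $m' = 7$ onwards the rows repeat, since $6$ is the least common multiple of $1$, $2$ and $3$.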
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3393	This means that with our current simplification methods,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3394	we will not be able to control the derivative so that
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3395	$\llbracket \bderssimp{r}{s} \rrbracket$ stays polynomial. %\leq O((\llbracket r\rrbacket)^c)$
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3396	These terms are similar in the sense that the heads of those terms
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3397	all consist of sub-terms of the form:
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3398	$(\underbrace{a \ldots a}_{\text{j a's}})\cdot (\underbrace{a \ldots a}_{\text{i a's}})^* $.
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3399	For $\sum_{i=1}^{n} (\underbrace{a \ldots a}_{\text{i a's}})^*$, there will be at most
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3400	$n * (n + 1) / 2$ such terms.
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3401	For example, $(a^* + (aa)^* + (aaa)^*)^*$'s derivatives
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3402	can be described by 6 terms:
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3403	$a^*$, $a\cdot (aa)^*$, $(aa)^*$,
83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3404	$aa \cdot (aaa)^*$, $a \cdot (aaa)^*$, and $(aaa)^*$.
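
This count can be checked with a short calculation (a sketch, writing $j$ for the number of leading $a$'s in a head term as in the form above).
For a fixed $i$, the number $j = (i - (m' \,\%\, i)) \,\%\, i$ can only take the values $0 \leq j < i$, which gives $i$ choices.
Summing over $i$ yields
\begin{center}
$\sum_{i=1}^{n} i = n * (n + 1) / 2$, \quad and for $n = 3$: \quad $1 + 2 + 3 = 6$.
\end{center}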
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3405	The total number of different ``head terms'', $n * (n + 1) / 2$,
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3406	is proportional to the number of characters in the regex
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3407	$(\sum_{i=1}^{n} (\underbrace{a \ldots a}_{\text{i a's}})^*)^*$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3408	If we can improve our deduplication process so that it becomes smarter
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3409	and only keeps track of these $n * (n+1) /2$ terms, then we can keep
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3410	the size growth polynomial again.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3411	This example also suggests a slightly different notion of size, which we call the
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3412	alphabetic width:
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3413	\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3414	\begin{tabular}{lcl}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3415	$\textit{awidth} \; \ZERO$ & $\dn$ & $0$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3416	$\textit{awidth} \; \ONE$ & $\dn$ & $0$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3417	$\textit{awidth} \; c$ & $\dn$ & $1$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3418	$\textit{awidth} \; (r_1 + r_2)$ & $\dn$ & $\textit{awidth} \;
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3419	r_1 + \textit{awidth} \; r_2$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3420	$\textit{awidth} \; (r_1 \cdot r_2)$ & $\dn$ & $\textit{awidth} \;
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3421	r_1 + \textit{awidth} \; r_2$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3422	$\textit{awidth} \; r^*$ & $\dn$ & $\textit{awidth} \; r$\\
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3423	\end{tabular}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3424	\end{center}
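
As a worked instance of this definition (an illustration using the running example with $n = 3$), unfolding the clauses for star, alternative and sequence gives
\begin{center}
\begin{tabular}{lcl}
$\textit{awidth} \; ((a^* + (aa)^* + (aaa)^*)^*)$ & $=$ & $\textit{awidth} \; (a^* + (aa)^* + (aaa)^*)$\\
 & $=$ & $\textit{awidth} \; (a^*) + \textit{awidth} \; ((aa)^*) + \textit{awidth} \; ((aaa)^*)$\\
 & $=$ & $1 + 2 + 3 = 6$\\
\end{tabular}
\end{center}
\noindent
which matches the six head terms listed for this regex.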
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3425
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3426
593 83fab852d72d more chap5 Chengsong parents: 591 diff changeset	3427
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3428	Antimirov~\parencite{Antimirov95} has proven that
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3429	$|\textit{PDER}_{UNIV}(r)| \leq \textit{awidth}(r)$,
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3430	where $\textit{PDER}_{UNIV}(r)$ is the set of all possible subterms
cc54ce075db5 restructured Chengsong parents: diff changeset	3431	created by taking derivatives of $r$ against all possible strings.
cc54ce075db5 restructured Chengsong parents: diff changeset	3432	If we can make sure that at any moment in our lexing algorithm our
cc54ce075db5 restructured Chengsong parents: diff changeset	3433	intermediate result holds at most one copy of each of these
cc54ce075db5 restructured Chengsong parents: diff changeset	3434	subterms, then we can get the same bound as Antimirov's.
cc54ce075db5 restructured Chengsong parents: diff changeset	3435	This leads to the algorithm in the next chapter.
cc54ce075db5 restructured Chengsong parents: diff changeset	3436
cc54ce075db5 restructured Chengsong parents: diff changeset	3437
cc54ce075db5 restructured Chengsong parents: diff changeset	3438
cc54ce075db5 restructured Chengsong parents: diff changeset	3439
cc54ce075db5 restructured Chengsong parents: diff changeset	3440
cc54ce075db5 restructured Chengsong parents: diff changeset	3441	%----------------------------------------------------------------------------------------
cc54ce075db5 restructured Chengsong parents: diff changeset	3442	% SECTION 1
cc54ce075db5 restructured Chengsong parents: diff changeset	3443	%----------------------------------------------------------------------------------------
cc54ce075db5 restructured Chengsong parents: diff changeset	3444
cc54ce075db5 restructured Chengsong parents: diff changeset	3445
cc54ce075db5 restructured Chengsong parents: diff changeset	3446	%-----------------------------------
cc54ce075db5 restructured Chengsong parents: diff changeset	3447	% SUBSECTION 1
cc54ce075db5 restructured Chengsong parents: diff changeset	3448	%-----------------------------------
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3449	%\subsection{Syntactic Equivalence Under $\simp$}
640 bd1354127574 more proofreading done, last version before submission Chengsong parents: 639 diff changeset	3450	%We prove that minor differences can be annihilated
618 233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3451	%by $\simp$.
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3452	%For example,
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3453	%\begin{center}
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3454	% $\simp \;(\simpALTs\; (\map \;(\_\backslash \; x)\; (\distinct \; \mathit{rs}\; \phi))) =
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3455	% \simp \;(\simpALTs \;(\distinct \;(\map \;(\_ \backslash\; x) \; \mathit{rs}) \; \phi))$
233cf2b97d1a chapter 5 finished!! Chengsong parents: 614 diff changeset	3456	%\end{center}
532 cc54ce075db5 restructured Chengsong parents: diff changeset	3457

author	Chengsong
	Mon, 10 Jul 2023 00:44:45 +0100
changeset 659	2e05f04ed6b3
parent 640	bd1354127574
child 660	eddc4eaba7c4
permissions	-rwxr-xr-x