% lexing: ChengsongTanPhdThesis/Chapters/Cubic.tex@17c7611fb0a9

% Chapter Template

\chapter{A Better Bound and Other Extensions} % Main chapter title

\label{Cubic} %In Chapter 5\ref{Chapter5} we discuss stronger simplifications to improve the finite bound
%in Chapter 4 to a polynomial one, and demonstrate how one can extend the
%algorithm to include constructs such as bounded repetitions and negations.
\lstset{style=myScalastyle}


This chapter is a ``miscellaneous''
chapter which records various
extensions to the formalisation of $\blexersimp$.\\
First, we present further improvements
made to our lexer algorithm $\blexersimp$.
We devise a stronger simplification algorithm,
called $\bsimpStrong$, which can prune away
similar components in two regular expressions at the same
alternative level,
even if these regular expressions are not exactly the same.
We call the lexer that uses this stronger simplification function
$\blexerStrong$.
We conjecture that both
\begin{center}
$\blexerStrong \;r \; s = \blexer\; r\;s$
\end{center}
and
\begin{center}
$\llbracket \bdersStrong{a}{s} \rrbracket = O(\llbracket a \rrbracket^3)$
\end{center}
hold, but formalising
them is still work in progress.
We give reasons why the correctness and cubic size bound proofs
can be achieved,
by exploring the connection between the internal
data structure of our $\blexerStrong$ and
Antimirov's partial derivatives.\\
Second, we extend our $\blexersimp$
to support bounded repetitions ($r^{\{n\}}$).
We update our formalisation of
the correctness and finiteness properties to
include this new construct.
With this extension we can out-compete other verified lexers such as
Verbatim++ on bounded regular expressions.
%We also present the idempotency property proof
%of $\bsimp$, which leverages the idempotency proof of $\rsimp$.
%This reinforces our claim that the fixpoint construction
%originally required by Sulzmann and Lu can be removed in $\blexersimp$.

%Last but not least, we present our efforts and challenges we met
%in further improving the algorithm by data
%structures such as zippers.



%----------------------------------------------------------------------------------------
% SECTION strongsimp
%----------------------------------------------------------------------------------------
\section{A Stronger Version of Simplification}
%TODO: search for isabelle proofs of algorithms that check equivalence
In our bitcoded lexing algorithm, (sub)terms represent (sub)matches.
For example, the regular expression
\[
aa \cdot a^* + a \cdot a^* + aa\cdot a^*
\]
contains three terms,
expressing three possibilities for how it can match future input.
The first and the third terms are identical, which means we can eliminate
the latter as we know it will not be picked up by $\bmkeps$.
In $\bsimps$, the $\distinctBy$ function takes care of this.
The criterion $\distinctBy$ uses for removing a duplicate
$a_2$ in the list
\begin{center}
$rs_a@[a_1]@rs_b@[a_2]@rs_c$
\end{center}
is that
\begin{center}
$\rerase{a_1} = \rerase{a_2}$.
\end{center}
It can be characterised as the $LD$
rewrite rule in \ref{rrewriteRules}.\\
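The duplicate-removal criterion can be sketched in Scala as follows.
This is a minimal sketch under assumptions: the datatypes \texttt{Rexp}, \texttt{ARexp} and \texttt{Bit},
the function \texttt{erase} and the wrapper object \texttt{DistinctBySketch} are simplified stand-ins
introduced for illustration and need not coincide with the definitions used in the formalisation.
The example list encodes $aa \cdot a^* + a \cdot a^* + aa\cdot a^*$, where the third term
differs from the first only in its bitcodes.
\begin{lstlisting}
object DistinctBySketch {
  // Assumed, simplified datatypes (illustration only)
  sealed trait Rexp
  case object ZERO extends Rexp
  case object ONE extends Rexp
  case class CHAR(c: Char) extends Rexp
  case class ALTS(r1: Rexp, r2: Rexp) extends Rexp
  case class SEQ(r1: Rexp, r2: Rexp) extends Rexp
  case class STAR(r: Rexp) extends Rexp

  sealed trait Bit
  case object Z extends Bit
  case object S extends Bit

  sealed trait ARexp
  case object AZERO extends ARexp
  case class AONE(bs: List[Bit]) extends ARexp
  case class ACHAR(bs: List[Bit], c: Char) extends ARexp
  case class AALTS(bs: List[Bit], rs: List[ARexp]) extends ARexp
  case class ASEQ(bs: List[Bit], r1: ARexp, r2: ARexp) extends ARexp
  case class ASTAR(bs: List[Bit], r: ARexp) extends ARexp

  // erase forgets all bitcodes of an annotated regular expression
  def erase(r: ARexp): Rexp = r match {
    case AZERO => ZERO
    case AONE(_) => ONE
    case ACHAR(_, c) => CHAR(c)
    case AALTS(_, Nil) => ZERO
    case AALTS(_, r1 :: Nil) => erase(r1)
    case AALTS(bs, r1 :: rs) => ALTS(erase(r1), erase(AALTS(bs, rs)))
    case ASEQ(_, r1, r2) => SEQ(erase(r1), erase(r2))
    case ASTAR(_, r1) => STAR(erase(r1))
  }

  // keep the first element for every key f(x), drop all later ones
  def distinctBy[B, C](xs: List[B], f: B => C,
                       acc: Set[C] = Set.empty[C]): List[B] =
    xs match {
      case Nil => Nil
      case x :: rest =>
        val key = f(x)
        if (acc.contains(key)) distinctBy(rest, f, acc)
        else x :: distinctBy(rest, f, acc + key)
    }

  // aa.a* + a.a* + aa.a*, the third term carrying different bitcodes
  val a: ARexp = ACHAR(Nil, 'a')
  val t1: ARexp = ASEQ(Nil, ASEQ(Nil, a, a), ASTAR(Nil, a))
  val t2: ARexp = ASEQ(Nil, a, ASTAR(Nil, a))
  val t3: ARexp = ASEQ(List(S), ASEQ(Nil, a, a), ASTAR(Nil, a))
  val deduped: List[ARexp] =
    distinctBy(List(t1, t2, t3), (r: ARexp) => erase(r))
}
\end{lstlisting}
In this sketch \texttt{t1} and \texttt{t3} are different annotated regular expressions,
but they erase to the same plain regular expression, so only \texttt{t1} and \texttt{t2} survive.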
The problem, however, is that identical components
in two slightly different regular expressions cannot be removed:
\begin{figure}[H]
\[
(a+b+d) \cdot r_1 + (a+c+e) \cdot r_1 \stackrel{?}{\rightsquigarrow} (a+b+d) \cdot r_1 + (c+e) \cdot r_1
\]
\caption{Desired simplification, but not done in $\blexersimp$}\label{partialDedup}
\end{figure}
\noindent
A simplification like this actually
cannot be omitted,
as without it the size could blow up even with our $\textit{bsimp}$
function: for the example from Chapter \ref{Finite},
$\protect((a^* + (aa)^* + \ldots + (\underbrace{a\ldots a}_{n a's})^* )^*)^*$,
just setting $n$ to a small number
already gives exponential growth that does not stop before the size becomes huge:
\begin{figure}[H]
\centering
\begin{tikzpicture}
\begin{axis}[
%xlabel={$n$},
myplotstyle,
xlabel={input length},
ylabel={size},
]
\addplot[blue,mark=*, mark options={fill=white}] table {bsimpExponential.data};
\end{axis}
\end{tikzpicture}
\caption{Runtime of $\blexersimp$ for matching
$\protect((a^* + (aa)^* + \ldots + (aaaaa)^* )^*)^*$
with strings
of the form $\protect\underbrace{aa..a}_{n}$.}\label{blexerExp}
\end{figure}
\noindent
We would like to apply the rewriting at some stage
\begin{figure}[H]
\[
(a+b+d) \cdot r_1 \longrightarrow a \cdot r_1 + b \cdot r_1 + d \cdot r_1
\]
\caption{Desired simplification, but not done in $\blexersimp$}\label{desiredSimp}
\end{figure}
\noindent
in our $\simp$ function,
so that it makes the simplification in \ref{partialDedup} possible.
Translating the rule into our $\textit{bsimp}$ function simply
involves adding a new clause to the $\textit{bsimp}_{ASEQ}$ function:
\begin{center}
\begin{tabular}{@{}lcl@{}}
$\textit{bsimp}_{ASEQ} \; bs\; a \; b$ & $\dn$ & $ (a,\; b) \textit{match}$\\
&& $\ldots$\\
&&$\quad\textit{case} \; (_{bs1}\sum as, a_2') \Rightarrow _{bs1}\sum (
\map \; (_{[]}\textit{ASEQ} \; \_ \; a_2') \; as)$\\
&&$\quad\textit{case} \; (a_1', a_2') \Rightarrow _{bs}a_1' \cdot a_2'$ \\
\end{tabular}
\end{center}



Unfortunately,
if we introduce this rule in our
setting we would lose the POSIX property of our calculated values.
For example, given the regular expression
\begin{center}
$(a + ab)(bc + c)$
\end{center}
and the string
\begin{center}
$abc$,
\end{center}
our algorithm generates the following
correct POSIX value
\begin{center}
$\Seq \; (\Right \; ab) \; (\Right \; c)$.
\end{center}
Essentially it matches the string with the longer $\Right$-alternative
in the first component of the sequence (and
then the ``rest'' with the character regular expression $c$ from the second component).
If we add the simplification above, then we obtain the following value
\begin{center}
$\Left \; (\Seq \; a \; (\Left \; bc))$
\end{center}
where the $\Left$-alternatives get priority.
However, this violates the POSIX rules.
The reason for getting this undesired value
is that the new rule splits this regular expression up into
\begin{center}
$a\cdot(b c + c) + ab \cdot (bc + c)$,
\end{center}
which becomes a regular expression with a
totally different structure: the original
was a sequence, and now it becomes an alternative.
With an alternative the maximal munch rule no longer works.\\
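To see where the $\Left$-priority comes from, the derivatives of the split
expression with respect to $abc$ can be traced by hand. The following is only a sketch:
it writes $r \backslash s$ for the derivative of $r$ with respect to $s$,
uses $\mathbf{0}$ and $\mathbf{1}$ for the empty-language and empty-string regular expressions,
and silently applies the usual simplifications
$\mathbf{1} \cdot r \approx r$, $\mathbf{0} \cdot r \approx \mathbf{0}$ and $r + \mathbf{0} \approx \mathbf{0} + r \approx r$.
This notation is assumed here for illustration and may differ from the macros used elsewhere.
\begin{center}
\begin{tabular}{lcl}
$r$ & $=$ & $a\cdot(bc + c) + ab \cdot (bc + c)$\\
$r \backslash a$ & $\approx$ & $(bc + c) + b \cdot (bc + c)$\\
$r \backslash ab$ & $\approx$ & $c + (bc + c)$\\
$r \backslash abc$ & $\approx$ & $\mathbf{1} + \mathbf{1}$
\end{tabular}
\end{center}
Both summands of the last derivative are nullable, and for an alternative the
value is extracted from the left summand whenever it is nullable.
The left summand stems from $a\cdot(bc + c)$, so the extracted value starts with $\Left$,
although the right summand corresponds to the longer initial submatch $ab$
that the original sequence would have preferred.\\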
A method to reconcile this is to do the
transformation in \ref{desiredSimp} ``non-invasively'',
meaning that we traverse the list of regular expressions
\begin{center}
$rs_a@[a]@rs_c$
\end{center}
in the alternative
\begin{center}
$\sum ( rs_a@[a]@rs_c)$
\end{center}
using a function similar to $\distinctBy$,
but this time
we allow a more general list rewrite:
\begin{mathpar}\label{cubicRule}
\inferrule * [Right = cubicRule]{\vspace{0mm} }{rs_a@[a]@rs_c
\stackrel{s}{\rightsquigarrow }
rs_a@[\textit{prune} \; a \; rs_a]@rs_c }
\end{mathpar}
%L \; a_1' = L \; a_1 \setminus (\cup_{a \in rs_a} L \; a)
where $\textit{prune} \;a \; acc$ traverses $a$
without altering the structure of $a$, removing components in $a$
that have appeared in the accumulator $acc$.
For example,
\begin{center}
$\textit{prune} \;\;\; (r_a+r_f+r_g+r_h)r_d \;\; \; [(r_a+r_b+r_c)r_d, (r_e+r_f)r_d] $
\end{center}
should be equal to
\begin{center}
$(r_g+r_h)r_d$
\end{center}
because $r_gr_d$ and
$r_hr_d$ are the only terms
that have not appeared in the accumulator list
\begin{center}
$[(r_a+r_b+r_c)r_d, (r_e+r_f)r_d]$.
\end{center}
We implemented
the function $\textit{prune}$ in Scala
and incorporated it into our lexer
by replacing the $\simp$ function
with a stronger version called $\bsimpStrong$
that prunes regular expressions.
590 988e92a70704 more chap5 and chap6 bsimp_idem Chengsong parents: 538 diff changeset	216	\begin{figure}[H]
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	217
591 b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	218	\begin{lstlisting}
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	219	def atMostEmpty(r: Rexp) : Boolean = r match {
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	220	case ZERO => true
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	221	case ONE => true
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	222	case STAR(r) => atMostEmpty(r)
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	223	case SEQ(r1, r2) => atMostEmpty(r1) && atMostEmpty(r2)
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	224	case ALTS(r1, r2) => atMostEmpty(r1) && atMostEmpty(r2)
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	225	case CHAR(_) => false
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	226	}
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	227
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	228
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	229	def isOne(r: Rexp) : Boolean = r match {
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	230	case ONE => true
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	231	case SEQ(r1, r2) => isOne(r1) && isOne(r2)
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	232	case ALTS(r1, r2) => (isOne(r1) \|\| isOne(r2)) && (atMostEmpty(r1) && atMostEmpty(r2))//rs.forall(atMostEmpty) && rs.exists(isOne)
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	233	case STAR(r0) => atMostEmpty(r0)
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	234	case CHAR(c) => false
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	235	case ZERO => false
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	236	}
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	237
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	238	//r = r' ~ tail' : If tail' matches tail => returns r'
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	239	def removeSeqTail(r: Rexp, tail: Rexp) : Rexp = r match {
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	240	case SEQ(r1, r2) =>
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	241	if(r2 == tail)
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	242	r1
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	243	else
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	244	ZERO
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	245	case r => ZERO
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	246	}
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	247
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	248	def prune(r: ARexp, acc: Set[Rexp]) : ARexp = r match{
  case AALTS(bs, rs) => rs.map(r => prune(r, acc)).filter(_ != AZERO) match
  {
    //all components have been removed, meaning this is effectively a duplicate
    //flats will take care of removing this AZERO
    case Nil => AZERO
    case r::Nil => fuse(bs, r)
    case rs1 => AALTS(bs, rs1)
  }
  case ASEQ(bs, r1, r2) =>
    //remove the r2 in (ra + rb)r2 to identify the duplicate contents of r1
    prune(r1, acc.map(r => removeSeqTail(r, erase(r2)))) match {
      //after pruning, returns 0
      case AZERO => AZERO
      //after pruning, got r1'.r2, where r1' is equal to 1
      case r1p if(isOne(erase(r1p))) => fuse(bs ++ mkepsBC(r1p), r2)
      //assemble the pruned head r1p with r2
      case r1p => ASEQ(bs, r1p, r2)
    }
  //this does the duplicate component removal task
  case r => if(acc(erase(r))) AZERO else r
}
\end{lstlisting}
\caption{The pruning function together with its helper functions}
\end{figure}
\noindent
The benefits of using
$\textit{prune}$, such as refining the finiteness bound
to a cubic bound, have not been formalised yet.
We therefore present it as Scala code rather than as an Isabelle-style formal
definition like the one we gave for $\simp$, since the definitions might still change
to suit the needs of the proofs.
We follow this convention in the rest of the chapter.
\begin{figure}[H]
\begin{lstlisting}
def distinctWith(rs: List[ARexp],
                 pruneFunction: (ARexp, Set[Rexp]) => ARexp,
                 acc: Set[Rexp] = Set()) : List[ARexp] =
  rs match {
    case Nil => Nil
    case r :: rs =>
      if(acc(erase(r)))
        distinctWith(rs, pruneFunction, acc)
      else {
        val pruned_r = pruneFunction(r, acc)
        pruned_r ::
        distinctWith(rs,
          pruneFunction,
          turnIntoTerms(erase(pruned_r)) ++: acc
        )
      }
  }
\end{lstlisting}
\caption{A Stronger Version of $\textit{distinctBy}$}
\end{figure}
\noindent
The function $\textit{prune}$ is used in $\distinctWith$.
$\distinctWith$ is a stronger version of $\distinctBy$:
it not only removes duplicates as $\distinctBy$ would,
but also uses the $\textit{pruneFunction}$
argument to prune away redundant components inside each remaining regular expression.
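
\noindent
As a small illustration of the difference, consider the alternatives
$[a\cdot c,\; (a+b)\cdot c]$. This is only a sketch on erased regular expressions
with bitcodes omitted, and it assumes that $\textit{removeSeqTail}$ strips the
trailing $c$ from $a \cdot c$ and that $\textit{turnIntoTerms}$ splits a regular
expression into its summands:
\begin{center}
\begin{tabular}{lcl}
$\distinctBy$ & yields & $[a\cdot c,\; (a+b)\cdot c]$\\
$\distinctWith$ with $\textit{prune}$ & yields & $[a\cdot c,\; b\cdot c]$
\end{tabular}
\end{center}
\noindent
The two elements are not syntactically equal, so $\distinctBy$ keeps both of them,
whereas $\textit{prune}$ would be expected to remove the summand $a$ from $(a+b)$
because $a \cdot c$ has already been seen.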
\begin{figure}[H]
\begin{lstlisting}
//a stronger version of simp
def bsimpStrong(r: ARexp): ARexp =
{
  r match {
    case ASEQ(bs1, r1, r2) => (bsimpStrong(r1), bsimpStrong(r2)) match {
      //normal clauses same as simp
      case (AZERO, _) => AZERO
      case (_, AZERO) => AZERO
      case (AONE(bs2), r2s) => fuse(bs1 ++ bs2, r2s)
      //bs2 can be discarded
      case (r1s, AONE(bs2)) => fuse(bs1, r1s) //assert bs2 == Nil
      case (r1s, r2s) => ASEQ(bs1, r1s, r2s)
    }
    case AALTS(bs1, rs) => {
      //distinctBy(flat_res, erase)
      distinctWith(flats(rs.map(bsimpStrong(_))), prune) match {
        case Nil => AZERO
        case s :: Nil => fuse(bs1, s)
        case rs => AALTS(bs1, rs)
      }
    }
    //stars that can be treated as 1
    case ASTAR(bs, r0) if(atMostEmpty(erase(r0))) => AONE(bs)
    case r => r
  }
}
\end{lstlisting}
\caption{The function $\bsimpStrong$}
\end{figure}
\noindent
$\distinctWith$ is in turn used in $\bsimpStrong$,
which $\bdersStrongs$ applies after each derivative step:
\begin{figure}[H]
\begin{lstlisting}
//Conjecture: [\| bdersStrong(s, r) \|] = O([\| r \|]^3)
def bdersStrong(s: List[Char], r: ARexp) : ARexp = s match {
  case Nil => r
  case c::s => bdersStrong(s, bsimpStrong(bder(c, r)))
}
\end{lstlisting}
\caption{The function $\bdersStrongs$}
\end{figure}
\noindent
We conjecture that the above Scala function $\bdersStrongs$,
written $\bdersStrong{\_}{\_}$ in infix notation,
satisfies the following property:
\begin{conjecture}
$\llbracket \bdersStrong{a}{s} \rrbracket = O(\llbracket a \rrbracket^3)$
\end{conjecture}
\noindent
The Scala code of the stronger version of $\blexersimp$
is as follows:
\begin{figure}[H]
\begin{lstlisting}
import scala.util.Try

def strongBlexer(r: Rexp, s: String) : Option[Val] = {
  Try(Some(decode(r, strong_blex_simp(internalise(r), s.toList)))).getOrElse(None)
}
def strong_blex_simp(r: ARexp, s: List[Char]) : Bits = s match {
  case Nil => {
    if (bnullable(r)) {
      mkepsBC(r)
    }
    else
      throw new Exception("Not matched")
  }
  case c::cs => {
    //simplify with bsimpStrong after each derivative step
    strong_blex_simp(bsimpStrong(bder(c, r)), cs)
  }
}
\end{lstlisting}
\end{figure}
\noindent
We call this lexer $\blexerStrong$.
As Figure \ref{fig:aaaaaStarStar} illustrates, $\blexerStrong$ can drastically
reduce the size of the internal data structure in cases that
trigger exponential behaviour in
$\blexersimp$.
\begin{figure}[H]
\centering
\begin{tabular}{@{}c@{\hspace{0mm}}c@{\hspace{0mm}}c@{}}
\begin{tikzpicture}
\begin{axis}[
    %xlabel={$n$},
    myplotstyle,
    xlabel={input length},
    ylabel={size},
    width = 7cm,
    height = 5cm,
    ]
\addplot[red,mark=*, mark options={fill=white}] table {strongSimpCurve.data};
\end{axis}
\end{tikzpicture}
&
\begin{tikzpicture}
\begin{axis}[
    %xlabel={$n$},
    myplotstyle,
    xlabel={input length},
    ylabel={size},
    width = 7cm,
    height = 5cm,
    ]
\addplot[blue,mark=*, mark options={fill=white}] table {bsimpExponential.data};
\end{axis}
\end{tikzpicture}\\
\multicolumn{2}{l}{}
\end{tabular}
\caption{Size of the internal regular expression when matching
$\protect((a^* + (aa)^* + \ldots + (aaaaa)^* )^*)^*$ with strings
of the form $\protect\underbrace{aa..a}_{n}$, for $\blexerStrong$ (left)
and $\blexersimp$ (right).}\label{fig:aaaaaStarStar}
\end{figure}
\noindent
We would like $\blexerStrong$ to satisfy the same correctness property
that we proved for $\blexersimp$:
\begin{conjecture}\label{cubicConjecture}
$\blexerStrong \;r \; s = \blexer\; r\;s$
\end{conjecture}
\noindent
The idea is to re-establish the key lemmas of
Chapter \ref{Bitcoded2}, such as
$r \stackrel{*}{\rightsquigarrow} \textit{bsimp} \; r$,
in the presence of the new rewriting rule \ref{cubicRule}.

In the next subsection
we describe why we
believe a cubic bound can be achieved.
We give an introduction to
partial derivatives,
which were introduced by Antimirov \cite{Antimirov95},
and then link them to the result of the function
$\bdersStrongs$.

\subsection{Antimirov's partial derivatives}
Partial derivatives were first introduced by
Antimirov \cite{Antimirov95}.
They are computed in a similar way to Brzozowski derivatives,
but the children of alternative regular expressions are split into
multiple independent terms, so that the result is a
set of regular expressions:
\begin{center}
\begin{tabular}{lcl}
$\partial_x \; (a \cdot b)$ &
$\dn$ & $\partial_x \; a\cdot b \cup
\partial_x \; b \quad \textit{if} \; \nullable\; a$\\
& & $\partial_x \; a\cdot b \quad\quad
\textit{otherwise}$\\
$\partial_x \; r^*$ & $\dn$ & $\partial_x \; r \cdot r^*$\\
$\partial_x \; c $ & $\dn$ & $\textit{if} \; x = c \;
\textit{then} \;
\{ \ONE\} \;\;\textit{else} \; \varnothing$\\
$\partial_x \; (a+b)$ & $\dn$ & $\partial_x \; a \cup \partial_x \; b$\\
$\partial_x \; \ONE$ & $\dn$ & $\varnothing$\\
$\partial_x \; \ZERO$ & $\dn$ & $\varnothing$\\
\end{tabular}
\end{center}
\noindent
The $\cdot$ in, for example,
$\partial_x \; a\cdot b $
is shorthand for appending $b$ to every element of the set $\partial_x \; a$,
that is, $\{ r \cdot b \mid r \in \partial_x \; a \}$
(informally, the cartesian product $\partial_x \; a \times \{ b\}$ with each pair read as a sequence).
%Each element in the set generated by a partial derivative
%corresponds to a (potentially partial) match
%TODO: define derivatives w.r.t string s
Rather than joining the calculated derivatives $\partial_x a$ and $\partial_x b$ together
using the $\sum$ constructor, Antimirov put them into
a set. This de-duplicates the terms as much as possible
and exposes the ``atomic'' components of a derivative.
For example, to find out what the derivative of the regular expression $x^*(xx + y)^*$
with respect to $x$ is made of, one can take its partial derivative
and obtain the two singleton sets $\{x^* \cdot (xx + y)^*\}$ and $\{x \cdot (xx + y)^* \}$
from $\partial_x(x^*) \cdot (xx + y)^*$ and $\partial_x((xx + y)^*)$, respectively.
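
\noindent
The clauses above can be sketched in Scala as follows. This is only an
illustration on plain regular expressions and not part of our formalisation:
the datatype with a binary \texttt{ALT} constructor and the names
\texttt{PderSketch}, \texttt{pder}, \texttt{pders} and \texttt{seqAll} are
assumptions made for this sketch, and \texttt{seqAll} additionally drops a
leading $\ONE$ so that the results match the example above.
\begin{lstlisting}
object PderSketch {
  //plain (un-annotated) regular expressions, hypothetical definition
  sealed trait Rexp
  case object ZERO extends Rexp
  case object ONE extends Rexp
  case class CHAR(c: Char) extends Rexp
  case class ALT(r1: Rexp, r2: Rexp) extends Rexp
  case class SEQ(r1: Rexp, r2: Rexp) extends Rexp
  case class STAR(r: Rexp) extends Rexp

  def nullable(r: Rexp): Boolean = r match {
    case ZERO => false
    case ONE => true
    case CHAR(_) => false
    case ALT(r1, r2) => nullable(r1) || nullable(r2)
    case SEQ(r1, r2) => nullable(r1) && nullable(r2)
    case STAR(_) => true
  }

  //the "rs . r" shorthand: append r to every element of rs,
  //simplifying 1 . r to r on the way
  def seqAll(rs: Set[Rexp], r: Rexp): Set[Rexp] = rs.map {
    case ONE => r
    case r1 => SEQ(r1, r)
  }

  //partial derivative w.r.t. a single character
  def pder(x: Char, r: Rexp): Set[Rexp] = r match {
    case ZERO => Set()
    case ONE => Set()
    case CHAR(c) => if (x == c) Set(ONE) else Set()
    case ALT(r1, r2) => pder(x, r1) ++ pder(x, r2)
    case SEQ(r1, r2) =>
      if (nullable(r1)) seqAll(pder(x, r1), r2) ++ pder(x, r2)
      else seqAll(pder(x, r1), r2)
    case STAR(r1) => seqAll(pder(x, r1), STAR(r1))
  }

  //partial derivative w.r.t. a string, one character at a time
  def pders(s: List[Char], r: Rexp): Set[Rexp] = s match {
    case Nil => Set(r)
    case c :: cs => pder(c, r).flatMap(pders(cs, _))
  }
}
\end{lstlisting}
\noindent
With this sketch, calling \texttt{pder('x', r)} on the encoding of
$x^*(xx + y)^*$ returns a set with exactly the two terms
$x^* \cdot (xx + y)^*$ and $x \cdot (xx + y)^*$ from the example.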

The set of all possible partial derivatives is defined
as the union of the partial derivatives w.r.t.\ all strings:
\begin{center}
\begin{tabular}{lcl}
$\textit{PDER}_{UNIV} \; r $ & $\dn $ & $\bigcup_{w \in A^*}\partial_w \; r$
\end{tabular}
\end{center}
\noindent
Returning to our example
\begin{center}
$((a^* + (aa)^* + \ldots + (\underbrace{a\ldots a}_{\text{$n$ a's}})^* )^*)^*$,
\end{center}
if we denote this regular expression by $A$,
we have that
\begin{center}
$\textit{PDER}_{UNIV} \; A =
\bigcup_{i=1}^{n}\bigcup_{j=0}^{i-1} \{
(\underbrace{a \ldots a}_{\text{$j$ a's}}\cdot
(\underbrace{a \ldots a}_{\text{$i$ a's}})^*)\cdot A \}$,
\end{center}
with exactly $n * (n + 1) / 2$ terms.
This is in line with our speculation that only $n*(n+1)/2$ terms are
needed. We conjecture that $\bsimpStrong$ is also able to achieve this
upper limit in general:
\begin{conjecture}\label{bsimpStrongInclusionPder}
Using a suitable transformation $f$, we have
\begin{center}
$\forall s.\; f \; (\bdersStrong{r}{s}) \subseteq
\textit{PDER}_{UNIV} \; r$
\end{center}
\end{conjecture}
\noindent
We expect this to hold because rule \ref{cubicRule} keeps only one copy of each term,
with the function $\textit{prune}$ maintaining
a set-like structure similar to that of partial derivatives.
We anticipate that $\textit{prune}$ might need to be adjusted
slightly to make sure that all duplicate terms are eliminated,
which should be doable.
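
\noindent
As a quick sanity check of the term count stated for
$\textit{PDER}_{UNIV} \; A$ (a worked instance, assuming that the terms for
different $i$ and $j$ are pairwise distinct): each $i$ contributes $i$ choices
of $j$, so
\begin{center}
$\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$,
\quad which for $n = 5$ gives $\frac{5 \cdot 6}{2} = 15$ terms.
\end{center}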

Antimirov proved that the sum of the sizes of all partial derivative
terms is bounded by the cube of the size of the regular
expression:
\begin{property}\label{pderBound}
$\llbracket \textit{PDER}_{UNIV} \; r \rrbracket = O(\llbracket r \rrbracket^3)$
\end{property}
\noindent
This property was formalised by Urban; the details are in the file PDERIVS.thy
in our repository.
Once Conjecture \ref{bsimpStrongInclusionPder} is proven, Property \ref{pderBound}
would yield a cubic bound for our $\blexerStrong$ algorithm:
\begin{conjecture}\label{strongCubic}
$\llbracket \bdersStrong{r}{s} \rrbracket = O(\llbracket r \rrbracket^3)$
\end{conjecture}
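
\noindent
The intended argument can be sketched as the following chain. It relies on an
additional assumption that we introduce here purely for illustration, namely
that the transformation $f$ changes the size by at most a constant factor $c$,
where the size of a set is the sum of the sizes of its elements:
\begin{center}
\begin{tabular}{lcl}
$\llbracket \bdersStrong{r}{s} \rrbracket$ & $\leq$ & $c \cdot \llbracket f \; (\bdersStrong{r}{s}) \rrbracket$\\
& $\leq$ & $c \cdot \llbracket \textit{PDER}_{UNIV} \; r \rrbracket$\\
& $=$ & $O(\llbracket r \rrbracket^3)$
\end{tabular}
\end{center}
\noindent
The second step would follow from Conjecture \ref{bsimpStrongInclusionPder}
and the third from Property \ref{pderBound}.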


%To get all the "atomic" components of a regular expression's possible derivatives,
%there is a procedure Antimirov called $\textit{lf}$, short for "linear forms", that takes
%whatever character is available at the head of the string inside the language of a
%regular expression, and gives back the character and the derivative regular expression
%as a pair (which he called "monomial"):
% \begin{center}
% \begin{tabular}{ccc}
% $\lf(\ONE)$ & $=$ & $\phi$\\
%$\lf(c)$ & $=$ & $\{(c, \ONE) \}$\\
% $\lf(a+b)$ & $=$ & $\lf(a) \cup \lf(b)$\\
% $\lf(r^*)$ & $=$ & $\lf(r) \bigodot \lf(r^*)$\\
%\end{tabular}
%\end{center}
%%TODO: completion
%
%There is a slight difference in the last three clauses compared
%with $\partial$: instead of a dot operator $ \textit{rset} \cdot r$ that attaches the regular
%expression $r$ with every element inside $\textit{rset}$ to create a set of
%sequence derivatives, it uses the "circle dot" operator $\bigodot$ which operates
%on a set of monomials (which Antimirov called "linear form") and a regular
%expression, and returns a linear form:
% \begin{center}
% \begin{tabular}{ccc}
% $l \bigodot (\ZERO)$ & $=$ & $\phi$\\
% $l \bigodot (\ONE)$ & $=$ & $l$\\
% $\phi \bigodot t$ & $=$ & $\phi$\\
% $\{ (x, \ZERO) \} \bigodot t$ & $=$ & $\{(x,\ZERO) \}$\\
% $\{ (x, \ONE) \} \bigodot t$ & $=$ & $\{(x,t) \}$\\
% $\{ (x, p) \} \bigodot t$ & $=$ & $\{(x,p\cdot t) \}$\\
% $\lf(a+b)$ & $=$ & $\lf(a) \cup \lf(b)$\\
% $\lf(r^*)$ & $=$ & $\lf(r) \cdot \lf(r^*)$\\
%\end{tabular}
%\end{center}
%%TODO: completion
%
% Some degree of simplification is applied when doing $\bigodot$, for example,
% $l \bigodot (\ZERO) = \phi$ corresponds to $r \cdot \ZERO \rightsquigarrow \ZERO$,
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	573	% and $l \bigodot (\ONE) = l$ to $l \cdot \ONE \rightsquigarrow l$, and
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	574	% $\{ (x, \ZERO) \} \bigodot t = \{(x,\ZERO) \}$ to $\ZERO \cdot x \rightsquigarrow \ZERO$,
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	575	% and so on.
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	576	%
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	577	% With the function $\lf$ one can compute all possible partial derivatives $\partial_{UNIV}(r)$ of a regular expression $r$ with
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	578	% an iterative procedure:
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	579	% \begin{center}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	580	% \begin{tabular}{llll}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	581	%$\textit{while}$ & $(\Delta_i \neq \phi)$ & & \\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	582	% & $\Delta_{i+1}$ & $ =$ & $\lf(\Delta_i) - \PD_i$ \\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	583	% & $\PD_{i+1}$ & $ =$ & $\Delta_{i+1} \cup \PD_i$ \\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	584	%$\partial_{UNIV}(r)$ & $=$ & $\PD$ &
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	585	%\end{tabular}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	586	%\end{center}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	587	%
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	588	%
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	589	% $(r_1 + r_2) \cdot r_3 \longrightarrow (r_1 \cdot r_3) + (r_2 \cdot r_3)$,
591 b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	590
b2d0de6aee18 more polishing integrated comments chap2 Chengsong parents: 590 diff changeset	591
532 cc54ce075db5 restructured Chengsong parents: diff changeset	592
cc54ce075db5 restructured Chengsong parents: diff changeset	593
cc54ce075db5 restructured Chengsong parents: diff changeset	594	%----------------------------------------------------------------------------------------
cc54ce075db5 restructured Chengsong parents: diff changeset	595	% SECTION 2
cc54ce075db5 restructured Chengsong parents: diff changeset	596	%----------------------------------------------------------------------------------------
cc54ce075db5 restructured Chengsong parents: diff changeset	597
cc54ce075db5 restructured Chengsong parents: diff changeset	598	\section{Bounded Repetitions}
In chapter \ref{Introduction} we promised
that our lexing algorithm can be extended
to handle bounded repetitions
in a natural and elegant way.
We now fulfil this promise by adding support for
the ``exactly-$n$-times'' bounded regular expression $r^{\{n\}}$.
To do so, we add clauses to the derivative-based lexing algorithms (with simplifications)
introduced in chapter \ref{Bitcoded2}.
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	607
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	608	\subsection{Augmented Definitions}
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	609	There are a number of definitions that need to be augmented.
612 8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	610	The most notable one would be the POSIX rules for $r^{\{n\}}$:
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	611	\begin{center}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	612	\begin{mathpar}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	613	\inferrule{\forall v \in vs_1. \vdash v:r \land
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	614	\|v\| \neq []\\ \forall v \in vs_2. \vdash v:r \land \|v\| = []\\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	615	\textit{length} \; (vs_1 @ vs_2) = n}{\textit{Stars} \;
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	616	(vs_1 @ vs_2) : r^{\{n\}} }
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	617	\end{mathpar}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	618	\end{center}
As Ausaf pointed out~\cite{Ausaf},
empty iterations sometimes have to be taken to obtain
a match with exactly $n$ repetitions,
hence the $vs_2$ part of the rule.
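
For instance (a hypothetical example chosen only for illustration, writing $\textit{Char}$ for the character value constructor), take $r = a^*$, $n = 2$ and the input string $aa$.
One value satisfying the premises of the rule is
\begin{center}
$\Stars \; [\Stars \; [\textit{Char} \; a, \; \textit{Char} \; a], \; \Stars \; []]$
\end{center}
Here $vs_1 = [\Stars \; [\textit{Char} \; a, \; \textit{Char} \; a]]$ flattens to a non-empty string,
$vs_2 = [\Stars \; []]$ flattens to the empty string, and $\textit{length} \; (vs_1 @ vs_2) = 2$.
The second iteration has to be empty because the first one has already consumed the whole input.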
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	623
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	624	Another important definition would be the size:
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	625	\begin{center}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	626	\begin{tabular}{lcl}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	627	$\llbracket r^{\{n\}} \rrbracket_r$ & $\dn$ &
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	628	$\llbracket r \rrbracket_r + n$\\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	629	\end{tabular}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	630	\end{center}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	631	\noindent
Arguably we should use $\log \; n$ for the size because
the number of digits increases logarithmically with respect to $n$.
For simplicity we choose to add the counter directly to the size.
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	635
The derivative of a bounded regular expression
with respect to a character $c$ is given as
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	638	\begin{center}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	639	\begin{tabular}{lcl}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	640	$r^{\{n\}} \backslash_r c$ & $\dn$ &
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	641	$r\backslash_r c \cdot r^{\{n-1\}} \;\; \textit{if} \; n \geq 1$\\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	642	& & $\RZERO \;\quad \quad\quad \quad
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	643	\textit{otherwise}$\\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	644	\end{tabular}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	645	\end{center}
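
As a quick sanity check of this clause (an illustrative instance, assuming the usual character clause so that $a \backslash_r a$ is $\ONE$):
\begin{center}
\begin{tabular}{lcl}
$a^{\{2\}} \backslash_r a$ & $=$ & $(a \backslash_r a) \cdot a^{\{1\}} \;=\; \ONE \cdot a^{\{1\}}$\\
$a^{\{0\}} \backslash_r a$ & $=$ & $\RZERO$\\
\end{tabular}
\end{center}
The counter decreases by one each time a new iteration is started, and no character can be consumed once it has reached $0$.
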
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	646	\noindent
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	647	For brevity, we sometimes use NTIMES to refer to bounded
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	648	regular expressions.
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	649	The $\mkeps$ function clause for NTIMES would be
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	650	\begin{center}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	651	\begin{tabular}{lcl}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	652	$\mkeps \; r^{\{n\}} $ & $\dn$ & $\Stars \;
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	653	(\textit{replicate} \; n\; (\mkeps \; r))$
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	654	\end{tabular}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	655	\end{center}
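
For example, assuming the usual clause $\mkeps \; (r^*) = \Stars \; []$ for stars, a nullable body such as $a^*$ gives
\begin{center}
$\mkeps \; (a^*)^{\{2\}} = \Stars \; (\textit{replicate} \; 2 \; (\Stars \; [])) = \Stars \; [\Stars \; [], \; \Stars \; []]$
\end{center}
that is, each of the $n$ iterations matches the empty string.
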
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	656	\noindent
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	657	The injection looks like
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	658	\begin{center}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	659	\begin{tabular}{lcl}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	660	$\inj \; r^{\{n\}} \; c\; (\Seq \;v \; (\Stars \; vs)) $ &
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	661	$\dn$ & $\Stars \;
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	662	((\inj \; r \;c \;v ) :: vs)$
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	663	\end{tabular}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	664	\end{center}
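
To illustrate, suppose the usual character clause $\inj \; a \; a \; \textit{Empty} = \textit{Char} \; a$ holds (the constructor names $\textit{Empty}$ and $\textit{Char}$ are assumed here).
Injecting $a$ back into a value for $a^{\{2\}} \backslash a = (a \backslash a) \cdot a^{\{1\}}$ then yields
\begin{center}
\begin{tabular}{lcl}
$\inj \; a^{\{2\}} \; a \; (\Seq \; \textit{Empty} \; (\Stars \; [\textit{Char} \; a]))$
& $=$ & $\Stars \; ((\inj \; a \; a \; \textit{Empty}) :: [\textit{Char} \; a])$\\
& $=$ & $\Stars \; [\textit{Char} \; a, \; \textit{Char} \; a]$\\
\end{tabular}
\end{center}
so the injected character becomes part of the first iteration, and the resulting value again has exactly two iterations.
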
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	665	\noindent
612 8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	666
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	667
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	668	\subsection{Proofs for the Augmented Lexing Algorithm}
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	669	We need to maintain two proofs with the additional $r^{\{n\}}$
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	670	construct: the
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	671	correctness proof in chapter \ref{Bitcoded2},
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	672	and the finiteness proof in chapter \ref{Finite}.
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	673
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	674	\subsubsection{Correctness Proof Augmentation}
The correctness of $\textit{lexer}$ and $\textit{blexer}$ with bounded repetitions
has been proven by Ausaf and Urban~\cite{AusafDyckhoffUrban2016}.
As they commented, once the definitions are in place,
the proofs given for the basic regular expressions extend to
bounded regular expressions, and there are no ``surprises''.
We confirm this point: the correctness theorem also
extends without surprise to $\blexersimp$.
The rewrite rules such as $\rightsquigarrow$, $\stackrel{s}{\rightsquigarrow}$ and so on
do not need to be changed,
and only a few lemmas such as lemma \ref{fltsPreserves} need
one additional line, which can be discharged by sledgehammer,
to cover the $r^{\{n\}}$ inductive case.
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	687
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	688
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	689	\subsubsection{Finiteness Proof Augmentation}
Bounded repetitions are
very similar to stars, and therefore their treatment
is similar, with minor changes to handle the slight complications
that arise when the counter reaches 0.
The exponential growth is similar:
\begin{center}
\begin{tabular}{ll}
$r^{\{n\}} $ & $\longrightarrow_{\backslash c}$\\
$(r\backslash c) \cdot
r^{\{n - 1\}}$ & $\longrightarrow_{\backslash c'}$\\
\\
$r \backslash cc' \cdot r^{\{n - 1\}} +
r \backslash c' \cdot r^{\{n - 2\}}$ &
$\longrightarrow_{\backslash c''}$\\
\\
$(r \backslash cc'c'' \cdot r^{\{n-1\}} +
r \backslash c''\cdot r^{\{n-2\}}) +
(r \backslash c'c'' \cdot r^{\{n-2\}} +
r \backslash c'' \cdot r^{\{n-3\}})$ &
$\longrightarrow_{\backslash c'''}$ \\
\\
$\ldots$\\
\end{tabular}
\end{center}
Again, we assume that $r\backslash c$, $r \backslash cc'$ and so on
are all nullable.
The flattened list of terms for $r^{\{n\}} \backslash_{rs} s$
\begin{center}
$[r \backslash cc'c'' \cdot r^{\{n-1\}},\;
r \backslash c''\cdot r^{\{n-2\}}, \;
r \backslash c'c'' \cdot r^{\{n-2\}}, \;
r \backslash c'' \cdot r^{\{n-3\}},\; \ldots ]$
\end{center}
that comes from
\begin{center}
$(r \backslash cc'c'' \cdot r^{\{n-1\}} +
r \backslash c''\cdot r^{\{n-2\}}) +
(r \backslash c'c'' \cdot r^{\{n-2\}} +
r \backslash c'' \cdot r^{\{n-3\}})+ \ldots$
\end{center}
is made of sequences with different tails, where the counters
might differ.
The key observation for maintaining the bound is that
these counters never exceed $n$, the original
counter. Since only finitely many counter values can occur,
$\rDistinct$ will deduplicate the terms and keep the list finite.
We state this idea as a lemma once we have described all
the necessary helper functions.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	738
Similar to the star case, we want
\begin{center}
$\rderssimp{r^{\{n\}}}{s} = \rsimp{\sum rs}$,
\end{center}
where $rs$
is of the form
$\map \; f \; Ss$, with $f$ a function and
$Ss$ a list of objects to act on.
For star, the object's datatype is string.
The list of strings $Ss$
is generated using the functions
$\starupdate$ and $\starupdates$.
The function that takes a string and returns a regular expression
is the anonymous function $
(\lambda s'. \; r\backslash s' \cdot r^{*})$.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	754	In the NTIMES setting,
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	755	the $\starupdate$ and $\starupdates$ functions are replaced by
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	756	$\textit{nupdate}$ and $\textit{nupdates}$:
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	757	\begin{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	758	\begin{tabular}{lcl}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	759	$\nupdate \; c \; r \; [] $ & $\dn$ & $[]$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	760	$\nupdate \; c \; r \;
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	761	(\Some \; (s, \; n + 1) \; :: \; Ss)$ & $\dn$ & %\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	762	$\textit{if} \;
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	763	(\rnullable \; (r \backslash_{rs} s))$ \\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	764	& & $\;\;\textit{then}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	765	\;\; \Some \; (s @ [c], n + 1) :: \Some \; ([c], n) :: (
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	766	\nupdate \; c \; r \; Ss)$ \\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	767	& & $\textit{else} \;\; \Some \; (s @ [c], n+1) :: (
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	768	\nupdate \; c \; r \; Ss)$\\
$\nupdate \; c \; r \; (\None :: Ss)$ & $\dn$ &
$(\None :: \nupdate \; c \; r \; Ss)$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	771	& & \\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	772	%\end{tabular}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	773	%\end{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	774	%\begin{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	775	%\begin{tabular}{lcl}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	776	$\nupdates \; [] \; r \; Ss$ & $\dn$ & $Ss$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	777	$\nupdates \; (c :: cs) \; r \; Ss$ & $\dn$ & $\nupdates \; cs \; r \; (
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	778	\nupdate \; c \; r \; Ss)$
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	779	\end{tabular}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	780	\end{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	781	\noindent
which take into account the case where the counter $n$ of a subterm
\begin{center}
$r \backslash_{rs} s \cdot r^{\{n\}}$
\end{center}
is 0. When $r \backslash_{rs} s$ is nullable, such a subterm expands to
\begin{center}
$r \backslash_{rs} (s@[c]) \cdot r^{\{n\}} \;+
\; \ZERO$
\end{center}
after taking a derivative with respect to $c$.
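
As an illustration of individual $\nupdate$ steps (a hypothetical instance with $r = a^*$, so that every $r \backslash_{rs} s$ below is nullable, and with strings written as character lists):
\begin{center}
\begin{tabular}{lcl}
$\nupdate \; a \; r \; [\Some \; ([a], 2)]$ & $=$ & $[\Some \; ([a,a], 2), \; \Some \; ([a], 1)]$\\
$\nupdate \; a \; r \; [\Some \; ([a,a], 2), \; \Some \; ([a], 1)]$ & $=$ &
$[\Some \; ([a,a,a], 2), \; \Some \; ([a], 1),$\\
& & $\;\Some \; ([a,a], 1), \; \Some \; ([a], 0)]$\\
\end{tabular}
\end{center}
Each entry keeps its counter when its string is extended, and spawns a fresh entry whose counter is one smaller.
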
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	793	The object now has type
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	794	\begin{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	795	$\textit{option} \;(\textit{string}, \textit{nat})$
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	796	\end{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	797	and therefore the function for converting such an option into
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	798	a regular expression term is called $\opterm$:
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	799
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	800	\begin{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	801	\begin{tabular}{lcl}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	802	$\opterm \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	803	& & $\;\;\Some \; (s, n) \Rightarrow
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	804	(r\backslash_{rs} s)\cdot r^{\{n\}}$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	805	& & $\;\;\None \Rightarrow
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	806	\ZERO$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	807	\end{tabular}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	808	\end{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	809	\noindent
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	810	Put together, the list $\map \; f \; Ss$ is instantiated as
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	811	\begin{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	812	$\map \; (\opterm \; r) \; (\nupdates \; s \; r \;
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	813	[\Some \; ([c], n)])$.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	814	\end{center}
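
For instance, with $s = [c']$, $n \geq 1$ and $r \backslash_{rs} [c]$ nullable (an illustrative instance), the definitions unfold to
\begin{center}
\begin{tabular}{lcl}
$\nupdates \; [c'] \; r \; [\Some \; ([c], n)]$ & $=$ & $[\Some \; ([c, c'], n), \; \Some \; ([c'], n - 1)]$\\
$\map \; (\opterm \; r) \; [\Some \; ([c, c'], n), \; \Some \; ([c'], n - 1)]$ & $=$ &
$[(r \backslash_{rs} [c, c']) \cdot r^{\{n\}}, \; (r \backslash_{rs} [c']) \cdot r^{\{n-1\}}]$\\
\end{tabular}
\end{center}
These are the two summands obtained by taking the derivative of $(r \backslash c) \cdot r^{\{n\}}$ with respect to $c'$.
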
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	815	For the closed form to be bounded, we would like
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	816	simplification to be applied to each term in the list.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	817	Therefore we introduce some variants of $\opterm$,
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	818	which help conveniently express the rewriting steps
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	819	needed in the closed form proof.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	820	\begin{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	821	\begin{tabular}{lcl}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	822	$\optermOsimp \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	823	& & $\;\;\Some \; (s, n) \Rightarrow
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	824	\textit{rsimp} \; ((r\backslash_{rs} s)\cdot r^{\{n\}})$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	825	& & $\;\;\None \Rightarrow
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	826	\ZERO$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	827	\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	828	$\optermosimp \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	829	& & $\;\;\Some \; (s, n) \Rightarrow
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	830	(\textit{rsimp} \; (r\backslash_{rs} s))
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	831	\cdot r^{\{n\}}$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	832	& & $\;\;\None \Rightarrow
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	833	\ZERO$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	834	\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	835	$\optermsimp \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	836	& & $\;\;\Some \; (s, n) \Rightarrow
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	837	(r\backslash_{rsimps} s)\cdot r^{\{n\}}$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	838	& & $\;\;\None \Rightarrow
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	839	\ZERO$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	840	\end{tabular}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	841	\end{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	842
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	843
For a list of
$\textit{option} \;(\textit{string}, \textit{nat})$ elements,
we define its highest power recursively:
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	847	\begin{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	848	\begin{tabular}{lcl}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	849	$\hpa \; [] \; n $ & $\dn$ & $n$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	850	$\hpa \; (\None :: os) \; n $ & $\dn$ & $\hpa \; os \; n$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	851	$\hpa \; (\Some \; (s, n) :: os) \; m$ & $\dn$ &
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	852	$\hpa \;os \; (\textit{max} \; n\; m)$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	853	\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	854	$\hpower \; rs $ & $\dn$ & $\hpa \; rs \; 0$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	855	\end{tabular}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	856	\end{center}
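
Unfolding these clauses on a small list (an illustrative instance) gives
\begin{center}
\begin{tabular}{lcl}
$\hpower \; [\Some \; ([a], 1), \; \None, \; \Some \; ([a,a], 2)]$
& $=$ & $\hpa \; [\Some \; ([a], 1), \; \None, \; \Some \; ([a,a], 2)] \; 0$\\
& $=$ & $\hpa \; [\None, \; \Some \; ([a,a], 2)] \; (\textit{max} \; 1 \; 0)$\\
& $=$ & $\hpa \; [\Some \; ([a,a], 2)] \; 1$\\
& $=$ & $\hpa \; [] \; (\textit{max} \; 2 \; 1)$\\
& $=$ & $2$\\
\end{tabular}
\end{center}
so $\None$ entries are skipped and the accumulator records the largest counter seen so far.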
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	857
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	858	Now the intuition that an NTIMES regular expression's power
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	859	does not increase can be easily expressed as
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	860	\begin{lemma}\label{nupdatesMono2}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	861	$\hpower \; (\nupdates \;s \; r \; [\Some \; ([c], n)]) \leq n$
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	862	\end{lemma}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	863	\begin{proof}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	864	Note that the power is non-increasing after a $\nupdate$ application:
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	865	\begin{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	866	$\hpa \;\; (\nupdate \; c \; r \; Ss)\;\; m \leq
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	867	\hpa\; \; Ss \; m$.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	868	\end{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	869	This is also the case for $\nupdates$:
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	870	\begin{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	871	$\hpa \;\; (\nupdates \; s \; r \; Ss)\;\; m \leq
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	872	\hpa\; \; Ss \; m$.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	873	\end{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	874	Therefore we have that
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	875	\begin{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	876	$\hpower \;\; (\nupdates \; s \; r \; Ss) \leq
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	877	\hpower \;\; Ss$
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	878	\end{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	879	which proves the lemma, since $\hpower \; [\Some \; ([c], n)] = \hpa \; [] \; (\textit{max} \; n \; 0) = n$.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	880
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	881	\end{proof}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	882
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	883
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	884	We also define the inductive rules for
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	885	the shape of derivatives of the NTIMES regular expressions:\\[-3em]
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	886	\begin{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	887	\begin{mathpar}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	888	\inferrule{\mbox{}}{\cbn \;\ZERO}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	889
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	890	\inferrule{\mbox{}}{\cbn \; \; r_a \cdot (r^{\{n\}})}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	891
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	892	\inferrule{\cbn \; r_1 \;\; \; \cbn \; r_2}{\cbn \; r_1 + r_2}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	893
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	894	\inferrule{\cbn \; r}{\cbn \; r + \ZERO}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	895	\end{mathpar}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	896	\end{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	897	\noindent
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	898	A derivative of NTIMES fits into the shape described by $\cbn$:
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	899	\begin{lemma}\label{ntimesDersCbn}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	900	$\cbn \; ((r' \cdot r^{\{n\}}) \backslash_{rs} s)$ holds.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	901	\end{lemma}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	902	\begin{proof}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	903	By a reverse induction on $s$.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	904	For the inductive case, note that if $\cbn \; r$ holds,
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	905	then $\cbn \; (r\backslash_r c)$ holds.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	906	\end{proof}
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	907	\noindent
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	908	In addition, for $\cbn$-shaped regular expressions, one can flatten
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	909	them:
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	910	\begin{lemma}\label{ntimesHfauPushin}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	911	If $\cbn \; r$ holds, then $\hflataux{r \backslash_r c} =
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	912	\textit{concat} \; (\map \; \hflataux{\_} \; (\map \; (\_\backslash_r c) \;
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	913	(\hflataux{r})))$.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	914	\end{lemma}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	915	\begin{proof}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	916	By an induction on the inductive cases of $\cbn$.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	917	\end{proof}
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	918	\noindent
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	919	This time we do not need to define the flattening functions for NTIMES only,
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	920	because $\hflat{\_}$ and $\hflataux{\_}$ work on NTIMES already.
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	921	\begin{lemma}\label{ntimesHfauInduct}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	922	$\hflataux{( (r\backslash_r c) \cdot r^{\{n\}}) \backslash_{rsimps} s} =
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	923	\map \; (\opterm \; r) \; (\nupdates \; s \; r \; [\Some \; ([c], n)])$
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	924	\end{lemma}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	925	\begin{proof}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	926	By a reverse induction on $s$.
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	927	The lemmas \ref{ntimesHfauPushin} and \ref{ntimesDersCbn} are used.
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	928	\end{proof}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	929	\noindent
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	930	We have a recursive property for NTIMES with $\nupdate$
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	931	similar to that for STAR,
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	932	and one for $\nupdates $ as well:
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	933	\begin{lemma}\label{nupdateInduct1}
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	934	\mbox{}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	935	\begin{itemize}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	936	\item
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	937	\begin{center}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	938	$\textit{concat} \; (\map \; (\hflataux{\_} \circ (
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	939	\opterm \; r)) \; Ss) = \map \; (\opterm \; r) \; (\nupdate \;
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	940	c \; r \; Ss)$\\
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	941	\end{center}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	942	holds.
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	943	\item
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	944	\begin{center}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	945	$\textit{concat} \; (\map \; \hflataux{\_}\;
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	946	\map \; (\_\backslash_r x) \;
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	947	(\map \; (\opterm \; r) \; (\nupdates \; xs \; r \; Ss)))$\\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	948	$=$\\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	949	$\map \; (\opterm \; r) \; (\nupdates \;(xs@[x]) \; r\;Ss)$
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	950	\end{center}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	951	holds.
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	952	\end{itemize}
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	953	\end{lemma}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	954	\begin{proof}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	955	(i) is by an induction on $Ss$.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	956	(ii) is by an induction on $xs$.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	957	\end{proof}
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	958	\noindent
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	959	The $\nString$ predicate is defined for conveniently
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	960	expressing that there are no empty strings in the
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	961	$\Some \;(s, n)$ elements generated by $\nupdate$:
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	962	\begin{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	963	\begin{tabular}{lcl}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	964	$\nString \; \None$ & $\dn$ & $ \textit{true}$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	965	$\nString \; (\Some \; ([], n))$ & $\dn$ & $ \textit{false}$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	966	$\nString \; (\Some \; (c::s, n))$ & $\dn$ & $ \textit{true}$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	967	\end{tabular}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	968	\end{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	969	\begin{lemma}\label{nupdatesNonempty}
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	970	If for all elements $o \in \textit{set} \; Ss$,
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	971	$\nString \; o$ holds, then we have that
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	972	for all elements $o' \in \textit{set} \; (\nupdates \; s \; r \; Ss)$,
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	973	$\nString \; o'$ holds.
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	974	\end{lemma}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	975	\begin{proof}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	976	By an induction on $s$, where $Ss$ is set to vary over all possible values.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	977	\end{proof}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	978
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	979	\noindent
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	980
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	981	\begin{lemma}\label{ntimesClosedFormsSteps}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	982	The following list of equalities or rewriting relations hold:\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	983	(i) $r^{\{n+1\}} \backslash_{rsimps} (c::s) =
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	984	\textit{rsimp} \; (\sum (\map \; (\opterm \;r \;\_) \; (\nupdates \;
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	985	s \; r \; [\Some \; ([c], n)])))$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	986	(ii)
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	987	\begin{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	988	$\sum (\map \; (\opterm \; r) \; (\nupdates \; s \; r \; [
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	989	\Some \; ([c], n)]))$ \\ $ \sequal$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	990	$\sum (\map \; (\textit{rsimp} \circ (\opterm \; r))\; (\nupdates \;
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	991	s\;r \; [\Some \; ([c], n)]))$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	992	\end{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	993	(iii)
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	994	\begin{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	995	$\sum \;(\map \; (\optermosimp \; r) \; (\nupdates \; s \; r\; [\Some \;
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	996	([c], n)]))$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	997	$\sequal$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	998	$\sum \;(\map \; (\optermsimp r) \; (\nupdates \; s \; r \; [\Some \;
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	999	([c], n)])) $\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1000	\end{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1001	(iv)
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1002	\begin{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1003	$\sum \;(\map \; (\optermosimp \; r) \; (\nupdates \; s \; r\; [\Some \;
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1004	([c], n)])) $ \\ $\sequal$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1005	$\sum \;(\map \; (\optermOsimp r) \; (\nupdates \; s \; r \; [\Some \;
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1006	([c], n)])) $\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1007	\end{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1008	(v)
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1009	\begin{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1010	$\sum \;(\map \; (\optermOsimp r) \; (\nupdates \; s \; r \; [\Some \;
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1011	([c], n)])) $ \\ $\sequal$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1012	$\sum \; (\map \; (\textit{rsimp} \circ (\opterm \; r)) \;
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1013	(\nupdates \; s \; r \; [\Some \; ([c], n)]))$
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1014	\end{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1015	\end{lemma}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1016	\begin{proof}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1017	Routine.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1018	(iii) and (iv) make use of the fact that all the strings $s$
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1019	inside $\Some \; (s, m)$ which are elements of the list
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1020	$\nupdates \; s\;r\;[\Some\; ([c], n)]$ are non-empty,
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1021	which is from lemma \ref{nupdatesNonempty}.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1022	Once the string in $o = \Some \; (s, n)$ is
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1023	nonempty, $\optermsimp \; r \;o$,
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1024	$\optermosimp \; r \; o$ and $\optermOsimp \; r \; o$ are guaranteed
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1025	to be equal.
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1026	(v) uses lemma \ref{nupdateInduct1}.
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1027	\end{proof}
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1028	\noindent
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1029	Now we are ready to present the closed form for NTIMES:
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1030	\begin{theorem}\label{ntimesClosedForm}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1031	The derivative of $r^{\{n+1\}}$ can be described as an alternative
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1032	containing a list
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1033	of terms:\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1034	$r^{\{n+1\}} \backslash_{rsimps} (c::s) = \textit{rsimp} \; (
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1035	\sum (\map \; (\optermsimp \; r) \; (\nupdates \; s \; r \;
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1036	[\Some \; ([c], n)])))$
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1037	\end{theorem}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1038	\begin{proof}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1039	By the rewriting steps described in lemma \ref{ntimesClosedFormsSteps}.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1040	\end{proof}
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1041	\noindent
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1042	The key observation for bounding this closed form
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1043	is that the counter on $r^{\{n\}}$ will
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1044	only decrement during derivatives:
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1045	\begin{lemma}\label{nupdatesNLeqN}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1046	For an element $o$ in $\textit{set} \; (\nupdates \; s \; r \;
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1047	[\Some \; ([c], n)])$, either $o = \None$, or $o = \Some
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1048	\; (s', m)$ for some string $s'$ and number $m \leq n$.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1049	\end{lemma}
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1050	\noindent
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1051	The proof is routine and therefore omitted.
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1052	This allows us to characterise the terms
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1053	in $\textit{set} \; (\map \; (\optermsimp \; r) \; (
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1054	\nupdates \; s \; r \; [\Some \; ([c], n)]))$:
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1055	each is either $\ZERO_r$ or a sequence whose tail is $r^{\{m\}}$
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1056	for some $m \leq n$:
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1057	\begin{lemma}\label{ntimesClosedFormListElemShape}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1058	For any element $r'$ in $\textit{set} \; (\map \; (\optermsimp \; r) \; (
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1059	\nupdates \; s \; r \; [\Some \; ([c], n)]))$,
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1060	we have that $r'$ is either $\ZERO$ or $r \backslash_{rsimps} s' \cdot
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1061	r^{\{m\}}$ for some string $s'$ and number $m \leq n$.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1062	\end{lemma}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1063	\begin{proof}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1064	Using lemma \ref{nupdatesNLeqN}.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1065	\end{proof}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1066
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1067	\begin{theorem}\label{ntimesClosedFormBounded}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1068	Assuming that for any string $s$, $\llbracket r \backslash_{rsimps} s
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1069	\rrbracket_r \leq N$ holds, then we have that\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1070	$\llbracket r^{\{n+1\}} \backslash_{rsimps} s \rrbracket_r \leq
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1071	\textit{max} \; (c_N+1)* (N + \llbracket r^{\{n\}} \rrbracket_r+1)$,
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1072	where $c_N = \textit{card} \; (\textit{sizeNregex} \; (
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1073	N + \llbracket r^{\{n\}} \rrbracket_r+1))$.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1074	\end{theorem}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1075	\begin{proof}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1076	We have that for all regular expressions $r'$ in $\textit{set} \; (\map \; (\optermsimp \; r) \; (
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1077	\nupdates \; s \; r \; [\Some \; ([c], n)]))$,
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1078	$r'$'s size is less than or equal to $N + \llbracket r^{\{n\}}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1079	\rrbracket_r + 1$
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1080	because $r'$ can only be either a $\ZERO$ or $r \backslash_{rsimps} s' \cdot
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1081	r^{\{m\}}$ for some string $s'$ and number
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1082	$m \leq n$ (lemma \ref{ntimesClosedFormListElemShape}).
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1083	In addition, we know that the list
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1084	$\map \; (\optermsimp \; r) \; (
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1085	\nupdates \; s \; r \; [\Some \; ([c], n)])$'s size is at most
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1086	$c_N = \textit{card} \;
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1087	(\sizeNregex \; ((N + \llbracket r^{\{n\}} \rrbracket_r) + 1))$.
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1088	Together, these two facts give the claimed bound on
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1089	$\llbracket r^{\{n+1\}} \backslash_{rsimps} \;s \rrbracket_r$.
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1090	\end{proof}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1091
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1092	We aim to formalise the correctness and size bound
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1093	for constructs like $r^{\{\ldots n\}}$, $r^{\{n \ldots\}}$
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1094	and so on, which is still work in progress.
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1095	They should more or less follow the same recipe described in this section.
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1096	Once we know how to deal with them recursively using suitable auxiliary
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1097	definitions, we are able to routinely establish the proofs.
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1098
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1099
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1100	%The closed form for them looks like:
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1101	%%\begin{center}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1102	%% \begin{tabular}{llrclll}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1103	%% $r^{\{n+1\}}$ & $ \backslash_{rsimps}$ & $(c::s)$ & $=$ & & \\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1104	%% $\textit{rsimp}$ & $($ & $
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1105	%% \sum \; ( $ & $\map$ & $(\textit{optermsimp}\;r)$ & $($\\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1106	%% & & & & $\textit{nupdates} \;$ &
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1107	%% $ s \; r_0 \; [ \textit{Some} \; ([c], n)]$\\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1108	%% & & & & $)$ &\\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1109	%% & & $)$ & & &\\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1110	%% & $)$ & & & &\\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1111	%% \end{tabular}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1112	%%\end{center}
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1113	%\begin{center}
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1114	% \begin{tabular}{llrcllrllll}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1115	% $r^{\{n+1\}}$ & $ \backslash_{rsimps}$ & $(c::s)$ & $=$ & & &&&&\\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1116	% &&&&$\textit{rsimp}$ & $($ & $
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1117	% \sum \; ( $ & $\map$ & $(\textit{optermsimp}\;r)$ & $($\\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1118	% &&&& & & & & $\;\; \textit{nupdates} \;$ &
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1119	% $ s \; r_0 \; [ \textit{Some} \; ([c], n)]$\\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1120	% &&&& & & & & $)$ &\\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1121	% &&&& & & $)$ & & &\\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1122	% &&&& & $)$ & & & &\\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1123	% \end{tabular}
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1124	%\end{center}
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1125	%The $\textit{optermsimp}$ function with the argument $r$
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1126	%chooses from two options: $\ZERO$ or
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1127	%We define for the $r^{\{n\}}$ constructor something similar to $\starupdate$
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1128	%and $\starupdates$:
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1129	%\begin{center}
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1130	% \begin{tabular}{lcl}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1131	% $\starupdate \; c \; r \; [] $ & $\dn$ & $[]$\\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1132	% $\starupdate \; c \; r \; (s :: Ss)$ & $\dn$ & \\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1133	% & & $\textit{if} \;
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1134	% (\rnullable \; (\rders \; r \; s))$ \\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1135	% & & $\textit{then} \;\; (s @ [c]) :: [c] :: (
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1136	% \starupdate \; c \; r \; Ss)$ \\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1137	% & & $\textit{else} \;\; (s @ [c]) :: (
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1138	% \starupdate \; c \; r \; Ss)$
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1139	% \end{tabular}
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1140	%\end{center}
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1141	%\noindent
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1142	%As a generalisation from characters to strings,
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1143	%$\starupdates$ takes a string instead of a character
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1144	%as the first input argument, and is otherwise the same
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1145	%as $\starupdate$.
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1146	%\begin{center}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1147	% \begin{tabular}{lcl}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1148	% $\starupdates \; [] \; r \; Ss$ & $=$ & $Ss$\\
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1149	% $\starupdates \; (c :: cs) \; r \; Ss$ & $=$ & $\starupdates \; cs \; r \; (
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1150	% \starupdate \; c \; r \; Ss)$
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1151	% \end{tabular}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1152	%\end{center}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1153	%\noindent
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1154
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1155
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1156
621 17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1157	%\section{Zippers}
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1158	%Zipper is a data structure designed to operate on
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1159	%and navigate between local parts of a tree.
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1160	%It was first formally described by Huet \cite{HuetZipper}.
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1161	%Typical applications of zippers involve text editor buffers
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1162	%and proof system databases.
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1163	%In our setting, the idea is to compactify the representation
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1164	%of derivatives with zippers, thereby making our algorithm faster.
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1165	%Some initial results
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1166	%We first give a brief introduction to what zippers are,
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1167	%and other works
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1168	%that apply zippers to derivatives
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1169	%When dealing with large trees, it would be a waste to
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1170	%traverse the entire tree if
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1171	%the operation only
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1172	%involves a small fraction of it.
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1173	%The idea is to put the focus on that subtree, turning other parts
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1174	%of the tree into a context
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1175	%
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1176	%
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1177	%One observation about our derivative-based lexing algorithm is that
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1178	%the derivative operation sometimes traverses the entire regular expression
17c7611fb0a9 chap6 Chengsong parents: 620 diff changeset	1179	%unnecessarily:
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1180
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1181
612 8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1182	%----------------------------------------------------------------------------------------
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1183	% SECTION 1
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1184	%----------------------------------------------------------------------------------------
532 cc54ce075db5 restructured Chengsong parents: diff changeset	1185
612 8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1186	%\section{Adding Support for the Negation Construct, and its Correctness Proof}
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1187	%We now add support for the negation regular expression:
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1188	%\[ r ::= \ZERO \mid \ONE
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1189	% \mid c
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1190	% \mid r_1 \cdot r_2
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1191	% \mid r_1 + r_2
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1192	% \mid r^*
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1193	% \mid \sim r
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1194	%\]
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1195	%The $\textit{nullable}$ function's clause for it would be
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1196	%\[
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1197	%\nullable(\sim r) = \neg \nullable(r)
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1198	%\]
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1199	%The derivative would be
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1200	%\[
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1201	%(\sim r) \backslash c = \sim (r \backslash c)
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1202	%\]
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1203	%
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1204	%The trickiest part of lexing with the $\sim r$ regular expression
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1205	% is creating a value for it.
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1206	% For other regular expressions, the value aligns with the
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1207	% structure of the regular expression:
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1208	% \[
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1209	% \vdash \Seq(\Char(a), \Char(b)) : a \cdot b
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1210	% \]
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1211	%But for the $\sim r$ regular expression, $s$ is in its language if and only if
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1212	%$s$ does not belong to $L(r)$.
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1213	%That means when there
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1214	%is a match for the negated regular expression, it is not possible to describe how the string $s$ matched
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1215	%$r$, since it did not match.
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1216	%What we can do is preserve the information of how $s$ was not matched by $r$,
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1217	%and there are a number of options to do this.
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1218	%
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1219	%We could give a partial value when there is a partial match for the regular expression inside
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1220	%the $\mathbf{not}$ construct.
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1221	%For example, the string $ab$ is not in the language of $(a\cdot b) \cdot c$,
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1222	%so a value for it could be
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1223	% \[
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1224	% \vdash \textit{Not}(\Seq(\Char(a), \Char(b))) : \sim((a \cdot b) \cdot c)
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1225	% \]
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1226	% The above example demonstrates what value to construct
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1227	% when the string $s$ is a proper prefix
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1228	% of some string in $L(r)$. When $s$ is instead not a prefix of any string
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1229	% in $L(r)$, it becomes unclear what to return as a value inside the $\textit{Not}$
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1230	% constructor.
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1231	%
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1232	% Another option would be to store either the string $s$ that resulted in
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1233	% a mismatch for $r$ or a dummy value as a placeholder:
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1234	% \[
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1235	% \vdash \textit{Not}(abcd) : \sim( r_1 )
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1236	% \]
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1237	%or
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1238	% \[
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1239	% \vdash \textit{Not}(\textit{Dummy}) : \sim( r_1 )
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1240	% \]
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1241	% We choose the dummy-value option, as it is the most straightforward to implement:
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1242	% \[
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1243	% \mkeps(\sim r) = \textit{if}(\nullable(r)) \; \textit{Error} \; \textit{else} \; \textit{Not}(\textit{Dummy})
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1244	% \]
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1245	%
8c234a1bc7e0 chap6 Chengsong parents: 596 diff changeset	1246	%
620 ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1247	%\begin{center}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1248	% \begin{tabular}{lcl}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1249	% $\ntset \; r \; (n+1) \; c::cs $ & $\dn$ & $\nupdates \;
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1250	% cs \; r \; [\Some \; ([c], n)]$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1251	% $\ntset \; r\; 0 \; \_$ & $\dn$ & $\None$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1252	% $\ntset \; r \; \_ \; [] $ & $ \dn$ & $[]$\\
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1253	% \end{tabular}
ae6010c14e49 chap6 almost done Chengsong parents: 613 diff changeset	1254	%\end{center}

author	Chengsong
	Tue, 08 Nov 2022 23:24:54 +0000
changeset 621	17c7611fb0a9
parent 620	ae6010c14e49
child 625	b797c9a709d9
permissions	-rwxr-xr-x