% Chapter Template

\chapter{A Better Bound and Other Extensions} % Main chapter title

\label{Cubic} %In Chapter 5\ref{Chapter5} we discuss stronger simplifications to improve the finite bound
%in Chapter 4 to a polynomial one, and demonstrate how one can extend the
%algorithm to include constructs such as bounded repetitions and negations.

\lstset{style=myScalastyle}

This chapter is a ``miscellaneous''
chapter which records various
extensions to our $\blexersimp$'s formalisation.\\
Firstly, we present further improvements
made to our lexer algorithm $\blexersimp$.
We devise a stronger simplification algorithm,
called $\bsimpStrong$, which can prune away
similar components in two regular expressions at the same
alternative level,
even if these regular expressions are not exactly the same.
We call the lexer that uses this stronger simplification function
$\blexerStrong$.
We conjecture that both
\begin{center}
	$\blexerStrong \;r \; s = \blexer\; r\;s$
\end{center}
and
\begin{center}
	$\llbracket \bdersStrong{a}{s} \rrbracket = O(\llbracket a \rrbracket^3)$
\end{center}
hold, but formalising
them is still work in progress.
We give reasons why the correctness and cubic size bound proofs
can be achieved
by exploring the connection between the internal
data structure of our $\blexerStrong$ and
Antimirov's partial derivatives.\\
Secondly, we extend our $\blexersimp$
to support bounded repetitions ($r^{\{n\}}$).
We update our formalisation of
the correctness and finiteness properties to
include this new construct. With bounded repetitions
we are able to out-compete other verified lexers such as
Verbatim++ on regular expressions which involve a lot of
repetitions. We also present the idempotency property proof
of $\bsimp$, which leverages the idempotency proof of $\rsimp$.
This reinforces our claim that the fixpoint construction
originally required by Sulzmann and Lu can be removed in $\blexersimp$.
\\
Last but not least, we present our efforts in further improving the algorithm
with data structures such as zippers, together with the challenges we met along the way.


%----------------------------------------------------------------------------------------
%	SECTION strongsimp
%----------------------------------------------------------------------------------------
\section{A Stronger Version of Simplification}
%TODO: search for isabelle proofs of algorithms that check equivalence
In our bitcoded lexing algorithm, (sub)terms represent (sub)matches.
For example, the regular expression
\[
	aa \cdot a^*+ a \cdot a^* + aa\cdot a^*
\]
contains three terms,
expressing three possibilities for how it will match future input.
The first and the third terms are identical, which means we can eliminate
the latter as we know it will not be picked up by $\bmkeps$.
In $\bsimps$, the $\distinctBy$ function takes care of this.
The criterion $\distinctBy$ uses for removing a duplicate
$a_2$ in the list
\begin{center}
	$rs_a@[a_1]@rs_b@[a_2]@rs_c$
\end{center}
is that
\begin{center}
	$\rerase{a_1} = \rerase{a_2}$.
\end{center}
It can be characterised as the $LD$
rewrite rule in \ref{rrewriteRules}.\\
The problem, however, is that identical components
in two slightly different regular expressions cannot be removed:
\begin{figure}[H]
\[
	(a+b+d) \cdot r_1 + (a+c+e) \cdot r_1 \stackrel{?}{\rightsquigarrow} (a+b+d) \cdot r_1 + (c+e) \cdot r_1
\]
\caption{Desired simplification, but not done in $\blexersimp$}\label{partialDedup}
\end{figure}
\noindent
A simplification like this actually
cannot be omitted,
as without it the size could blow up even with our $\simp$ function:
\begin{figure}[H]
	\centering
	\begin{tikzpicture}
		\begin{axis}[
			%xlabel={$n$},
			myplotstyle,
			xlabel={input length},
			ylabel={size},
		]
			\addplot[blue,mark=*, mark options={fill=white}] table {bsimpExponential.data};
		\end{axis}
	\end{tikzpicture}
	\caption{Size of the derivatives of $\blexersimp$ for matching
	$\protect((a^* + (aa)^* + \ldots + (aaaaa)^* )^*)^*$
	with strings
	of the form $\protect\underbrace{aa..a}_{n}$.}\label{blexerExp}
\end{figure}
\noindent
We would like to apply the rewriting
\begin{figure}[H]
\[
	(a+b+d) \cdot r_1  \longrightarrow a \cdot r_1 + b \cdot r_1 + d \cdot r_1
\]
\caption{The ``spilling'' rewrite we would like to apply, but which is not done in $\blexersimp$}\label{desiredSimp}
\end{figure}
\noindent
at some stage in our $\simp$ function,
so that it makes the simplification in \ref{partialDedup} possible.
One might, for example, try a version of $\simp$ that eagerly ``spills''
alternatives over the sequences that follow them in this way.
%TODO: simp' that spills things
Unfortunately,
if we introduce such a rewrite directly in our
setting we would lose the POSIX property of our calculated values.
For example, given the regular expression
\begin{center}
	$(a + ab)(bc + c)$
\end{center}
and the string
\begin{center}
	$abc$,
\end{center}
our algorithm generates the following
correct POSIX value
\begin{center}
	$\Seq \; (\Right \; ab) \; (\Right \; c)$.
\end{center}
Essentially it matches the string with the longer Right-alternative
in the first sequence component (and
then the ``rest'' with the character regular expression $c$ from the second sequence component).
If we add the simplification above, then we obtain the following value
\begin{center}
	$\Left \; (\Seq \; a \; (\Left \; bc))$
\end{center}
where the $\Left$-alternatives get priority.
However, this violates the POSIX rules.
The reason for getting this undesired value
is that the new rule splits the regular expression up into
\begin{center}
	$a\cdot(bc + c) + ab \cdot (bc + c)$,
\end{center}
which is a regular expression with a
totally different structure--the original
was a sequence, and now it becomes an alternative.
With an alternative the maximum munch rule no longer works.\\
A method to reconcile this is to do the
transformation in \ref{desiredSimp} ``non-invasively'',
meaning that we traverse the list of regular expressions
\begin{center}
	$rs_a@[a]@rs_c$
\end{center}
in the alternative
\begin{center}
	$\sum ( rs_a@[a]@rs_c)$
\end{center}
using a function similar to $\distinctBy$,
but this time
we allow a more general list rewrite:
\begin{mathpar}\label{cubicRule}
	\inferrule{\vspace{0mm} }{rs_a@[a]@rs_c
		\stackrel{s}{\rightsquigarrow }
	rs_a@[\textit{prune} \; a \; rs_a]@rs_c }
\end{mathpar}
%L \; a_1' = L \; a_1 \setminus (\cup_{a \in rs_a} L \; a)
where $\textit{prune} \;a \; acc$ traverses $a$
without altering the structure of $a$, removing components in $a$
that have appeared in the accumulator $acc$.
For example
\begin{center}
	$\textit{prune} \;\;\; (r_a+r_f+r_g+r_h)r_d \;\; \; [(r_a+r_b+r_c)r_d, (r_e+r_f)r_d] $
\end{center}
should be equal to
\begin{center}
	$(r_g+r_h)r_d$
\end{center}
because $r_gr_d$ and
$r_hr_d$ are the only terms
that have not appeared in the accumulator list
\begin{center}
	$[(r_a+r_b+r_c)r_d, (r_e+r_f)r_d]$.
\end{center}
We implemented the function $\textit{prune}$ in Scala
and incorporated it into our lexer
by replacing the $\simp$ function
with a stronger version, called $\bsimpStrong$,
that prunes regular expressions.
We call this lexer $\blexerStrong$.
$\blexerStrong$ is able to drastically reduce the size of the
internal data structures that could otherwise
trigger exponential behaviour in
$\blexersimp$.
\begin{figure}[H]
	\centering
	\begin{tabular}{@{}c@{\hspace{0mm}}c@{\hspace{0mm}}c@{}}
		\begin{tikzpicture}
			\begin{axis}[
				%xlabel={$n$},
				myplotstyle,
				xlabel={input length},
				ylabel={size},
			]
				\addplot[red,mark=*, mark options={fill=white}] table {strongSimpCurve.data};
			\end{axis}
		\end{tikzpicture}
		&
		\begin{tikzpicture}
			\begin{axis}[
				%xlabel={$n$},
				myplotstyle,
				xlabel={input length},
				ylabel={size},
			]
				\addplot[blue,mark=*, mark options={fill=white}] table {bsimpExponential.data};
			\end{axis}
		\end{tikzpicture}\\
		\multicolumn{2}{l}{Graphs: sizes of the derivatives for matching $((a^* + (aa)^* + \ldots + (aaaaa)^* )^*)^*$ with strings
		of the form $\underbrace{aa..a}_{n}$.}
	\end{tabular}
	\caption{Sizes of the internal data structures of $\blexerStrong$ compared with $\blexersimp$}\label{fig:aaaaaStarStar}
\end{figure}
\begin{figure}[H]
\begin{lstlisting}
    def atMostEmpty(r: Rexp) : Boolean = r match {
      case ZERO => true
      case ONE => true
      case STAR(r) => atMostEmpty(r)
      case SEQ(r1, r2) => atMostEmpty(r1) && atMostEmpty(r2)
      case ALTS(r1, r2) => atMostEmpty(r1) && atMostEmpty(r2)
      case CHAR(_) => false
    }

    def isOne(r: Rexp) : Boolean = r match {
      case ONE => true
      case SEQ(r1, r2) => isOne(r1) && isOne(r2)
      case ALTS(r1, r2) => (isOne(r1) || isOne(r2)) && (atMostEmpty(r1) && atMostEmpty(r2))
      case STAR(r0) => atMostEmpty(r0)
      case CHAR(c) => false
      case ZERO => false
    }

    //if r = r1 ~ tail, returns r1
    def removeSeqTail(r: Rexp, tail: Rexp) : Rexp = r match {
      case SEQ(r1, r2) =>
        if (r2 == tail)
          r1
        else
          ZERO
      case r => ZERO
    }

    def prune(r: ARexp, acc: Set[Rexp]) : ARexp = r match {
      //compare against AZERO (the annotated zero) so that pruned components are dropped
      case AALTS(bs, rs) => rs.map(r => prune(r, acc)).filter(_ != AZERO) match {
        //all components have been removed, meaning this is effectively a duplicate
        //flats will take care of removing this AZERO
        case Nil => AZERO
        case r::Nil => fuse(bs, r)
        case rs1 => AALTS(bs, rs1)
      }
      case ASEQ(bs, r1, r2) =>
        //remove the r2 in (ra + rb)r2 to identify the duplicate contents of r1
        prune(r1, acc.map(r => removeSeqTail(r, erase(r2)))) match {
          //after pruning, nothing of r1 is left
          case AZERO => AZERO
          //after pruning, r1 becomes equivalent to ONE, so keep only (the fused) r2
          case r1p if(isOne(erase(r1p))) => fuse(bs ++ mkepsBC(r1p), r2)
          //assemble the pruned head r1p with r2
          case r1p => ASEQ(bs, r1p, r2)
        }
      //this does the duplicate component removal task
      case r => if (acc(erase(r))) AZERO else r
    }
\end{lstlisting}
\caption{The function $\textit{prune}$ together with its helper functions}
\end{figure}
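\noindent
To illustrate how $\textit{prune}$ behaves on the example above, the following Scala snippet
(a sketch for illustration only, not part of the verified development) applies it to the
annotated version of $(r_a+r_f+r_g+r_h)r_d$, where $r_a,\ldots,r_h$ are taken to be the
character regular expressions $a,\ldots,h$. Note that in the actual lexer the accumulator
passed to $\textit{prune}$ contains the ``spilled'' terms of the previously seen alternatives
(as produced by $\textit{turnIntoTerms}$ in the $\distinctWith$ function below), rather than
the unspilled regular expressions written in the mathematical example.
\begin{lstlisting}
    //the character regular expressions a ... h
    val List(ra, rb, rc, rd, re, rf, rg, rh) = "abcdefgh".toList.map(CHAR(_))

    //the spilled terms of (a+b+c)d and (e+f)d, as distinctWith would have accumulated them
    val acc: Set[Rexp] =
      Set(SEQ(ra, rd), SEQ(rb, rd), SEQ(rc, rd), SEQ(re, rd), SEQ(rf, rd))

    //the annotated version of (a+f+g+h)d
    val candidate = internalise(SEQ(ALTS(ra, ALTS(rf, ALTS(rg, rh))), rd))

    //prints (an alternative nesting of) (g+h)d: the a and f components have been pruned away
    println(erase(prune(candidate, acc)))
\end{lstlisting}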
\noindent
The benefits of using
$\textit{prune}$, such as refining the finiteness bound
to a cubic bound, have not been formalised yet.
Therefore we choose to use Scala code rather than an Isabelle-style formal
definition like we did for $\simp$, as the definitions might change
to suit the needs of the proofs.
In the rest of this chapter we will use this convention consistently.
\begin{figure}[H]
\begin{lstlisting}
    def distinctWith(rs: List[ARexp],
                     pruneFunction: (ARexp, Set[Rexp]) => ARexp,
                     acc: Set[Rexp] = Set()) : List[ARexp] =
      rs match {
        case Nil => Nil
        case r :: rs =>
          if (acc(erase(r)))
            distinctWith(rs, pruneFunction, acc)
          else {
            val pruned_r = pruneFunction(r, acc)
            pruned_r ::
              distinctWith(rs,
                pruneFunction,
                turnIntoTerms(erase(pruned_r)) ++: acc
              )
          }
      }
\end{lstlisting}
\caption{A Stronger Version of $\textit{distinctBy}$}
\end{figure}
\noindent
The function $\textit{prune}$ is used in $\distinctWith$.
$\distinctWith$ is a stronger version of $\distinctBy$
which not only removes duplicates as $\distinctBy$ would
do, but also uses the $\textit{pruneFunction}$
argument to prune away verbose components in a regular expression.\\
\begin{figure}[H]
\begin{lstlisting}
    //a stronger version of simp
    def bsimpStrong(r: ARexp): ARexp =
    {
      r match {
        case ASEQ(bs1, r1, r2) => (bsimpStrong(r1), bsimpStrong(r2)) match {
          //normal clauses same as simp
          case (AZERO, _) => AZERO
          case (_, AZERO) => AZERO
          case (AONE(bs2), r2s) => fuse(bs1 ++ bs2, r2s)
          //bs2 can be discarded
          case (r1s, AONE(bs2)) => fuse(bs1, r1s) //assert bs2 == Nil
          case (r1s, r2s) => ASEQ(bs1, r1s, r2s)
        }
        case AALTS(bs1, rs) => {
          //distinctBy(flat_res, erase)
          distinctWith(flats(rs.map(bsimpStrong(_))), prune) match {
            case Nil => AZERO
            case s :: Nil => fuse(bs1, s)
            case rs => AALTS(bs1, rs)
          }
        }
        //stars that can be treated as 1
        case ASTAR(bs, r0) if (atMostEmpty(erase(r0))) => AONE(bs)
        case r => r
      }
    }
\end{lstlisting}
\caption{The function $\bsimpStrong$}
\end{figure}
\noindent
$\distinctWith$ is in turn used in $\bsimpStrong$,
which is applied after each derivative step by $\bdersStrongs$:
\begin{figure}[H]
\begin{lstlisting}
    //Conjecture: [| bdersStrong(s, r) |] = O([| r |]^3)
    def bdersStrong(s: List[Char], r: ARexp) : ARexp = s match {
      case Nil => r
      case c::s => bdersStrong(s, bsimpStrong(bder(c, r)))
    }
\end{lstlisting}
\caption{The function $\bdersStrongs$}
\end{figure}
\noindent
We conjecture that the above Scala function $\bdersStrongs$,
written $\bdersStrong{\_}{\_}$ in infix notation,
satisfies the following property:
\begin{conjecture}
	$\llbracket \bdersStrong{a}{s} \rrbracket = O(\llbracket a \rrbracket^3)$
\end{conjecture}
The stronger version of $\blexersimp$'s
code in Scala looks as follows:
\begin{figure}[H]
\begin{lstlisting}
    def strongBlexer(r: Rexp, s: String) : Option[Val] = {
      Try(Some(decode(r, strong_blex_simp(internalise(r), s.toList)))).getOrElse(None)
    }
    def strong_blex_simp(r: ARexp, s: List[Char]) : Bits = s match {
      case Nil => {
        if (bnullable(r)) {
          mkepsBC(r)
        }
        else
          throw new Exception("Not matched")
      }
      case c::cs => {
        strong_blex_simp(bsimpStrong(bder(c, r)), cs)
      }
    }
\end{lstlisting}
\end{figure}
\noindent
We would like to preserve a correctness property like the one
we had for $\blexersimp$:
\begin{conjecture}\label{cubicConjecture}
	$\blexerStrong \;r \; s = \blexer\; r\;s$
\end{conjecture}
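\noindent
A lightweight way to gain confidence in this conjecture, before attempting a formal proof, is to
test $\blexerStrong$ against $\blexer$ on (randomly generated) regular expressions and strings.
The sketch below assumes a Scala implementation $\textit{blexer}$ of the original lexer with the
same $\textit{Option[Val]}$ interface as $\textit{strongBlexer}$; it is only an empirical check,
not a proof.
\begin{lstlisting}
    //empirical check of the conjecture: the two lexers must agree on every input
    def agreesWithBlexer(r: Rexp, s: String) : Boolean =
      strongBlexer(r, s) == blexer(r, s)
\end{lstlisting}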
\noindent
To establish this conjecture formally, we plan to introduce the new rule \ref{cubicRule}
and augment our proofs in Chapter \ref{Bitcoded2}.
The idea is to maintain properties like
$r \stackrel{*}{\rightsquigarrow} \textit{bsimp} \; r$ for the stronger simplification as well.

In the next section,
we describe why we
believe a cubic bound can be achieved.
We give an introduction to
partial derivatives,
which were invented by Antimirov \cite{Antimirov95},
and then link them with the result of the function
$\bdersStrongs$.

\section{Antimirov's Partial Derivatives}
The structure of the regular expressions produced by $\bdersStrongs$ suggests a link
to Antimirov's ``partial derivatives''.
The idea behind Antimirov's partial derivatives
is to compute derivatives in a similar way as suggested by Brzozowski,
but to maintain a set of regular expressions instead of a single one:

%TODO: antimirov proposition 3.1, needs completion
\begin{center}
	\begin{tabular}{ccc}
		$\partial_x(a+b)$ & $=$ & $\partial_x(a) \cup \partial_x(b)$\\
		$\partial_x(\ONE)$ & $=$ & $\phi$
	\end{tabular}
\end{center}

Rather than joining the calculated derivatives $\partial_x a$ and $\partial_x b$ together
using the alternatives constructor, Antimirov cleverly chose to put them into
a set instead. This breaks the terms of a derivative regular expression up,
allowing us to understand what its ``atomic'' components are.
For example, to see what the derivative of the regular expression $x^*(xx + y)^*$
with respect to $x$ is made of, one can take its partial derivative
and obtain the two singleton sets $\{x^* \cdot (xx + y)^*\}$ and $\{x \cdot (xx + y) ^* \}$
from $\partial_x(x^*) \cdot (xx + y) ^*$ and $\partial_x((xx + y)^*)$.
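\noindent
Since the table above is left incomplete, the following Scala sketch spells out the standard
definition of the partial derivative with respect to a single character, reusing the
$\textit{Rexp}$ constructors and the $\textit{nullable}$ function from the earlier chapters.
It is given for reference only and is not part of the verified development.
\begin{lstlisting}
    //attach the regular expression r2 to every element of the set rs
    def dot(rs: Set[Rexp], r2: Rexp) : Set[Rexp] = rs.map(r1 => SEQ(r1, r2) : Rexp)

    //partial derivative of r with respect to the character c
    def pder(c: Char, r: Rexp) : Set[Rexp] = r match {
      case ZERO => Set()
      case ONE => Set()
      case CHAR(d) => if (c == d) Set(ONE) else Set()
      case ALTS(r1, r2) => pder(c, r1) ++ pder(c, r2)
      //if r1 is nullable, a match can also start inside r2
      case SEQ(r1, r2) =>
        if (nullable(r1)) dot(pder(c, r1), r2) ++ pder(c, r2) else dot(pder(c, r1), r2)
      case STAR(r1) => dot(pder(c, r1), STAR(r1))
    }
\end{lstlisting}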
To get all the ``atomic'' components of a regular expression's possible derivatives,
there is a procedure Antimirov called $\textit{lf}$, short for ``linear forms'', that takes
whatever character is available at the head of a string inside the language of a
regular expression, and gives back that character and the corresponding derivative regular expression
as a pair (which he called a ``monomial''):
\begin{center}
	\begin{tabular}{ccc}
		$\lf(\ONE)$ & $=$ & $\phi$\\
		$\lf(c)$ & $=$ & $\{(c, \ONE) \}$\\
		$\lf(a+b)$ & $=$ & $\lf(a) \cup \lf(b)$\\
		$\lf(r^*)$ & $=$ & $\lf(r) \bigodot r^*$\\
	\end{tabular}
\end{center}
%TODO: completion

There is a slight difference in the clauses involving $\bigodot$ compared
with $\partial$: instead of a dot operator $ \textit{rset} \cdot r$ that attaches the regular
expression $r$ to every element inside $\textit{rset}$ to create a set of
sequence derivatives, it uses the ``circle dot'' operator $\bigodot$, which operates
on a set of monomials (which Antimirov called a ``linear form'') and a regular
expression, and returns a linear form:
\begin{center}
	\begin{tabular}{ccc}
		$l \bigodot (\ZERO)$ & $=$ & $\phi$\\
		$l \bigodot (\ONE)$ & $=$ & $l$\\
		$\phi \bigodot t$ & $=$ & $\phi$\\
		$\{ (x, \ZERO) \} \bigodot t$ & $=$ & $\{(x,\ZERO) \}$\\
		$\{ (x, \ONE) \} \bigodot t$ & $=$ & $\{(x,t) \}$\\
		$\{ (x, p) \} \bigodot t$ & $=$ & $\{(x,p\cdot t) \}$\\
	\end{tabular}
\end{center}
%TODO: completion

Some degree of simplification is applied when doing $\bigodot$: for example,
$l \bigodot (\ZERO) = \phi$ corresponds to $r \cdot \ZERO \rightsquigarrow \ZERO$,
$l \bigodot (\ONE) = l$ corresponds to $r \cdot \ONE \rightsquigarrow r$,
$\{ (x, \ZERO) \} \bigodot t = \{(x,\ZERO) \}$ corresponds to $\ZERO \cdot t \rightsquigarrow \ZERO$,
and so on.
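For concreteness, here is a Scala sketch of linear forms and the $\bigodot$ operator along
the lines of the tables above. The clause for sequences, which the (incomplete) tables do not
show, is an assumption that follows the same pattern as the sequence clause of $\partial$;
as before, this code is only an illustration.
\begin{lstlisting}
    //a linear form is a set of monomials (a character paired with a regular expression)
    type Lin = Set[(Char, Rexp)]

    //the "circle dot": attach the regular expression t to the tail of every monomial in l
    def odot(l: Lin, t: Rexp) : Lin = t match {
      case ZERO => Set()
      case ONE => l
      case t1 => l.map {
        case (x, ZERO) => (x, ZERO)
        case (x, ONE) => (x, t1)
        case (x, p) => (x, SEQ(p, t1))
      }
    }

    //the linear form of r: the monomials (c, r') such that c followed by r' can start a match of r
    def lf(r: Rexp) : Lin = r match {
      case ZERO => Set()
      case ONE => Set()
      case CHAR(c) => Set((c, ONE))
      case ALTS(r1, r2) => lf(r1) ++ lf(r2)
      case STAR(r1) => odot(lf(r1), STAR(r1))
      case SEQ(r1, r2) =>
        if (nullable(r1)) odot(lf(r1), r2) ++ lf(r2) else odot(lf(r1), r2)
    }
\end{lstlisting}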
Using the function $\lf$ one can compute the set $\partial_{UNIV}(r)$ of all possible partial
derivatives of a regular expression $r$ by
an iterative procedure:
\begin{center}
	\begin{tabular}{llll}
		$\textit{while}$ & $(\Delta_i \neq \phi)$ & & \\
		& $\Delta_{i+1}$ & $ =$ & $\lf(\Delta_i) - \PD_i$ \\
		& $\PD_{i+1}$ & $ =$ & $\Delta_{i+1} \cup \PD_i$ \\
		$\partial_{UNIV}(r)$ & $=$ & $\PD$ &
	\end{tabular}
\end{center}
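\noindent
The following Scala sketch mirrors this iteration, again only as an illustration and reusing the
$\textit{lf}$ sketch above: starting from the singleton set $\{r\}$, it repeatedly applies
$\textit{lf}$, keeps the regular-expression parts of the resulting monomials, and stops when no
new partial derivatives appear (whether $r$ itself is counted as one of its partial derivatives
is a matter of convention; the sketch includes it). Antimirov's result that a regular expression
has only finitely many partial derivatives guarantees termination.
\begin{lstlisting}
    //all partial derivatives reachable from r, computed as a fixed point of lf
    def pderUNIV(r: Rexp) : Set[Rexp] = {
      //the regular-expression parts of the monomials produced by lf
      def step(rs: Set[Rexp]) : Set[Rexp] = rs.flatMap(r1 => lf(r1).map(_._2))
      def loop(pd: Set[Rexp], delta: Set[Rexp]) : Set[Rexp] =
        if (delta.isEmpty) pd
        else {
          val delta2 = step(delta) -- pd   //only the genuinely new partial derivatives
          loop(pd ++ delta2, delta2)
        }
      loop(Set(r), Set(r))
    }
\end{lstlisting}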

\noindent
Note that Antimirov's partial derivatives have the distributivity rule
\[
	(r_1 + r_2) \cdot r_3 \longrightarrow (r_1 \cdot r_3) + (r_2 \cdot r_3)
\]
effectively built in, which is the ``spilling'' transformation of \ref{desiredSimp}
that $\textit{prune}$ applies in a controlled way.

\section{The NTIMES Constructor, its Size Bound and Correctness Proof}
The NTIMES construct has the following closed form:
\begin{verbatim}
"rders_simp (RNTIMES r0 (Suc n)) (c#s) =
 rsimp ( RALTS ( (map (optermsimp r0 ) (nupdates s r0 [Some ([c], n)]) ) ))"
\end{verbatim}
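\noindent
For readers more comfortable with the Scala prototype, the clauses that bounded repetitions add
to the basic $\textit{nullable}$ and derivative functions can be sketched as follows. The
constructor name $\textit{NTIMES}$ and the helper functions are assumptions that mirror the
formalisation; they are not the verified definitions themselves.
\begin{lstlisting}
    //bounded repetition r^{n}
    case class NTIMES(r: Rexp, n: Int) extends Rexp

    //r^{0} is nullable; for n > 0, r^{n} is nullable iff r is
    def nullableNtimes(r: Rexp, n: Int) : Boolean =
      n == 0 || nullable(r)

    //(r^{n}) \ c = (r \ c) . r^{n-1} for n > 0, and ZERO otherwise
    def derNtimes(c: Char, r: Rexp, n: Int) : Rexp =
      if (n == 0) ZERO else SEQ(der(c, r), NTIMES(r, n - 1))
\end{lstlisting}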
%----------------------------------------------------------------------------------------
%	SECTION 1
%----------------------------------------------------------------------------------------

\section{Adding Support for the Negation Construct, and its Correctness Proof}
We now add support for the negation regular expression:
\[			r ::=   \ZERO \mid  \ONE
			 \mid  c
			 \mid  r_1 \cdot r_2
			 \mid  r_1 + r_2
			 \mid r^*
			 \mid \sim r
\]
The $\textit{nullable}$ function's clause for it would be
\[
	\textit{nullable}(\sim r) = \neg \textit{nullable}(r)
\]
and the derivative would be
\[
	(\sim r) \backslash c = \sim (r \backslash c).
\]

The most tricky part of lexing with the $\sim r$ regular expression
is creating a value for it.
For other regular expressions, the value aligns with the
structure of the regular expression:
\[
	\vdash \Seq(\Char(a), \Char(b)) : a \cdot b
\]
But for the $\sim r$ regular expression, a string $s$ is a member of it if and only if
$s$ does not belong to $L(r)$.
That means that when there
is a match for the not regular expression, it is not possible to generate a value showing
how the string $s$ matched $r$.
What we can do is preserve the information of how $s$ was not matched by $r$,
and there are a number of options for doing this.

We could give a partial value when there is a partial match for the regular expression inside
the $\mathbf{not}$ construct.
For example, the string $ab$ is not in the language of $(a\cdot b) \cdot c$.
A value for it could be
\[
	\vdash \textit{Not}(\Seq(\Char(a), \Char(b))) : \sim ((a \cdot b ) \cdot c)
\]
The above example demonstrates what value to construct
when the string $s$ is a proper prefix
of some string in $L(r)$. When $s$ instead is not a prefix of any string
in $L(r)$, it becomes unclear what to return as a value inside the $\textit{Not}$
constructor.

Another option would be to store either the string $s$ that resulted in
a mis-match for $r$ or a dummy value as a placeholder:
\[
	\vdash \textit{Not}(abcd) : \sim( r_1 )
\]
or
\[
	\vdash \textit{Not}(\textit{Dummy}) : \sim( r_1 )
\]
We choose to implement the latter as it is the most straightforward:
\[
	\mkeps(\sim(r)) = \textit{if}(\nullable(r)) \; \textit{Error} \; \textit{else} \; \textit{Not}(\textit{Dummy})
\]
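\noindent
In the Scala prototype the clauses above amount to the following additions. The names
$\textit{NOT}$, $\textit{NotV}$ and $\textit{Dummy}$ are assumptions chosen for this
illustration; they mirror, but are not, the formalised definitions.
\begin{lstlisting}
    //negation ~r and the corresponding values
    case class NOT(r: Rexp) extends Rexp
    case object Dummy extends Val
    case class NotV(v: Val) extends Val

    //~r is nullable iff r is not nullable
    def nullableNot(r: Rexp) : Boolean = !nullable(r)

    //(~r) \ c = ~(r \ c)
    def derNot(c: Char, r: Rexp) : Rexp = NOT(der(c, r))

    //mkeps is only called when ~r is nullable, i.e. when r is not nullable
    def mkepsNot(r: Rexp) : Val =
      if (nullable(r)) throw new Exception("Error: ~r is not nullable")
      else NotV(Dummy)
\end{lstlisting}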

%----------------------------------------------------------------------------------------
%	SECTION 2
%----------------------------------------------------------------------------------------

\section{Bounded Repetitions}
