lexing: ChengsongTanPhdThesis/Chapters/Bitcoded2.tex@0f00d440f484 (annotated)

532 cc54ce075db5 restructured Chengsong parents: diff changeset	1	% Chapter Template
cc54ce075db5 restructured Chengsong parents: diff changeset	2
cc54ce075db5 restructured Chengsong parents: diff changeset	3	% Main chapter title
cc54ce075db5 restructured Chengsong parents: diff changeset	4	\chapter{Correctness of Bit-coded Algorithm with Simplification}
cc54ce075db5 restructured Chengsong parents: diff changeset	5
cc54ce075db5 restructured Chengsong parents: diff changeset	6	\label{Bitcoded2} % Change X to a consecutive number; for referencing this chapter elsewhere, use \ref{ChapterX}
cc54ce075db5 restructured Chengsong parents: diff changeset	7	%Then we illustrate how the algorithm without bitcodes falls short for such aggressive
cc54ce075db5 restructured Chengsong parents: diff changeset	8	%simplifications and therefore introduce our version of the bitcoded algorithm and
cc54ce075db5 restructured Chengsong parents: diff changeset	9	%its correctness proof in
cc54ce075db5 restructured Chengsong parents: diff changeset	10	%Chapter 3\ref{Chapter3}.
cc54ce075db5 restructured Chengsong parents: diff changeset	11
cc54ce075db5 restructured Chengsong parents: diff changeset	12
cc54ce075db5 restructured Chengsong parents: diff changeset	13
538 8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	14	We now introduce the simplifications, which are the reason we introduced
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	15	bitcodes in the first place.
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	16
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	17	\section{Simplification for Annotated Regular Expressions}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	18	The first thing we notice in the fast growth of examples such as $(a^*a^*)^*$'s
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	19	and $(a^* + (aa)^*)^*$'s derivatives is that a lot of duplicated sub-patterns
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	20	are scattered around different levels:
538 8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	21
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	22	\begin{center}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	23	$(a^*a^*)^* \rightarrow (a^*a^* + a^*)\cdot(a^*a^*)^*$\\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	24	$\rightarrow ((a^*a^* + a^*) + a^*)\cdot(a^*a^*)^* + (a^*a^* + a^*)\cdot(a^*a^*)^*$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	25	\end{center}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	26	\noindent
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	27	This growth persists even though we have already implemented the simple-minded simplifications,
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	28	such as throwing away useless $\ONE$s and $\ZERO$s.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	29	The simplification rule $r + r \rightarrow r$ cannot make a difference either,
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	30	since it only removes duplicates on the same level and does not cover cases such as
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	31	$(r+a)+r \rightarrow r+a$.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	32	This requires us to break up nested alternatives into lists, for example
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	33	using the flatten operation similar to the one defined for any functional language's
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	34	list datatype:
538 8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	35
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	36	\begin{center}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	37	\begin{tabular}{@{}lcl@{}}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	38	$\textit{flts} \; (_{bs}\sum \textit{as}) :: \textit{as'}$ & $\dn$ & $(\textit{map} \;
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	39	(\textit{fuse}\;bs)\; \textit{as}) \; @ \; \textit{flts} \; as' $ \\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	40	$\textit{flts} \; \ZERO :: as'$ & $\dn$ & $ \textit{flts} \; \textit{as'} $ \\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	41	$\textit{flts} \; a :: as'$ & $\dn$ & $a :: \textit{flts} \; \textit{as'}$ \quad(otherwise)
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	42	\end{tabular}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	43	\end{center}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	44
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	45	\noindent
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	46	There is a minor difference though, in that our $\flts$ operation
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	47	also throws away $\ZERO$s.
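
\noindent
As a small worked example (a sketch, assuming $a_1$, $a_2$ and $a_3$ are annotated regular
expressions that are neither $\ZERO$ nor alternatives), flattening a list that contains a
$\ZERO$ and a nested alternative proceeds as follows:
\begin{center}
\begin{tabular}{@{}lcl@{}}
$\flts \; [\ZERO,\; {}_{bs}\sum [a_1, a_2],\; a_3]$ & $=$ & $\flts \; [{}_{bs}\sum [a_1, a_2],\; a_3]$ \\
 & $=$ & $(\map \; (\fuse \; bs) \; [a_1, a_2]) \; @ \; \flts \; [a_3]$ \\
 & $=$ & $[\fuse \; bs \; a_1,\; \fuse \; bs \; a_2,\; a_3]$
\end{tabular}
\end{center}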
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	48	For a flattened list of regular expressions, a de-duplication can be done efficiently:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	49
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	50
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	51	\begin{center}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	52	\begin{tabular}{@{}lcl@{}}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	53	$\distinctBy \; [] \; f\; acc $ & $ =$ & $ []$\\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	54	$\distinctBy \; (x :: xs) \; f \; acc$ & $=$ & \\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	55	& & $\quad \textit{if} (f \; x) \in acc \textit{then} \distinctBy \; xs \; f \; acc$\\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	56	& & $\quad \textit{else} x :: (\distinctBy \; xs \; f \; (\{f \; x\} \cup acc))$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	57	\end{tabular}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	58	\end{center}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	59	\noindent
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	60	We define the distinct function relative to a mapping $f$ because
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	61	we want to eliminate regular expressions that are syntactically the same
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	62	but have different bit-codes (more on why we can do this later).
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	63	To achieve this, we use $\erase$ as the function $f$ during
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	64	de-duplication.
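
\noindent
For instance (an illustrative sketch, assuming three annotated regular expressions with
$\erase \; a_1 = \erase \; a_3$ and $\erase \; a_1 \neq \erase \; a_2$), de-duplication
keeps the first occurrence and drops the later duplicate:
\begin{center}
\begin{tabular}{@{}lcl@{}}
$\distinctBy \; [a_1, a_2, a_3] \; \erase \; \emptyset$ & $=$ & $a_1 :: (\distinctBy \; [a_2, a_3] \; \erase \; \{\erase \; a_1\})$ \\
 & $=$ & $a_1 :: a_2 :: (\distinctBy \; [a_3] \; \erase \; \{\erase \; a_1, \erase \; a_2\})$ \\
 & $=$ & $a_1 :: a_2 :: (\distinctBy \; [\,] \; \erase \; \{\erase \; a_1, \erase \; a_2\})$ \\
 & $=$ & $[a_1, a_2]$
\end{tabular}
\end{center}
\noindent
Note that the bit-codes of $a_1$ are the ones that survive, since it occurs first in the list.
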
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	65	A recursive definition of our simplification function
538 8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	66	that looks somewhat similar to our Scala code is given below:
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	67
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	68	\begin{center}
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	69	\begin{tabular}{@{}lcl@{}}
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	70
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	71	$\textit{bsimp} \; (_{bs}a_1\cdot a_2)$ & $\dn$ & $ \textit{bsimp}_{ASEQ} \; bs \;(\textit{bsimp} \; a_1) \; (\textit{bsimp} \; a_2) $ \\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	72	$\textit{bsimp} \; (_{bs}\sum \textit{as})$ & $\dn$ & $\textit{bsimp}_{AALTS} \; \textit{bs} \; (\textit{distinctBy} \; ( \textit{flts} ( \textit{map} \; bsimp \; as)) \; \erase \; \phi) $ \\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	73	$\textit{bsimp} \; a$ & $\dn$ & $\textit{a} \qquad \textit{otherwise}$
538 8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	74	\end{tabular}
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	75	\end{center}
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	76
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	77	\noindent
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	78	The simplification (named $\bsimp$ for \emph{b}it-coded) does a pattern matching on the regular expression.
538 8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	79	When it detects that the regular expression is an alternative or
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	80	sequence, it simplifies its child regular expressions
538 8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	81	recursively and then checks whether one of the children has turned into $\ZERO$ or
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	82	$\ONE$, which might trigger further simplification at the current level.
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	83	Current level simplifications are handled by the function $\textit{bsimp}_{ASEQ}$,
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	84	using rules such as $\ZERO \cdot r \rightarrow \ZERO$ and $\ONE \cdot r \rightarrow r$.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	85	\begin{center}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	86	\begin{tabular}{@{}lcl@{}}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	87	$\textit{bsimp}_{ASEQ} \; bs\; a \; b$ & $\dn$ & $ (a,\; b) \textit{match}$\\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	88	&&$\quad\textit{case} \; (\ZERO, \_) \Rightarrow \ZERO$ \\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	89	&&$\quad\textit{case} \; (\_, \ZERO) \Rightarrow \ZERO$ \\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	90	&&$\quad\textit{case} \; (_{bs_1}\ONE, a_2') \Rightarrow \textit{fuse} \; (bs@bs_1) \; a_2'$ \\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	91	&&$\quad\textit{case} \; (a_1', a_2') \Rightarrow _{bs}a_1' \cdot a_2'$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	92	\end{tabular}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	93	\end{center}
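
\noindent
To illustrate the $\ONE$ case (a sketch, assuming $a_2$ is already simplified, so that
$\textit{bsimp} \; a_2 = a_2$, and that $a_2 \neq \ZERO$):
\begin{center}
\begin{tabular}{@{}lcl@{}}
$\textit{bsimp} \; (_{bs}(_{bs_1}\ONE) \cdot a_2)$ & $=$ & $\textit{bsimp}_{ASEQ} \; bs \; (_{bs_1}\ONE) \; a_2$ \\
 & $=$ & $\fuse \; (bs \,@\, bs_1) \; a_2$
\end{tabular}
\end{center}
\noindent
The bits $bs$ and $bs_1$ are not lost when the sequence constructor and the $\ONE$ disappear;
they are moved onto $a_2$ by $\fuse$.
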
538 8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	94	\noindent
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	95	The most involved part is the $\sum$ clause, where we first call $\flts$ on
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	96	the list of simplified children $\textit{map}\; \textit{bsimp}\; \textit{as}$,
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	97	and then call $\distinctBy$ on the result. The predicate determining whether two
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	98	elements are the same is $\erase \; r_1 = \erase\; r_2$.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	99	Finally, depending on whether the regular expression list $as'$ has turned into a
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	100	singleton or empty list after $\flts$ and $\distinctBy$, $\textit{bsimp}_{AALTS}$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	101	either keeps the current-level constructor $\sum$ as it is or
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	102	removes it when there are fewer than two elements:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	103	\begin{center}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	104	\begin{tabular}{lcl}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	105	$\textit{bsimp}_{AALTS} \; bs \; as'$ & $ \dn$ & $ as' \; \textit{match}$\\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	106	&&$\quad\textit{case} \; [] \Rightarrow \ZERO$ \\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	107	&&$\quad\textit{case} \; a :: [] \Rightarrow \textit{fuse} \; bs \; a$ \\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	108	&&$\quad\textit{case} \; as' \Rightarrow _{bs}\sum \textit{as'}$\\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	109	\end{tabular}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	110
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	111	\end{center}
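
\noindent
As a worked example of the $\sum$ clause (a sketch under the following assumptions:
$a_1$, $a_2$ and $a_3$ are already simplified, are neither $\ZERO$ nor alternatives,
satisfy $\erase \; a_1 = \erase \; a_3$ and $\erase \; a_1 \neq \erase \; a_2$, and
$\fuse$ only changes bit-codes, so that $\erase \; (\fuse \; bs_1 \; a) = \erase \; a$),
the nested duplicate in $(a_1 + a_2) + a_3$ is removed:
\begin{center}
\begin{tabular}{@{}lcl@{}}
$\textit{map} \; \textit{bsimp} \; [{}_{bs_1}\sum [a_1, a_2],\; a_3]$ & $=$ & $[{}_{bs_1}\sum [a_1, a_2],\; a_3]$ \\
$\flts \; [{}_{bs_1}\sum [a_1, a_2],\; a_3]$ & $=$ & $[\fuse \; bs_1 \; a_1,\; \fuse \; bs_1 \; a_2,\; a_3]$ \\
$\distinctBy \; [\fuse \; bs_1 \; a_1,\; \fuse \; bs_1 \; a_2,\; a_3] \; \erase \; \emptyset$ & $=$ & $[\fuse \; bs_1 \; a_1,\; \fuse \; bs_1 \; a_2]$ \\
$\textit{bsimp} \; (_{bs}\sum [{}_{bs_1}\sum [a_1, a_2],\; a_3])$ & $=$ & $_{bs}\sum [\fuse \; bs_1 \; a_1,\; \fuse \; bs_1 \; a_2]$
\end{tabular}
\end{center}
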
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	112	Having defined the $\bsimp$ function,
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	113	we apply it as a phase after each derivative is taken,
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	114	so that the derivative stays small:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	115	\begin{center}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	116	\begin{tabular}{lcl}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	117	$r \backslash_{bsimp} c$ & $\dn$ & $\textit{bsimp}(r \backslash c)$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	118	\end{tabular}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	119	\end{center}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	120	Following the earlier extension of the
538 8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	121	derivative w.r.t.~a character to the derivative
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	122	w.r.t.~a string, we define the string derivative that interleaves simplification with each derivative step:%\comment{simp in the [] case?}
538 8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	123
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	124	\begin{center}
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	125	\begin{tabular}{lcl}
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	126	$r \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(r \backslash_{bsimp}\, c) \backslash_{bsimps}\, s$ \\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	127	$r \backslash_{bsimps} [\,] $ & $\dn$ & $r$
538 8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	128	\end{tabular}
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	129	\end{center}
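
\noindent
Unfolding this definition for a two-character string $[c_1, c_2]$ (a small illustration,
not part of the formal development) shows how simplification is interleaved with the derivative steps:
\begin{center}
\begin{tabular}{lcl}
$r \backslash_{bsimps} [c_1, c_2]$ & $=$ & $(r \backslash_{bsimp}\, c_1) \backslash_{bsimps}\, [c_2]$ \\
 & $=$ & $((r \backslash_{bsimp}\, c_1) \backslash_{bsimp}\, c_2) \backslash_{bsimps}\, [\,]$ \\
 & $=$ & $\textit{bsimp}(\textit{bsimp}(r \backslash c_1) \backslash c_2)$
\end{tabular}
\end{center}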
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	130
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	131	\noindent
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	132	By extracting bit-codes from derivatives that have been simplified after
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	133	each derivative step, we obtain the simplified version of $\blexer$, called $\blexersimp$,
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	134	which is defined as
538 8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	135	\begin{center}
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	136	\begin{tabular}{lcl}
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	137	$\textit{blexer\_simp}\;r\,s$ & $\dn$ &
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	138	$\textit{let}\;a = (r^\uparrow)\backslash_{bsimps}\, s\;\textit{in}$\\
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	139	& & $\;\;\textit{if}\; \textit{bnullable}(a)$\\
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	140	& & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	141	& & $\;\;\textit{else}\;\textit{None}$
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	142	\end{tabular}
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	143	\end{center}
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	144
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	145	\noindent
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	146	This algorithm keeps the regular expression size small. For example,
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	147	with this simplification the fast growth of our previous $(a + aa)^*$ example (over
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	148	$10^5$ nodes at an input length of around $20$)
538 8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	149	is reduced to just $17$, and the size stays constant no matter how long the
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	150	input string is.
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	151	The graphs below demonstrate this improvement.
538 8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	152
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	153
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	154	\section{$(a+aa)^*$ and $(a^*\cdot a^*)^*$ against $\protect\underbrace{aa\ldots a}_\text{n \textit{a}s}$ After Simplification}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	155	With only the simple-minded simplification, the derivatives of $(a+aa)^*$ used to grow to over $9000$ nodes
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	156	at only around $15$ input characters:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	157	\begin{center}
539 7cf9f17aa179 more Chengsong parents: 538 diff changeset	158	\begin{tabular}{ll}
7cf9f17aa179 more Chengsong parents: 538 diff changeset	159	\begin{tikzpicture}
7cf9f17aa179 more Chengsong parents: 538 diff changeset	160	\begin{axis}[
7cf9f17aa179 more Chengsong parents: 538 diff changeset	161	xlabel={$n$},
7cf9f17aa179 more Chengsong parents: 538 diff changeset	162	ylabel={derivative size},
7cf9f17aa179 more Chengsong parents: 538 diff changeset	163	width=7cm,
7cf9f17aa179 more Chengsong parents: 538 diff changeset	164	height=4cm,
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	165	legend entries={Lexer with $\textit{bsimp}$},
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	166	legend pos= south east,
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	167	legend cell align=left]
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	168	\addplot[red,mark=*, mark options={fill=white}] table {a_aa_star_bsimp.data};
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	169	\end{axis}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	170	\end{tikzpicture} %\label{fig:BitcodedLexer}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	171	&
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	172	\begin{tikzpicture}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	173	\begin{axis}[
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	174	xlabel={$n$},
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	175	ylabel={derivative size},
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	176	width = 7cm,
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	177	height = 4cm,
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	178	legend entries={Lexer without $\textit{bsimp}$},
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	179	legend pos= north west,
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	180	legend cell align=left]
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	181	\addplot[red,mark=*, mark options={fill=white}] table {a_aa_star_easysimp.data};
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	182	\end{axis}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	183	\end{tikzpicture}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	184	\end{tabular}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	185	\end{center}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	186	For $(a^*\cdot a^*)^*$, unlike in \ref{fig:BetterWaterloo}, the size now stays constant under the current
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	187	simplification rules (we put the graphs side by side to show the contrast):
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	188	\begin{center}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	189	\begin{tabular}{ll}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	190	\begin{tikzpicture}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	191	\begin{axis}[
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	192	xlabel={$n$},
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	193	ylabel={derivative size},
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	194	width=7cm,
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	195	height=4cm,
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	196	legend entries={Lexer with $\textit{bsimp}$},
539 7cf9f17aa179 more Chengsong parents: 538 diff changeset	197	legend pos= south east,
7cf9f17aa179 more Chengsong parents: 538 diff changeset	198	legend cell align=left]
7cf9f17aa179 more Chengsong parents: 538 diff changeset	199	\addplot[red,mark=*, mark options={fill=white}] table {BitcodedLexer.data};
7cf9f17aa179 more Chengsong parents: 538 diff changeset	200	\end{axis}
7cf9f17aa179 more Chengsong parents: 538 diff changeset	201	\end{tikzpicture} %\label{fig:BitcodedLexer}
7cf9f17aa179 more Chengsong parents: 538 diff changeset	202	&
7cf9f17aa179 more Chengsong parents: 538 diff changeset	203	\begin{tikzpicture}
7cf9f17aa179 more Chengsong parents: 538 diff changeset	204	\begin{axis}[
7cf9f17aa179 more Chengsong parents: 538 diff changeset	205	xlabel={$n$},
7cf9f17aa179 more Chengsong parents: 538 diff changeset	206	ylabel={derivative size},
7cf9f17aa179 more Chengsong parents: 538 diff changeset	207	width = 7cm,
7cf9f17aa179 more Chengsong parents: 538 diff changeset	208	height = 4cm,
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	209	legend entries={Lexer without $\textit{bsimp}$},
539 7cf9f17aa179 more Chengsong parents: 538 diff changeset	210	legend pos= north west,
7cf9f17aa179 more Chengsong parents: 538 diff changeset	211	legend cell align=left]
7cf9f17aa179 more Chengsong parents: 538 diff changeset	212	\addplot[red,mark=*, mark options={fill=white}] table {BetterWaterloo.data};
7cf9f17aa179 more Chengsong parents: 538 diff changeset	213	\end{axis}
7cf9f17aa179 more Chengsong parents: 538 diff changeset	214	\end{tikzpicture}
7cf9f17aa179 more Chengsong parents: 538 diff changeset	215	\end{tabular}
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	216	\end{center}
538 8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	217
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	218	%----------------------------------------------------------------------------------------
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	219	% SECTION rewrite relation
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	220	%----------------------------------------------------------------------------------------
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	221	\section{The Rewriting Relation $\rrewrite$($\rightsquigarrow$)}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	222	The overall idea behind the correctness conjecture
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	223	\begin{conjecture}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	224	$\blexersimp \; r \; s = \blexer \; r\; s$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	225	\end{conjecture}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	226	is that $\textit{bsimp}$ does not change the regular expressions so much that
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	227	it becomes impossible to extract the $\POSIX$ values.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	228	To capture this ``similarity'' between unsimplified regular expressions and simplified ones,
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	229	we devise the rewriting relation $\rrewrite$, written infix as $\rightsquigarrow$:
538 8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	230
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	231	\begin{center}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	232	\begin{mathpar}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	233	\inferrule{}{_{bs} \ZERO \cdot r_2 \rightsquigarrow \ZERO\\}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	234
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	235	\inferrule{}{_{bs} r_1 \cdot \ZERO \rightsquigarrow \ZERO\\}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	236
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	237	\inferrule{}{_{bs_1} ((_{bs_2} \ONE) \cdot r) \rightsquigarrow \fuse \; (bs_1 @ bs_2) \; r\\}\\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	238
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	239
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	240
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	241	\inferrule{\\ r_1 \rightsquigarrow r_2}{_{bs} r_1 \cdot r_3 \rightsquigarrow _{bs} r_2 \cdot r_3\\}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	242
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	243	\inferrule{\\ r_3 \rightsquigarrow r_4}{_{bs} r_1 \cdot r_3 \rightsquigarrow _{bs} r_1 \cdot r_4\\}\\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	244
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	245	\inferrule{}{ _{bs}\sum [] \rightsquigarrow \ZERO\\}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	246
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	247	\inferrule{}{ _{bs}\sum [a] \rightsquigarrow \fuse \; bs \; a\\}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	248
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	249	\inferrule{\\ rs_1 \stackrel{s}{\rightsquigarrow} rs_2}{_{bs}\sum rs_1 \rightsquigarrow _{bs}\sum rs_2\\}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	250
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	251	\inferrule{}{\\ [] \stackrel{s}{\rightsquigarrow} []}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	252
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	253	\inferrule{rs_1 \stackrel{s}{\rightsquigarrow} rs_2}{\\ r :: rs_1 \stackrel{s}{\rightsquigarrow} r :: rs_2}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	254
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	255	\inferrule{r_1 \rightsquigarrow r_2}{ r_1 :: rs \stackrel{s}{\rightsquigarrow} r_2 :: rs\\}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	256
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	257	\inferrule{}{\ZERO :: rs \stackrel{s}{\rightsquigarrow} rs}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	258
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	259	\inferrule{}{(_{bs_1} \sum rs_1) :: rs_b \stackrel{s}{\rightsquigarrow} ((\map \; (\fuse \; bs_1) \; rs_1) @ rs_b) }
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	260
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	261	\inferrule{\\ \rerase{a_1} = \rerase{a_2}}{rs_a @ [a_1] @ rs_b @ [a_2] @ rs_c \stackrel{s}{\rightsquigarrow} rs_a @ [a_1] @ rs_b @ rs_c}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	262
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	263	\end{mathpar}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	264	\end{center}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	265	These ``rewrite'' steps define the atomic simplifications that our simplification algorithm can apply
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	266	to regular expressions.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	267	For convenience, we define a relation $\stackrel{s}{\rightsquigarrow}$ for rewriting
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	268	a list of regular expressions to another list.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	269	The $\rerase{}$ function is used instead of $_\downarrow$ because of the finiteness bound proof in the next chapter,
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	270	which we will discuss later. For the moment the reader can assume that it does basically the same thing as $\erase$.
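
\noindent
For example (an illustration, using the rule that lifts $\stackrel{s}{\rightsquigarrow}$ to alternatives in the form
$rs_1 \stackrel{s}{\rightsquigarrow} rs_2 \implies {}_{bs}\sum rs_1 \rightsquigarrow {}_{bs}\sum rs_2$),
removing a $\ZERO$ from a two-element alternative takes two atomic steps:
\begin{center}
${}_{bs}\sum [\ZERO, a] \;\rightsquigarrow\; {}_{bs}\sum [a] \;\rightsquigarrow\; \fuse \; bs \; a$
\end{center}
\noindent
The first step uses $\ZERO :: rs \stackrel{s}{\rightsquigarrow} rs$ with $rs = [a]$ underneath the $\sum$ constructor,
and the second step is the rule for singleton alternatives.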
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	271
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	272	Usually more than one step takes place during the simplification of a regular expression,
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	273	therefore we define the reflexive transitive closures of the $\rightsquigarrow$ and $\stackrel{s}{\rightsquigarrow}$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	274	relations:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	275
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	276	\begin{center}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	277	\begin{mathpar}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	278	\inferrule{}{ r \stackrel{*}{\rightsquigarrow} r \\}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	279	\inferrule{}{\\ rs \stackrel{s*}{\rightsquigarrow} rs \\}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	280	\inferrule{\\r_1 \stackrel{*}{\rightsquigarrow} r_2 \land \; r_2 \stackrel{*}{\rightsquigarrow} r_3}{r_1 \stackrel{*}{\rightsquigarrow} r_3\\}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	281	\inferrule{\\rs_1 \stackrel{s*}{\rightsquigarrow} rs_2 \land \; rs_2 \stackrel{s*}{\rightsquigarrow} rs_3}{rs_1 \stackrel{s*}{\rightsquigarrow} rs_3}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	282	\end{mathpar}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	283	\end{center}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	284	Now that we have modelled the rewriting behaviour of our simplifier $\bsimp$, we prove
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	285	three main properties about how these relations connect to $\blexersimp$:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	286	\begin{itemize}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	287	\item
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	288	$r \stackrel{*}{\rightsquigarrow} \bsimp{r}$.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	289	The algorithm $\bsimp$ only transforms the regex $r$ using steps specified by
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	290	$\stackrel{*}{\rightsquigarrow}$ and nothing else.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	291	\item
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	292	$r \stackrel{*}{\rightsquigarrow} r' \implies r \backslash c \stackrel{*}{\rightsquigarrow} r'\backslash c$.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	293	The relation $\stackrel{*}{\rightsquigarrow}$ is preserved under derivatives.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	294	\item
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	295	$(r \stackrel{*}{\rightsquigarrow} r'\land \bnullable \; r) \implies \bmkeps \; r = \bmkeps \; r'$. If we reach another
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	296	expression in finitely many atomic simplification steps, then these two regular expressions
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	297	produce the same bit-codes under the bit collection function $\bmkeps$ used by our $\blexer$.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	298
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	299	\end{itemize}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	300	\section{Three Important Properties}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	301	These properties work together towards the correctness theorem.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	302	We now prove each of them in turn.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	303	\subsection{$(r \stackrel{*}{\rightsquigarrow} r'\land \bnullable \; r) \implies \bmkeps \; r = \bmkeps \; r'$ and $r \stackrel{*}{\rightsquigarrow} \bsimp{r}$}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	304	The first few properties we establish are that the inference rules we gave for $\rightsquigarrow$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	305	and $\stackrel{s}{\rightsquigarrow}$ also hold as implications for $\stackrel{*}{\rightsquigarrow}$ and
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	306	$\stackrel{s*}{\rightsquigarrow}$.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	307	\begin{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	308	$rs_1 \stackrel{s*}{\rightsquigarrow} rs_2 \implies _{bs} \sum rs_1 \stackrel{*}{\rightsquigarrow} _{bs} \sum rs_2$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	309	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	310	\begin{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	311	By rule induction of $\stackrel{s*}{\rightsquigarrow}$.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	312	\end{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	313	\begin{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	314	$r \stackrel{*}{\rightsquigarrow} r' \implies _{bs} \sum r :: rs \stackrel{*}{\rightsquigarrow} _{bs} \sum r' :: rs$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	315	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	316	\begin{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	317	By rule induction of $\stackrel{*}{\rightsquigarrow} $.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	318	\end{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	319	\noindent
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	320	Then we establish that the $\stackrel{s}{\rightsquigarrow}$ and $\stackrel{s*}{\rightsquigarrow}$ relations are preserved under prepending and appending a list:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	321	\begin{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	322	$rs_1 \stackrel{s}{\rightsquigarrow} rs_2 \implies rs @ rs_1 \stackrel{s}{\rightsquigarrow} rs @ rs_2$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	323	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	324	\begin{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	325	By induction on the list $rs$.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	326	\end{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	327
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	328	\begin{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	329	$rs_1 \stackrel{s*}{\rightsquigarrow} rs_2 \implies rs @ rs_1 \stackrel{s*}{\rightsquigarrow} rs @ rs_2$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	330	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	331	\begin{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	332	By rule induction of $\stackrel{s*}{\rightsquigarrow}$.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	333	\end{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	334
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	335	\noindent
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	336	The $\stackrel{s}{\rightsquigarrow} $ relation after appending a list becomes $\stackrel{s*}{\rightsquigarrow}$:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	337	\begin{lemma}\label{ssgqTossgs}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	338	$rs_1 \stackrel{s}{\rightsquigarrow} rs_2 \implies rs_1 @ rs \stackrel{s*}{\rightsquigarrow} rs_2 @ rs$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	339	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	340	\begin{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	341	By rule induction of $\stackrel{s}{\rightsquigarrow}$.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	342	\end{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	343	\begin{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	344	$rs_1 \stackrel{s*}{\rightsquigarrow} rs_2 \implies rs_1 @ rs \stackrel{s*}{\rightsquigarrow} rs_2 @ rs$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	345	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	346	\begin{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	347	By rule induction of $\stackrel{s*}{\rightsquigarrow}$ and using \ref{ssgqTossgs}.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	348	\end{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	349	Here are two lemmas relating $\stackrel{*}{\rightsquigarrow}$ and $\stackrel{s*}{\rightsquigarrow}$:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	350	\begin{lemma}\label{singleton}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	351	$r_1 \stackrel{*}{\rightsquigarrow} r_2 \implies [r_1] \stackrel{s*}{\rightsquigarrow} [r_2]$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	352	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	353	\begin{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	354	By rule induction of $ \stackrel{*}{\rightsquigarrow} $.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	355	\end{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	356	\begin{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	357	$rs_3 \stackrel{s*}{\rightsquigarrow} rs_4 \land r_1 \stackrel{*}{\rightsquigarrow} r_2 \implies
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	358	r_1 :: rs_3 \stackrel{s*}{\rightsquigarrow} r_2 :: rs_4$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	359	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	360	\begin{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	361	By using \ref{singleton}.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	362	\end{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	363	Now we get to the ``meaty'' part of the proof, which relates the relations $\stackrel{s*}{\rightsquigarrow}$ and
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	364	$\stackrel{*}{\rightsquigarrow} $ with the components of our simplification such as $\distinctBy$ and $\flts$.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	365	The first lemma below says that for a list made of two parts $rs_1 @ rs_2$, one can throw away the duplicate
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	366	elements in $rs_2$, as well as those that have appeared in $rs_1$:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	367	\begin{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	368	$rs_1 @ rs_2 \stackrel{s*}{\rightsquigarrow} (rs_1 @ (\distinctBy \; rs_2 \; \; \rerase{\_}\; \; (\map\;\; \rerase{\_}\; \; rs_1)))$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	369	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	370	\begin{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	371	By induction on $rs_2$, where $rs_1$ is allowed to be arbitrary.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	372	\end{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	373	The above has a corollary that matches the way $\distinctBy$ is actually called in $\bsimp$:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	374	\begin{lemma}\label{dBPreserves}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	375	$rs_1 \stackrel{s*}{\rightsquigarrow} \distinctBy \; rs_1 \; \rerase{\_} \; \phi$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	376	\end{lemma}
538 8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	377
8016a2480704 intro and chap2 Chengsong parents: 532 diff changeset	378
532 cc54ce075db5 restructured Chengsong parents: diff changeset	379
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	380	The flatten function $\flts$ also stays within the $\stackrel{s*}{\rightsquigarrow}$ relation:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	381	\begin{lemma}\label{fltsPreserves}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	382	$rs \stackrel{s*}{\rightsquigarrow} \flts \; rs$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	383	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	384
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	385	Many-step rewriting is composable with the regular expression constructors:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	386	\begin{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	387	$r_1 \stackrel{*}{\rightsquigarrow} r_2 \implies _{bs} r_1 \cdot r_3 \stackrel{*}{\rightsquigarrow} \; _{bs} r_2 \cdot r_3 \quad $ and
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	388	$r_3 \stackrel{*}{\rightsquigarrow} r_4 \implies _{bs} r_1 \cdot r_3 \stackrel{*}{\rightsquigarrow} _{bs} \; r_1 \cdot r_4$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	389	\end{lemma}
532 cc54ce075db5 restructured Chengsong parents: diff changeset	390
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	391	Many-step rewriting, both $\stackrel{*}{\rightsquigarrow}$ and $\stackrel{s*}{\rightsquigarrow}$, is preserved under the function $\fuse$:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	392	\begin{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	393	$r_1 \stackrel{*}{\rightsquigarrow} r_2 \implies \fuse \; bs \; r_1 \stackrel{*}{\rightsquigarrow} \; \fuse \; bs \; r_2 \quad $ and
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	394	$rs_1 \stackrel{s*}{\rightsquigarrow} rs_2 \implies \map \; (\fuse \; bs) \; rs_1 \stackrel{s*}{\rightsquigarrow} \map \; (\fuse \; bs) \; rs_2$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	395	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	396	\begin{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	397	By the properties $r_1 \rightsquigarrow r_2 \implies \fuse \; bs \; r_1 \rightsquigarrow \fuse \; bs \; r_2$ and
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	398	$rs_2 \stackrel{s}{\rightsquigarrow} rs_3 \implies \map \; (\fuse \; bs) rs_2 \stackrel{s*}{\rightsquigarrow} \map \; (\fuse \; bs)\; rs_3$.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	399	\end{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	400	\noindent
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	401	If we could rewrite a regular expression in many steps to $\ZERO$, then
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	402	we could also rewrite any sequence containing it to $\ZERO$:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	403	\begin{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	404	$r_1 \stackrel{*}{\rightsquigarrow} \ZERO \implies _{bs}r_1\cdot r_2 \stackrel{*}{\rightsquigarrow} \ZERO$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	405	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	406	\begin{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	407	$\bmkeps \; (r \backslash s) = \bmkeps \; \bderssimp{r}{s}$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	408	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	409	\noindent
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	410	The function $\bsimpalts$ preserves rewritability:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	411	\begin{lemma}\label{bsimpaltsPreserves}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	412	$_{bs} \sum rs \stackrel{*}{\rightsquigarrow} \bsimpalts \; bs \; rs$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	413	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	414	\noindent
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	415	Before we state the next lemmas, we define a predicate saying that a list of regular expressions
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	416	contains at least one nullable regular expression:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	417	\begin{definition}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	418	$\textit{bnullables} \; rs \dn \exists r \in rs. \; \bnullable \; r$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	419	\end{definition}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	420	The rewriting relation $\rightsquigarrow$ preserves nullability:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	421	\begin{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	422	$r_1 \rightsquigarrow r_2 \implies \bnullable \; r_1 = \bnullable \; r_2$ and
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	423	$rs_1 \stackrel{s}{\rightsquigarrow} rs_2 \implies \textit{bnullables} \; rs_1 = \textit{bnullables} \; rs_2$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	424	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	425	\begin{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	426	By rule induction of $\rightsquigarrow$ and $\stackrel{s}{\rightsquigarrow}$.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	427	\end{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	428	So does the many steps rewriting:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	429	\begin{lemma}\label{rewritesBnullable}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	430	$r_1 \stackrel{*}{\rightsquigarrow} r_2 \implies \bnullable \; r_1 = \bnullable \; r_2$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	431	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	432	\begin{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	433	By rule induction of $\stackrel{*}{\rightsquigarrow} $.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	434	\end{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	435	\noindent
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	436	And if both regular expressions in a rewriting relation are nullable, then they
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	437	produce the same bit-codes:
532 cc54ce075db5 restructured Chengsong parents: diff changeset	438
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	439	\begin{lemma}\label{rewriteBmkepsAux}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	440	$r_1 \rightsquigarrow r_2 \implies (\bnullable \; r_1 \land \bnullable \; r_2 \implies \bmkeps \; r_1 =
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	441	\bmkeps \; r_2)$ and
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	442	$rs_1 \stackrel{s}{\rightsquigarrow} rs_2 \implies (\bnullables \; rs_1 \land \bnullables \; rs_2 \implies
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	443	\bmkepss \; rs_1 = \bmkepss \; rs_2)$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	444	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	445	\noindent
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	446	The function $\bmkepss$ extracts the bit-codes of the first element of the list $rs$ that
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	447	is $\bnullable$:
532 cc54ce075db5 restructured Chengsong parents: diff changeset	448	\begin{center}
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	449	\begin{tabular}{lcl}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	450	$\bmkepss \; [] $ & $\dn$ & $[]$\\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	451	$\bmkepss \; (r :: rs)$ & $\dn$ & $\textit{if} \; \bnullable \; r \; \textit{then} \; (\bmkeps \; r) \; \textit{else} \; \bmkepss \; rs$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	452	\end{tabular}
532 cc54ce075db5 restructured Chengsong parents: diff changeset	453	\end{center}
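\noindent
As a small illustration of this definition (the list is hypothetical and chosen only for the example), take $rs = [r_1, r_2, r_3]$ where $r_1$ is not $\bnullable$ while $r_2$ and $r_3$ both are. Unfolding the definition first takes the $\textit{else}$ branch for $r_1$ and then the $\textit{then}$ branch for $r_2$, so the bits of $r_3$ are never inspected:
\begin{center}
\begin{tabular}{lcl}
$\bmkepss \; (r_1 :: [r_2, r_3])$ & $=$ & $\bmkepss \; (r_2 :: [r_3])$\\
 & $=$ & $\bmkeps \; r_2$
\end{tabular}
\end{center}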
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	454	\noindent
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	455	And now we are ready to prove the key property that if you
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	456	have two regular expressions, one rewritable in many steps to the other,
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	457	and one of them is $\bnullable$, then they will both yield the same bits under $\bmkeps$:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	458	\begin{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	459	$(r \stackrel{*}{\rightsquigarrow} r'\land \bnullable \; r) \implies \bmkeps \; r = \bmkeps \; r'$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	460	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	461	\begin{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	462	By rule induction of $\stackrel{*}{\rightsquigarrow} $, using \ref{rewriteBmkepsAux} and \ref{rewritesBnullable}.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	463	\end{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	464	\noindent
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	465	The other property is also ready:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	466	\begin{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	467	$r \stackrel{*}{\rightsquigarrow} \bsimp{r}$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	468	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	469	\begin{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	470	By an induction on $r$.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	471	The most difficult case is the alternative case, where, using properties such as \ref{bsimpaltsPreserves}, \ref{fltsPreserves} and \ref{dBPreserves}, we can successively rewrite a list as follows:\\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	472	$rs \stackrel{s*}{\rightsquigarrow} \map \; \textit{bsimp} \; rs$\\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	473	$\ldots \stackrel{s*}{\rightsquigarrow} \flts \; (\map \; \textit{bsimp} \; rs)$\\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	474	$\ldots \;\stackrel{s*}{\rightsquigarrow} \distinctBy \; (\flts \; (\map \; \textit{bsimp}\; rs)) \; \rerase \; \phi$\\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	475	Then we can do the following many-step rewrite on regular expressions:\\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	476	$ _{bs} \sum \distinctBy \; (\flts \; (\map \; \textit{bsimp}\; rs)) \; \rerase \; \phi \stackrel{*}{\rightsquigarrow} \bsimpalts \; bs \; (\distinctBy \; (\flts \; (\map \; \textit{bsimp}\; rs)) \; \rerase \; \phi)$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	477	\\
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	478
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	479	\end{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	480	\section{Proof for the Property: $r_1 \stackrel{*}{\rightsquigarrow} r_2 \implies r_1 \backslash c \stackrel{*}{\rightsquigarrow} r_2 \backslash c$}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	481	The first thing we prove is that if we could rewrite in one step, then after derivative
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	482	we could rewrite in many steps:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	483	\begin{lemma}\label{rewriteBder}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	484	$r_1 \rightsquigarrow r_2 \implies r_1 \backslash c \stackrel{*}{\rightsquigarrow} r_2 \backslash c$ and
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	485	$rs_1 \stackrel{s}{\rightsquigarrow} rs_2 \implies \map \; (\_\backslash c) \; rs_1 \stackrel{s*}{\rightsquigarrow} \map \; (\_ \backslash c) \; rs_2$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	486	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	487	\begin{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	488	By induction on $\rightsquigarrow$ and $\stackrel{s}{\rightsquigarrow}$, using a number of the previous lemmas.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	489	\end{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	490	Now we can prove that once we could rewrite from one expression to another in many steps,
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	491	then after a derivative on both sides we could still rewrite one to another in many steps:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	492	\begin{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	493	$r_1 \stackrel{*}{\rightsquigarrow} r_2 \implies r_1 \backslash c \stackrel{*}{\rightsquigarrow} r_2 \backslash c$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	494	\end{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	495	\begin{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	496	By rule induction of $\stackrel{*}{\rightsquigarrow} $ and using the previous lemma \ref{rewriteBder}.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	497	\end{proof}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	498	\noindent
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	499	This can be extended and combined with the previous two important properties
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	500	so that a regular expression's successive derivatives can be rewritten in many steps
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	501	to its simplified counterpart:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	502	\begin{lemma}\label{bderBderssimp}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	503	$a \backslash s \stackrel{*}{\rightsquigarrow} \bderssimp{a}{s} $
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	504	\end{lemma}
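\noindent
A possible proof sketch proceeds by induction on $s$, with $a$ generalised. It relies on assumptions that are not restated in this chapter: that simplified derivatives satisfy $\bderssimp{a}{[]} = a$ and $\bderssimp{a}{(c :: s)} = \bderssimp{(\bsimp{a \backslash c})}{s}$, that $\stackrel{*}{\rightsquigarrow}$ is reflexive, and that the preservation lemma above extends from single characters to strings. The base case is then immediate. For the step case we can chain
\begin{center}
\begin{tabular}{lcl}
$a \backslash (c :: s)$ & $=$ & $(a \backslash c) \backslash s$\\
 & $\stackrel{*}{\rightsquigarrow}$ & $(\bsimp{a \backslash c}) \backslash s$\\
 & $\stackrel{*}{\rightsquigarrow}$ & $\bderssimp{(\bsimp{a \backslash c})}{s}$\\
 & $=$ & $\bderssimp{a}{(c :: s)}$
\end{tabular}
\end{center}
\noindent
where the first rewrite uses $r \stackrel{*}{\rightsquigarrow} \bsimp{r}$ followed by taking derivatives with respect to $s$, and the second is the induction hypothesis instantiated with $\bsimp{a \backslash c}$.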
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	505	\subsection{Main Theorem}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	506	Now with \ref{bderBderssimp} we are ready for the main theorem.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	507	To link $\blexersimp$ and $\blexer$,
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	508	we first say that they give out the same bits, if the lexing result is a match:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	509	\begin{lemma}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	510	$\bnullable \; (a \backslash s) \implies \bmkeps \; (a \backslash s) = \bmkeps \; (\bderssimp{a}{s})$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	511	\end{lemma}
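\noindent
One way to see this is to combine the results established so far. By \ref{bderBderssimp} we have $a \backslash s \stackrel{*}{\rightsquigarrow} \bderssimp{a}{s}$, and by \ref{rewritesBnullable} the simplified derivative $\bderssimp{a}{s}$ is $\bnullable$ whenever $a \backslash s$ is, so both sides of the equation are meaningful. The claim is then the key property about $\bmkeps$, instantiated with $a \backslash s$ for $r$ and $\bderssimp{a}{s}$ for $r'$:
\[
(a \backslash s \stackrel{*}{\rightsquigarrow} \bderssimp{a}{s} \land \bnullable \; (a \backslash s)) \implies \bmkeps \; (a \backslash s) = \bmkeps \; (\bderssimp{a}{s})
\]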
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	512	\noindent
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	513	Since they give out the same bits, we know that they give the same value after decoding,
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	514	which we know is the correct value as $\blexer$ is correct:
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	515	\begin{theorem}
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	516	$\blexer \; r \; s = \blexersimp{r}{s}$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	517	\end{theorem}
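\noindent
To see how the pieces fit together, here is a sketch under the assumption (the definitions are not repeated in this chapter) that both lexers internalise $r$ into an annotated regular expression $a$, test the final derivative with $\bnullable$, and decode the bits collected by $\bmkeps$. The names $\textit{decode}$ and $\textit{None}$ are used here only for illustration:
\begin{center}
\begin{tabular}{lcl}
$\blexer \; r \; s$ & $=$ & $\textit{if} \; \bnullable \; (a \backslash s) \; \textit{then} \; \textit{decode} \; (\bmkeps \; (a \backslash s)) \; r \; \textit{else} \; \textit{None}$\\
$\blexersimp{r}{s}$ & $=$ & $\textit{if} \; \bnullable \; (\bderssimp{a}{s}) \; \textit{then} \; \textit{decode} \; (\bmkeps \; (\bderssimp{a}{s})) \; r \; \textit{else} \; \textit{None}$
\end{tabular}
\end{center}
\noindent
Under this reading, the two conditions agree by \ref{bderBderssimp} and \ref{rewritesBnullable}, and in the nullable case the collected bits agree by the preceding lemma, so the two right-hand sides coincide.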
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	518	\noindent
532 cc54ce075db5 restructured Chengsong parents: diff changeset	519
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	520	\subsection{Comments on the Proof Techniques Used}
532 cc54ce075db5 restructured Chengsong parents: diff changeset	521	The non-trivial part of proving the correctness of the algorithm with simplification
cc54ce075db5 restructured Chengsong parents: diff changeset	522	compared with not having simplification is that we can no longer use the argument
cc54ce075db5 restructured Chengsong parents: diff changeset	523	in \cref{flex_retrieve}.
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	524	The function \retrieve needs the cumbersome structure of the (unsimplified)
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	525	annotated regular expression to
532 cc54ce075db5 restructured Chengsong parents: diff changeset	526	agree with the structure of the value, but simplification will always mess with the
543 b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	527	structure.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	528
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	529	We also tried to prove $\bsimp{\bderssimp{a}{s}} = \bsimp{a\backslash s}$,
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	530	but this turns out not to be true. A counterexample is
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	531	\[ a = [(1+c)\cdot [aa \cdot (1+c)]] \land s = aa
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	532	\]
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	533
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	534	Then we would have $\bsimp{a \backslash s}$ being
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	535	$_{[]}(_{ZZ}\ONE + _{ZS}c ) $
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	536	whereas $\bsimp{\bderssimp{a}{s}}$ would be
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	537	$_{Z}(_{Z} \ONE + _{S} c)$.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	538	Unfortunately, if we apply $\textit{bsimp}$ at different
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	539	stages we will always have this discrepancy. Whether the
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	540	$\map \; (\fuse\; bs) \; as$ operation in $\textit{bsimp}$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	541	is performed at a given point depends entirely on when the simplification
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	542	takes place and on whether there is a larger alternative structure surrounding the
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	543	alternative being simplified.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	544	The good thing about $\stackrel{*}{\rightsquigarrow} $ is that it allows
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	545	us not to specify how exactly the ``atomic'' simplification steps $\rightsquigarrow$
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	546	are taken, but simply to say that they can be taken to make two similar
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	547	regular expressions equal, and that this can be done after interleaving derivatives
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	548	and simplifications.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	549
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	550
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	551	Having a correctness property is good, but we would also like the lexer to be efficient in
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	552	some sense, for example, not grinding to a halt in certain cases.
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	553	In the next chapter we shall prove that for a given $r$, the size of the internal derivatives is always
b2bea5968b89 thesis_thys Chengsong parents: 539 diff changeset	554	bounded by a constant.

author	Chengsong
	Fri, 24 Jun 2022 21:49:23 +0100
changeset 553	0f00d440f484
parent 543	b2bea5968b89
child 576	3e1b699696b6