ChengsongTanPhdThesis/Chapters/Finite.tex
changeset 554 15d182ffbc76
parent 553 0f00d440f484
child 555 aecf1ddf3541
equal deleted inserted replaced
553:0f00d440f484 554:15d182ffbc76
   134 $\rrexp$ give the exact correspondence between an annotated regular expression
   134 $\rrexp$ give the exact correspondence between an annotated regular expression
   135 and its (r-)erased version:
   135 and its (r-)erased version:
   136 \begin{lemma}
   136 \begin{lemma}
   137 $\rsize{\rerase a} = \asize a$
   137 $\rsize{\rerase a} = \asize a$
   138 \end{lemma}
   138 \end{lemma}
       
   139 \noindent
   139 This does not hold for plain $\rexp$s. 
   140 This does not hold for plain $\rexp$s. 
   140 
   141 
   141 Similarly we could define the derivative  and simplification on 
   142 Similarly we could define the derivative  and simplification on 
   142 $\rrexp$, which would be identical to those we defined for plain $\rexp$s in chapter1, 
   143 $\rrexp$, which would be identical to those we defined for plain $\rexp$s in chapter1, 
   143 except that now they can operate on alternatives taking multiple arguments.
   144 except that now they can operate on alternatives taking multiple arguments.
   195            			    &        & $\textit{else}\; r::\rdistinct{rs}{(rset \cup \{r\})}$
   196            			    &        & $\textit{else}\; r::\rdistinct{rs}{(rset \cup \{r\})}$
   196 \end{tabular}
   197 \end{tabular}
   197 \end{center}
   198 \end{center}
   198 %TODO: definition of rsimp (maybe only the alternative clause)
   199 %TODO: definition of rsimp (maybe only the alternative clause)
   199 \noindent
   200 \noindent
       
   201 The prefix $r$ in front of $\rdistinct{}{}$ is used mainly to 
       
   202 differentiate with $\textit{distinct}$, which is a built-in predicate
       
   203 in Isabelle that says all the elements of a list are unique.
   200 With $\textit{rdistinct}$ one can chain together all the other modules
   204 With $\textit{rdistinct}$ one can chain together all the other modules
   201 of $\bsimp{\_}$ (removing the functionalities related to bit-sequences)
   205 of $\bsimp{\_}$ (removing the functionalities related to bit-sequences)
   202 and get $\textit{rsimp}$ and $\rderssimp{\_}{\_}$.
   206 and get $\textit{rsimp}$ and $\rderssimp{\_}{\_}$.
   203 We omit these functions, as they are routine. Please refer to the formalisation
   207 We omit these functions, as they are routine. Please refer to the formalisation
   204 (in file BasicIdentities.thy) for the exact definition.
   208 (in file BasicIdentities.thy) for the exact definition.
   207 counterpart with the unlifted version of simplification and derivatives applied.
   211 counterpart with the unlifted version of simplification and derivatives applied.
   208 \begin{lemma}
   212 \begin{lemma}
   209 	The following equalities hold:
   213 	The following equalities hold:
   210 	\begin{itemize}
   214 	\begin{itemize}
   211 		\item
   215 		\item
   212 $\asize{\bsimp{a}} = \rsize{\rsimp{\rerase{a}}}$
   216 			$\asize{\bsimp{a}} = \rsize{\rsimp{\rerase{a}}}$
   213 \item
   217 		\item
   214 $\asize{\bderssimp{r}{s}} =  \rsize{\rderssimp{\rerase{r}}{s}}$
   218 			$\asize{\bderssimp{r}{s}} =  \rsize{\rderssimp{\rerase{r}}{s}}$
   215 \end{itemize}
   219 	\end{itemize}
   216 \end{lemma}
   220 \end{lemma}
   217 \noindent
   221 \noindent
   218 In the following content, we will focus on $\rrexp$'s size bound.
   222 In the following content, we will focus on $\rrexp$'s size bound.
   219 We will piece together this bound and show the same bound for annotated regular 
   223 We will piece together this bound and show the same bound for annotated regular 
   220 expressions in the end.
   224 expressions in the end.
   295 that are in the accumulator, therefore prepending a list $rs_a$ with an arbitrary
   299 that are in the accumulator, therefore prepending a list $rs_a$ with an arbitrary
   296 list $rs$ whose elements are all from the accumulator, and then call $\rdistinct{\_}{\_}$
   300 list $rs$ whose elements are all from the accumulator, and then call $\rdistinct{\_}{\_}$
   297 on the resulting list, the output will be as if we had called $\rdistinct{\_}{\_}$
   301 on the resulting list, the output will be as if we had called $\rdistinct{\_}{\_}$
   298 without the prepending of $rs$:
   302 without the prepending of $rs$:
   299 \begin{lemma}
   303 \begin{lemma}
   300 	If $rs \subseteq rset$, then 
   304 	The elements appearing in the accumulator will always be removed.
   301 	$\rdistinct{rs@rsa }{acc} = \rdistinct{rsa }{acc}$.
   305 	More precisely,
   302 \end{lemma}
   306 	\begin{itemize}
       
   307 		\item
       
   308 			If $rs \subseteq rset$, then 
       
   309 			$\rdistinct{rs@rsa }{acc} = \rdistinct{rsa }{acc}$.
       
   310 		\item
       
   311 			Furthermore, if $a \in rset$ and $\rdistinct{rs}{\{a\}} = []$,
       
   312 			then $\rdistinct{(rs @ rs')}{rset} = \rdistinct{rs'}{rset}$
       
   313 	\end{itemize}
       
   314 \end{lemma}
       
   315 
   303 \begin{proof}
   316 \begin{proof}
   304 	By induction on $rs$.
   317 	By induction on $rs$.
   305 \end{proof}
   318 \end{proof}
   306 \noindent
   319 \noindent
   307 On the other hand, if an element $r$ does not appear in the input list waiting to be deduplicated,
   320 On the other hand, if an element $r$ does not appear in the input list waiting to be deduplicated,
   324 The next property gives the condition for
   337 The next property gives the condition for
   325 when $\rdistinct{\_}{\_}$ becomes an identical mapping
   338 when $\rdistinct{\_}{\_}$ becomes an identical mapping
   326 for any prefix of an input list, in other words, when can 
   339 for any prefix of an input list, in other words, when can 
   327 we ``push out" the arguments of $\rdistinct{\_}{\_}$:
   340 we ``push out" the arguments of $\rdistinct{\_}{\_}$:
   328 \begin{lemma}
   341 \begin{lemma}
   329 	If $rs_1$'s elements are all unique, and not appearing in the accumulator $acc$, 
   342 	If $\textit{isDistinct} \; rs_1$, and $rs_1 \cap acc = \varnothing$,
   330 	then it can be taken out and left untouched in the output:
   343 	then it can be taken out and left untouched in the output:
   331 	\[\textit{rdistinct}\;  (rs_1 @ rsa)\;\, acc
   344 	\[\textit{rdistinct}\;  (rs_1 @ rsa)\;\, acc
   332 	= rs_1@(\textit{rdistinct} rsa \; (acc \cup rs_1))\]
   345 	= rs_1@(\textit{rdistinct} rsa \; (acc \cup rs_1))\]
   333 \end{lemma}
   346 \end{lemma}
   334 
   347 \noindent
       
   348 The predicate $\textit{isDistinct}$ is for testing
       
   349 whether a list's elements are all unique. It is defined
       
   350 recursively on the structure of a regular expression,
       
   351 and we omit the precise definition here.
   335 \begin{proof}
   352 \begin{proof}
   336 By an induction on $rs_1$, where $rsa$ and $acc$ are allowed to be arbitrary.
   353 By an induction on $rs_1$, where $rsa$ and $acc$ are allowed to be arbitrary.
   337 \end{proof}
   354 \end{proof}
       
   355 \noindent
       
   356 $\rdistinct{}{}$ removes any element in anywhere of a list, if it
       
   357 had appeared previously:
       
   358 \begin{lemma}\label{distinctRemovesMiddle}
       
   359 	The two properties hold if $r \in rs$:
       
   360 	\begin{itemize}
       
   361 		\item
       
   362 			$\rdistinct{rs}{rset} = \rdistinct{(rs @ [r])}{rset}$
       
   363 			and 
       
   364 			$\rdistinct{(ab :: rs @ [ab])}{rset'} = \rdistinct{(ab :: rs)}{rset'}$
       
   365 		\item
       
   366 			$\rdistinct{ (rs @ rs') }{rset} = \rdistinct{rs @ [r] @ rs'}{rset}$
       
   367 			and
       
   368 			$\rdistinct{(ab :: rs @ [ab] @ rs'')}{rset'} = 
       
   369 			 \rdistinct{(ab :: rs @ rs'')}{rset'}$
       
   370 	\end{itemize}
       
   371 \end{lemma}
       
   372 \noindent
       
   373 \begin{proof}
       
   374 By induction on $rs$. All other variables are allowed to be arbitrary.
       
   375 The second half of the lemma requires the first half.
       
   376 Note that for each half's two sub-propositions need to be proven concurrently,
       
   377 so that the induction goes through.
       
   378 \end{proof}
       
   379 
   338 \subsubsection{The Properties of $\backslash_r$, $\backslash_{rsimp}$, $\textit{Rflts}$ and $\textit{Rsimp}_{ALTS}$} 
   380 \subsubsection{The Properties of $\backslash_r$, $\backslash_{rsimp}$, $\textit{Rflts}$ and $\textit{Rsimp}_{ALTS}$} 
   339 We give in this subsection some properties of how $\backslash_r$, $\backslash_{rsimp}$, $\textit{Rflts}$ and $\textit{Rsimp}_{ALTS} $ interact with each other and with $@$, the concatenation operator.
   381 We give in this subsection some properties of how $\backslash_r$, $\backslash_{rsimp}$, $\textit{Rflts}$ and $\textit{Rsimp}_{ALTS} $ interact with each other and with $@$, the concatenation operator.
   340 These will be helpful in later closed form proofs, when
   382 These will be helpful in later closed form proofs, when
   341 we want to transform the ways in which multiple functions involving
   383 we want to transform the ways in which multiple functions involving
   342 those are composed together
   384 those are composed together
   343 in interleaving derivative and  simplification steps.
   385 in interleaving derivative and  simplification steps.
   344 
   386 
   345 When the function $\textit{Rflts}$ 
   387 When the function $\textit{Rflts}$ 
   346 is applied to the concatenation of two lists, the output can be calculated by first applying the
   388 is applied to the concatenation of two lists, the output can be calculated by first applying the
   347 functions on two lists separately, and then concatenating them together.
   389 functions on two lists separately, and then concatenating them together.
   348 \begin{lemma}
   390 \begin{lemma}\label{rfltsProps}
   349 	The function $\rflts$ has the below properties:\\
   391 	The function $\rflts$ has the below properties:\\
   350 	\begin{itemize}
   392 	\begin{itemize}
   351 		\item
   393 		\item
   352 	$\rflts \; (rs_1 @ rs_2) = \rflts \; rs_1 @ \rflts \; rs_2$
   394 			$\rflts \; (rs_1 @ rs_2) = \rflts \; rs_1 @ \rflts \; rs_2$
   353 \item
   395 		\item
   354 	If $r \neq \RZERO$ and $\nexists rs_1. r = \RALTS{rs}_1$, then $\rflts \; (r::rs) = r :: \rflts \; rs$
   396 			If $r \neq \RZERO$ and $\nexists rs_1. r = \RALTS{rs}_1$, then $\rflts \; (r::rs) = r :: \rflts \; rs$
       
   397 		\item
       
   398 			$\rflts \; (rs @ [\RZERO]) = \rflts \; rs$
       
   399 		\item
       
   400 			$\rflts \; (rs' @ [\RALTS{rs}]) = \rflts \; rs'@rs$
       
   401 		\item
       
   402 			$\rflts \; (rs @ [\RONE]) = \rflts \; rs @ [\RONE]$
       
   403 		\item
       
   404 			If $r \neq \RZERO$ and $\nexists rs'. r = \RALTS{rs'}$ then $\rflts \; (rs @ [r])
       
   405 			= (\rflts \; rs) @ [r]$
   355 	\end{itemize}
   406 	\end{itemize}
   356 \end{lemma}
   407 \end{lemma}
   357 \noindent
   408 \noindent
   358 \begin{proof}
   409 \begin{proof}
   359 	By induction on $rs_1$.
   410 	By induction on $rs_1$ in the first part, and induction on $r$ in the second part,
   360 \end{proof}
   411 	and induction on $rs$, $rs'$ in the third and fourth sub-lemma.
   361 
   412 \end{proof}
   362 
   413 \subsubsection{The $RL$ Function: Language Interpretation of $\textit{Rrexp}$s}
       
   414 Much like the definition of $L$ on plain regular expressions, one could also 
       
   415 define the language interpretation of $\rrexp$s.
       
   416 \begin{center}
       
   417 \begin{tabular}{lcl}
       
   418 $RL \; (\ZERO)$ & $\dn$ & $\phi$\\
       
   419 $RL \; (\ONE)$ & $\dn$ & $\{[]\}$\\
       
   420 $RL \; (c)$ & $\dn$ & $\{[c]\}$\\
       
   421 $RL \; \sum rs$ & $\dn$ & $ \bigcup_{r \in rs} (RL \; r)$\\
       
   422 $RL \; (r_1 \cdot r_2)$ & $\dn$ & $ RL \; (r_1) @ RL \; (r_2)$\\
       
   423 $RL \; (r^*)$ & $\dn$ & $ (RL(r))^*$
       
   424 \end{tabular}
       
   425 \end{center}
       
   426 \noindent
       
   427 The main use of $RL$ is to establish some connections between $\rsimp{}$ 
       
   428 and $\rnullable{}$:
       
   429 \begin{lemma}
       
   430 	The following properties hold:
       
   431 	\begin{itemize}
       
   432 		\item
       
   433 			If $\rnullable{r}$, then $\rsimp{r} \neq \RZERO$.
       
   434 		\item
       
   435 			$\rnullable{r \backslash s} \quad $ if and only if $\quad \rnullable{\rderssimp{r}{s}}$.
       
   436 	\end{itemize}
       
   437 \end{lemma}
       
   438 \begin{proof}
       
   439 	The first part is by induction on $r$. 
       
   440 	The second part is true because property 
       
   441 	\[ RL \; r = RL \; (\rsimp{r})\] holds.
       
   442 \end{proof}
       
   443 	
       
   444 \subsubsection{$\rsimp{}$ is Non-Increasing}
       
   445 In this subsection, we prove that the function $\rsimp{}$ does not 
       
   446 make the $\llbracket \rrbracket_r$ size increase.
       
   447 
       
   448 
       
   449 \begin{lemma}\label{rsimpSize}
       
   450 	$\llbracket \rsimp{r} \rrbracket_r \leq \llbracket r \rrbracket_r$
       
   451 \end{lemma}
       
   452 \subsubsection{Simplified $\textit{Rrexp}$s are Good}
       
   453 We formalise the notion of ``good" regular expressions,
       
   454 which means regular expressions that
       
   455 are not fully simplified. For alternative regular expressions that means they
       
   456 do not contain any nested alternatives like 
       
   457 \[ r_1 + (r_2 + r_3) \], un-removed $\RZERO$s like \[\RZERO + r\]
       
   458 or duplicate elements in a children regular expression list like \[ \sum [r, r, \ldots]\]:
       
   459 \begin{center}
       
   460 	\begin{tabular}{@{}lcl@{}}
       
   461 		$\good\; \RZERO$ & $\dn$ & $\textit{false}$\\
       
   462 		$\good\; \RONE$ & $\dn$ & $\textit{true}$\\
       
   463 		$\good\; \RCHAR{c}$ & $\dn$ & $\btrue$\\
       
   464 		$\good\; \RALTS{[]}$ & $\dn$ & $\bfalse$\\
       
   465 		$\good\; \RALTS{[r]}$ & $\dn$ & $\bfalse$\\
       
   466 		$\good\; \RALTS{r_1 :: r_2 :: rs}$ & $\dn$ & 
       
   467 		$\textit{isDistinct} \; (r_1 :: r_2 :: rs) \;$\\
       
   468 		& & $\textit{and}\; (\forall r' \in (r_1 :: r_2 :: rs).\; \good \; r'\; \,  \textit{and}\; \, \textit{nonAlt}\; r')$\\
       
   469 		$\good \; \RSEQ{\RZERO}{r}$ & $\dn$ & $\bfalse$\\
       
   470 		$\good \; \RSEQ{\RONE}{r}$ & $\dn$ & $\bfalse$\\
       
   471 		$\good \; \RSEQ{r}{\RZERO}$ & $\dn$ & $\bfalse$\\
       
   472 		$\good \; \RSEQ{r_1}{r_2}$ & $\dn$ & $\good \; r_1 \;\, \textit{and} \;\, \good \; r_2$\\
       
   473 		$\good \; \RSTAR{r}$ & $\dn$ & $\btrue$\\
       
   474 	\end{tabular}
       
   475 \end{center}
       
   476 \noindent
       
   477 The predicate $\textit{nonAlt}$ evaluates to true when the regular expression is not an
       
   478 alternative, and false otherwise.
       
   479 The $\good$ property is preserved under $\rsimp_{ALTS}$, provided that
       
   480 its non-empty argument list of expressions are all good themsleves, and $\textit{nonAlt}$, 
       
   481 and unique:
       
   482 \begin{lemma}\label{rsimpaltsGood}
       
   483 	If $rs \neq []$ and forall $r \in rs. \textit{nonAlt} \; r$ and $\textit{isDistinct} \; rs$,
       
   484 	then $\good \; (\rsimpalts \; rs)$ if and only if forall $r \in rs. \; \good \; r$.
       
   485 \end{lemma}
       
   486 \noindent
       
   487 We also note that
       
   488 if a regular expression $r$ is good, then $\rflts$ on the singleton
       
   489 list $[r]$ will not break goodness:
       
   490 \begin{lemma}\label{flts2}
       
   491 	If $\good \; r$, then forall $r' \in \rflts \; [r]. \; \good \; r'$ and $\textit{nonAlt} \; r'$.
       
   492 \end{lemma}
       
   493 \begin{proof}
       
   494 	By an induction on $r$.
       
   495 \end{proof}
       
   496 \noindent
       
   497 The other observation we make about $\rsimp{r}$ is that it never
       
   498 comes with nested alternatives, which we describe as the $\nonnested$
       
   499 property:
       
   500 \begin{center}
       
   501 	\begin{tabular}{lcl}
       
   502 		$\nonnested \; \, \sum []$ & $\dn$ & $\btrue$\\
       
   503 		$\nonnested \; \, \sum ((\sum rs_1) :: rs_2)$ & $\dn$ & $\bfalse$\\
       
   504 		$\nonnested \; \, \sum (r :: rs)$ & $\dn$ & $\nonnested (\sum rs)$\\
       
   505 		$\nonnested \; \, r $ & $\dn$ & $\btrue$
       
   506 	\end{tabular}
       
   507 \end{center}
       
   508 \noindent
       
   509 The $\rflts$ function
       
   510 always opens up nested alternatives,
       
   511 which enables $\rsimp$ to be non-nested:
       
   512 
       
   513 \begin{lemma}\label{nonnestedRsimp}
       
   514 	$\nonnested \; (\rsimp{r})$
       
   515 \end{lemma}
       
   516 \begin{proof}
       
   517 	By an induction on $r$.
       
   518 \end{proof}
       
   519 \noindent
       
   520 With this we could prove that a regular expressions
       
   521 after simplification and flattening and de-duplication,
       
   522 will not contain any alternative regular expression directly:
       
   523 \begin{lemma}\label{nonaltFltsRd}
       
   524 	If $x \in \rdistinct{\rflts\; (\map \; \rsimp{} \; rs)}{\varnothing}$ 
       
   525 	then $\textit{nonAlt} \; x$.
       
   526 \end{lemma}
       
   527 \begin{proof}
       
   528 	By \ref{nonnestedRsimp}.
       
   529 \end{proof}
       
   530 \noindent
       
   531 The other thing we know is that once $\rsimp{}$ had finished
       
   532 processing an alternative regular expression, it will not
       
   533 contain any $\RZERO$s, this is because all the recursive 
       
   534 calls to the simplification on the children regular expressions
       
   535 make the children good, and $\rflts$ will not take out
       
   536 any $\RZERO$s out of a good regular expression list,
       
   537 and $\rdistinct{}$ will not mess with the result.
       
   538 \begin{lemma}\label{flts3Obv}
       
   539 	The following are true:
       
   540 	\begin{itemize}
       
   541 		\item
       
   542 			If for all $r \in rs. \, \good \; r $ or $r = \RZERO$,
       
   543 			then for all $r \in \rflts\; rs. \, \good \; r$.
       
   544 		\item
       
   545 			If $x \in \rdistinct{\rflts\; (\map \; rsimp{}\; rs)}{\varnothing}$
       
   546 			and for all $y$ such that $\llbracket y \rrbracket_r$ less than
       
   547 			$\llbracket rs \rrbracket_r + 1$, either
       
   548 			$\good \; (\rsimp{y})$ or $\rsimp{y} = \RZERO$,
       
   549 			then $\good \; x$.
       
   550 	\end{itemize}
       
   551 \end{lemma}
       
   552 \begin{proof}
       
   553 	The first part is by induction on $rs$, where the induction
       
   554 	rule is the inductive cases for $\rflts$.
       
   555 	The second part is a corollary from the first part.
       
   556 \end{proof}
       
   557 
       
   558 And this leads to good structural property of $\rsimp{}$,
       
   559 that after simplification, a regular expression is
       
   560 either good or $\RZERO$:
       
   561 \begin{lemma}\label{good1}
       
   562 	For any r-regular expression $r$, $\good \; \rsimp{r}$ or $\rsimp{r} = \RZERO$.
       
   563 \end{lemma}
       
   564 \begin{proof}
       
   565 	By an induction on $r$. The inductive measure is the size $\llbracket \rrbracket_r$.
       
   566 	Lemma \ref{rsimpSize} says that 
       
   567 	$\llbracket \rsimp{r}\rrbracket_r$ is smaller than or equal to
       
   568 	$\llbracket r \rrbracket_r$.
       
   569 	Therefore, in the $r_1 \cdot r_2$ and $\sum rs$ case,
       
   570 	Inductive hypothesis applies to the children regular expressions
       
   571 	$r_1$, $r_2$, etc. The lemma \ref{flts3Obv}'s precondition is satisfied
       
   572 	by that as well.
       
   573 	The lemmas \ref{nonnestedRsimp} and  \ref{nonaltFltsRd} are used
       
   574 	to ensure that goodness is preserved at the topmost level.
       
   575 \end{proof}
       
   576 We shall prove that any good regular expression is 
       
   577 a fixed-point for $\rsimp{}$.
       
   578 First we prove an auxiliary lemma:
       
   579 \begin{lemma}\label{goodaltsNonalt}
       
   580 	If $\good \; \sum rs$, then $\rflts\; rs = rs$.
       
   581 \end{lemma}
       
   582 \begin{proof}
       
   583 	By an induction on $\sum rs$. The inductive rules are the cases
       
   584 	for $\good$.
       
   585 \end{proof}
       
   586 \noindent
       
   587 Now we are ready to prove that good regular expressions are invariant
       
   588 of $\rsimp{}$ application:
       
   589 \begin{lemma}\label{test}
       
   590 	If $\good \;r$ then $\rsimp{r} = r$.
       
   591 \end{lemma}
       
   592 \begin{proof}
       
   593 	By an induction on the inductive cases of $\good$.
       
   594 	The lemma \ref{goodaltsNonalt} is used in the alternative
       
   595 	case where 2 or more elements are present in the list.
       
   596 \end{proof}
       
   597 
       
   598 \subsubsection{$\rsimp$ is Idempotent}
       
   599 The idempotency of $\rsimp$ is very useful in 
       
   600 manipulating regular expression terms into desired
       
   601 forms so that key steps allowing further rewriting to closed forms
       
   602 are possible.
       
   603 \begin{lemma}\label{rsimpIdem}
       
   604 $\rsimp{r} = \rsimp{\rsimp{r}}$
       
   605 \end{lemma}
       
   606 
       
   607 \begin{proof}
       
   608 	By \ref{test} and \ref{good1}.
       
   609 \end{proof}
       
   610 \noindent
       
   611 This property means we do not have to repeatedly
       
   612 apply simplification in each step, which justifies
       
   613 our definition of $\blexersimp$.
       
   614 
       
   615 
       
   616 On the other hand, we could repeat the same $\rsimp{}$ applications
       
   617 on regular expressions as many times as we want, if we have at least
       
   618 one simplification applied to it, and apply it wherever we would like to:
       
   619 \begin{corollary}\label{headOneMoreSimp}
       
   620 	$\map \; \rsimp{(r :: rs)} = \map \; \rsimp{} \; (\rsimp{r} :: rs)$
       
   621 \end{corollary}
       
   622 \noindent
       
   623 This will be useful in later closed form proof's rewriting steps.
       
   624 Similarly, we point out the following useful facts below:
       
   625 \begin{lemma}
       
   626 	The following equalities hold if $r = \rsimp{r'}$ for some $r'$:
       
   627 	\begin{itemize}
       
   628 		\item
       
   629 			If $r  = \sum rs$ then $\rsimpalts \; rs = \sum rs$.
       
   630 		\item
       
   631 			If $r = \sum rs$ then $\rdistinct{rs}{\varnothing} = rs$.
       
   632 		\item
       
   633 			$\rsimpalts \; (\rdistinct{\rflts \; [r]}{\varnothing}) = r$.
       
   634 	\end{itemize}
       
   635 \end{lemma}
       
   636 \begin{proof}
       
   637 	By application of \ref{rsimpIdem} and \ref{good1}.
       
   638 \end{proof}
       
   639 
       
   640 \noindent
       
   641 With the idempotency of $\rsimp{}$ and its corollaries, 
       
   642 we can start proving some key equalities leading to the 
       
   643 closed forms.
       
   644 Now presented are a few equivalent terms under $\rsimp{}$.
       
   645 We use $r_1 \sequal r_2 $ here to denote $\rsimp{r_1} = \rsimp{r_2}$.
       
   646 \begin{lemma}
       
   647 \begin{itemize}
       
   648 	\item
       
   649 		$\rsimpalts \; (\RZERO :: rs) \sequal \rsimpalts\; rs$
       
   650 	\item
       
   651 		$\rsimpalts \; rs \sequal \rsimpalts (\map \; \rsimp{} \; rs)$
       
   652 	\item
       
   653 		$\RALTS{\RALTS{rs}} \sequal \RALTS{rs}$
       
   654 \end{itemize}
       
   655 \end{lemma}
       
   656 \noindent
       
   657 We need more equalities like the above to enable a closed form,
       
   658 but to proceed we need to introduce two rewrite relations,
       
   659 to make things smoother.
       
   660 \subsubsection{The rewrite relation $\hrewrite$ and $\grewrite$}
       
   661 Insired by the success we had in the correctness proof 
       
   662 in \ref{Bitcoded2}, where we invented
       
   663 a term rewriting system to capture the similarity between terms
       
   664 and prove equivalence, we follow suit here defining simplification
       
   665 steps as rewriting steps.
       
   666 The presentation will be more concise than that in \ref{Bitcoded2}.
       
   667 To differentiate between the rewriting steps for annotated regular expressions
       
   668 and $\rrexp$s, we add characters $h$ and $g$ below the squig arrow symbol
       
   669 to mean atomic simplification transitions 
       
   670 of $\rrexp$s and $\rrexp$ lists, respectively.
       
   671 
       
   672 \begin{center}
       
   673 
       
   674 List of 1-step rewrite rules for regular expressions simplification without bitcodes:
       
   675 \begin{figure}
       
   676 \caption{the "h-rewrite" rules}
       
   677 \[
       
   678 r_1 \cdot \ZERO \rightarrow_h \ZERO \]
       
   679 
       
   680 \[
       
   681 \ZERO \cdot r_2 \rightarrow \ZERO 
       
   682 \]
       
   683 \end{figure}
       
   684 And we define an "grewrite" relation that works on lists:
       
   685 \begin{center}
       
   686 \begin{tabular}{lcl}
       
   687 $ \ZERO :: rs$ & $\rightarrow_g$ & $rs$
       
   688 \end{tabular}
       
   689 \end{center}
       
   690 
       
   691 
       
   692 
       
   693 With these relations we prove
       
   694 \begin{lemma}
       
   695 $rs \rightarrow rs'  \implies \RALTS{rs} \rightarrow \RALTS{rs'}$
       
   696 \end{lemma}
       
   697 which enables us to prove "closed forms" equalities such as
       
   698 \begin{lemma}
       
   699 $\sflat{(r_1 \cdot r_2) \backslash s} = \RALTS{ (r_1 \backslash s) \cdot r_2 :: (\map (r_2 \backslash \_) (\suffix \; s \; r_1 ))}$
       
   700 \end{lemma}
       
   701 
       
   702 But the most involved part of the above lemma is proving the following:
       
   703 \begin{lemma}
       
   704 $\bsimp{\RALTS{rs @ \RALTS{rs_1} @ rs'}} = \bsimp{\RALTS{rs @rs_1 @ rs'}} $ 
       
   705 \end{lemma}
       
   706 which is needed in proving things like 
       
   707 \begin{lemma}
       
   708 $r \rightarrow_f r'  \implies \rsimp{r} \rightarrow \rsimp{r'}$
       
   709 \end{lemma}
       
   710 
       
   711 Fortunately this is provable by induction on the list $rs$
   363 
   712 
   364 \subsection{A Closed Form for the Sequence Regular Expression}
   713 \subsection{A Closed Form for the Sequence Regular Expression}
   365 \begin{quote}\it
   714 \begin{quote}\it
   366 	Claim: For regular expressions $r_1 \cdot r_2$, we claim that
   715 	Claim: For regular expressions $r_1 \cdot r_2$, we claim that
   367 	\begin{center}
   716 	\begin{center}
   377 make the process of proving that a breeze.
   726 make the process of proving that a breeze.
   378 
   727 
   379 We define rewriting relations for $\rrexp$s, which allows us to do the 
   728 We define rewriting relations for $\rrexp$s, which allows us to do the 
   380 same trick as we did for the correctness proof,
   729 same trick as we did for the correctness proof,
   381 but this time we will have stronger equalities established.
   730 but this time we will have stronger equalities established.
   382 \subsection{"hrewrite" relation}
       
   383 List of 1-step rewrite rules for regular expressions simplification without bitcodes:
       
   384 \begin{figure}
       
   385 \caption{the "h-rewrite" rules}
       
   386 \[
       
   387 r_1 \cdot \ZERO \rightarrow_h \ZERO \]
       
   388 
       
   389 \[
       
   390 \ZERO \cdot r_2 \rightarrow \ZERO 
       
   391 \]
       
   392 \end{figure}
       
   393 And we define an "grewrite" relation that works on lists:
       
   394 \begin{center}
       
   395 \begin{tabular}{lcl}
       
   396 $ \ZERO :: rs$ & $\rightarrow_g$ & $rs$
       
   397 \end{tabular}
       
   398 \end{center}
       
   399 
       
   400 
       
   401 
       
   402 With these relations we prove
       
   403 \begin{lemma}
       
   404 $rs \rightarrow rs'  \implies \RALTS{rs} \rightarrow \RALTS{rs'}$
       
   405 \end{lemma}
       
   406 which enables us to prove "closed forms" equalities such as
       
   407 \begin{lemma}
       
   408 $\sflat{(r_1 \cdot r_2) \backslash s} = \RALTS{ (r_1 \backslash s) \cdot r_2 :: (\map (r_2 \backslash \_) (\suffix \; s \; r_1 ))}$
       
   409 \end{lemma}
       
   410 
       
   411 But the most involved part of the above lemma is proving the following:
       
   412 \begin{lemma}
       
   413 $\bsimp{\RALTS{rs @ \RALTS{rs_1} @ rs'}} = \bsimp{\RALTS{rs @rs_1 @ rs'}} $ 
       
   414 \end{lemma}
       
   415 which is needed in proving things like 
       
   416 \begin{lemma}
       
   417 $r \rightarrow_f r'  \implies \rsimp{r} \rightarrow \rsimp{r'}$
       
   418 \end{lemma}
       
   419 
       
   420 Fortunately this is provable by induction on the list $rs$
       
   421 
   731 
   422 
   732 
   423 \section{Estimating the Closed Forms' sizes}
   733 \section{Estimating the Closed Forms' sizes}
   424 
   734 
   425 		The closed form $f\; (g\; (\sum rs)) $ is controlled
   735 		The closed form $f\; (g\; (\sum rs)) $ is controlled
   785 
  1095 
   786 With this we can also add simplifications to both sides and get
  1096 With this we can also add simplifications to both sides and get
   787 \begin{lemma}
  1097 \begin{lemma}
   788 $\rsimp{\sflat{(r_1 \cdot r_2) \backslash s} }= \rsimp{\AALTS{[[]}{ (r_1 \backslash s) \cdot r_2 :: (\map (r_2 \backslash \_) (\vsuf{s}{r_1}))}}$
  1098 $\rsimp{\sflat{(r_1 \cdot r_2) \backslash s} }= \rsimp{\AALTS{[[]}{ (r_1 \backslash s) \cdot r_2 :: (\map (r_2 \backslash \_) (\vsuf{s}{r_1}))}}$
   789 \end{lemma}
  1099 \end{lemma}
   790 Together with the idempotency property of $\rsimp$,
  1100 Together with the idempotency property of $\rsimp{}$\ref{rsimpIdem},
   791 %TODO: state the idempotency property of rsimp
  1101 %TODO: state the idempotency property of rsimp
   792 \begin{lemma}
       
   793 $\rsimp{r} = \rsimp{\rsimp{r}}$
       
   794 \end{lemma}
       
   795 it is possible to convert the above lemma to obtain a "closed form"
  1102 it is possible to convert the above lemma to obtain a "closed form"
   796 for  derivatives nested with simplification:
  1103 for  derivatives nested with simplification:
   797 \begin{lemma}
  1104 \begin{lemma}
   798 $\rderssimp{(r_1 \cdot r_2)}{s} = \rsimp{\AALTS{[[]}{ (r_1 \backslash s) \cdot r_2 :: (\map (r_2 \backslash \_) (\vsuf{s}{r_1}))}}$
  1105 $\rderssimp{(r_1 \cdot r_2)}{s} = \rsimp{\AALTS{[[]}{ (r_1 \backslash s) \cdot r_2 :: (\map (r_2 \backslash \_) (\vsuf{s}{r_1}))}}$
   799 \end{lemma}
  1106 \end{lemma}
  1086 
  1393 
  1087 %----------------------------------------------------------------------------------------
  1394 %----------------------------------------------------------------------------------------
  1088 %	SECTION 1
  1395 %	SECTION 1
  1089 %----------------------------------------------------------------------------------------
  1396 %----------------------------------------------------------------------------------------
  1090 
  1397 
  1091 \section{Idempotency of $\simp$}
       
  1092 
       
  1093 \begin{equation}
       
  1094 	\simp \;r = \simp\; \simp \; r 
       
  1095 \end{equation}
       
  1096 This property means we do not have to repeatedly
       
  1097 apply simplification in each step, which justifies
       
  1098 our definition of $\blexersimp$.
       
  1099 It will also be useful in future proofs where properties such as 
       
  1100 closed forms are needed.
       
  1101 The proof is by structural induction on $r$.
       
  1102 
  1398 
  1103 %-----------------------------------
  1399 %-----------------------------------
  1104 %	SUBSECTION 1
  1400 %	SUBSECTION 1
  1105 %-----------------------------------
  1401 %-----------------------------------
  1106 \subsection{Syntactic Equivalence Under $\simp$}
  1402 \subsection{Syntactic Equivalence Under $\simp$}