diff -r 15d182ffbc76 -r aecf1ddf3541 ChengsongTanPhdThesis/Chapters/Finite.tex --- a/ChengsongTanPhdThesis/Chapters/Finite.tex Sun Jun 26 22:22:47 2022 +0100 +++ b/ChengsongTanPhdThesis/Chapters/Finite.tex Tue Jun 28 21:07:42 2022 +0100 @@ -276,7 +276,7 @@ remove duplicates in an \emph{r}$\textit{rexp}$ list, according to the accumulator and leave only one of each different element in a list: -\begin{lemma} +\begin{lemma}\label{rdistinctDoesTheJob} The function $\textit{rdistinct}$ satisfies the following properties: \begin{itemize} @@ -284,12 +284,19 @@ If $a \in acc$ then $a \notin (\rdistinct{rs}{acc})$. \item If list $rs'$ is the result of $\rdistinct{rs}{acc}$, - All the elements in $rs'$ are unique. + then $\textit{isDistinct} \; rs'$. + \item + $\rdistinct{rs}{acc} = rs - acc$ \end{itemize} \end{lemma} +\noindent +The predicate $\textit{isDistinct}$ is for testing +whether a list's elements are all unique. It is defined +recursively on the structure of a regular expression, +and we omit the precise definition here. \begin{proof} The first part is by an induction on $rs$. - The second part can be proven by using the + The second and third part can be proven by using the induction rules of $\rdistinct{\_}{\_}$. \end{proof} @@ -338,17 +345,14 @@ when $\rdistinct{\_}{\_}$ becomes an identical mapping for any prefix of an input list, in other words, when can we ``push out" the arguments of $\rdistinct{\_}{\_}$: -\begin{lemma} +\begin{lemma}\label{distinctRdistinctAppend} If $\textit{isDistinct} \; rs_1$, and $rs_1 \cap acc = \varnothing$, - then it can be taken out and left untouched in the output: + then \[\textit{rdistinct}\; (rs_1 @ rsa)\;\, acc = rs_1@(\textit{rdistinct} rsa \; (acc \cup rs_1))\] \end{lemma} \noindent -The predicate $\textit{isDistinct}$ is for testing -whether a list's elements are all unique. It is defined -recursively on the structure of a regular expression, -and we omit the precise definition here. +In other words, it can be taken out and left untouched in the output. \begin{proof} By an induction on $rs_1$, where $rsa$ and $acc$ are allowed to be arbitrary. \end{proof} @@ -359,12 +363,12 @@ The two properties hold if $r \in rs$: \begin{itemize} \item - $\rdistinct{rs}{rset} = \rdistinct{(rs @ [r])}{rset}$ - and + $\rdistinct{rs}{rset} = \rdistinct{(rs @ [r])}{rset}$\\ + and\\ $\rdistinct{(ab :: rs @ [ab])}{rset'} = \rdistinct{(ab :: rs)}{rset'}$ \item - $\rdistinct{ (rs @ rs') }{rset} = \rdistinct{rs @ [r] @ rs'}{rset}$ - and + $\rdistinct{ (rs @ rs') }{rset} = \rdistinct{rs @ [r] @ rs'}{rset}$\\ + and\\ $\rdistinct{(ab :: rs @ [ab] @ rs'')}{rset'} = \rdistinct{(ab :: rs @ rs'')}{rset'}$ \end{itemize} @@ -377,6 +381,33 @@ so that the induction goes through. \end{proof} +\noindent +This allows us to prove ``Idempotency" of $\rdistinct{}{}$ of some kind: +\begin{lemma}\label{rdistinctConcatGeneral} + The following equalities involving multiple applications of $\rdistinct{}{}$ hold: + \begin{itemize} + \item + $\rdistinct{(rs @ rs')}{\varnothing} = \rdistinct{((\rdistinct{rs}{\varnothing})@ rs')}{\varnothing}$ + \item + $\rdistinct{(rs @ rs')}{\varnothing} = \rdistinct{(\rdistinct{rs}{\varnothing} @ rs')}{\varnothing}$ + \item + If $rset' \subseteq rset$, then $\rdistinct{rs}{rset} = + \rdistinct{(\rdistinct{rs}{rset'})}{rset}$. As a corollary + of this, + \item + $\rdistinct{(rs @ rs')}{rset} = \rdistinct{ + (\rdistinct{rs}{\varnothing}) @ rs')}{rset}$. This + gives another corollary use later: + \item + If $a \in rset$, then $\rdistinct{(rs @ rs')}{rset} = \rdistinct{ + (\rdistinct{(a :: rs)}{\varnothing} @ rs')}{rset} $, + + \end{itemize} +\end{lemma} +\begin{proof} + By \ref{rdistinctDoesTheJob} and \ref{distinctRemovesMiddle}. +\end{proof} + \subsubsection{The Properties of $\backslash_r$, $\backslash_{rsimp}$, $\textit{Rflts}$ and $\textit{Rsimp}_{ALTS}$} We give in this subsection some properties of how $\backslash_r$, $\backslash_{rsimp}$, $\textit{Rflts}$ and $\textit{Rsimp}_{ALTS} $ interact with each other and with $@$, the concatenation operator. These will be helpful in later closed form proofs, when @@ -403,13 +434,20 @@ \item If $r \neq \RZERO$ and $\nexists rs'. r = \RALTS{rs'}$ then $\rflts \; (rs @ [r]) = (\rflts \; rs) @ [r]$ + \item + If $r = \RALTS{rs}$ and $r \in rs'$ then for all $r_1 \in rs. + r_1 \in \rflts \; rs'$. + \item + $\rflts \; (rs_a @ \RZERO :: rs_b) = \rflts \; (rs_a @ rs_b)$ \end{itemize} \end{lemma} \noindent \begin{proof} - By induction on $rs_1$ in the first part, and induction on $r$ in the second part, - and induction on $rs$, $rs'$ in the third and fourth sub-lemma. + By induction on $rs_1$ in the first sub-lemma, and induction on $r$ in the second part, + and induction on $rs$, $rs'$, $rs$, $rs'$, $rs_a$ in the third, fourth, fifth, sixth and + last sub-lemma. \end{proof} + \subsubsection{The $RL$ Function: Language Interpretation of $\textit{Rrexp}$s} Much like the definition of $L$ on plain regular expressions, one could also define the language interpretation of $\rrexp$s. @@ -594,7 +632,30 @@ The lemma \ref{goodaltsNonalt} is used in the alternative case where 2 or more elements are present in the list. \end{proof} +\noindent +Given below is a property involving $\rflts$, $\rdistinct{}{}$, $\rsimp{}$ and $\rsimp_{ALTS}$, +which requires $\ref{good1}$ to go through smoothly. +It says that an application of $\rsimp_{ALTS}$ can be "absorbed", +if it its output is concatenated with a list and then applied to $\rflts$. +\begin{lemma}\label{flattenRsimpalts} + $\rflts \; ( (\rsimp_{ALTS} \; + (\rdistinct{(\rflts \; (\map \; \rsimp{}\; rs))}{\varnothing})) :: + \map \; \rsimp{} \; rs' ) = + \rflts \; ( (\rdistinct{(\rflts \; (\map \; \rsimp{}\; rs))}{\varnothing}) @ ( + \map \; \rsimp{rs'}))$ + +\end{lemma} +\begin{proof} + By \ref{good1}. +\end{proof} +\noindent + + + + + +We are also \subsubsection{$\rsimp$ is Idempotent} The idempotency of $\rsimp$ is very useful in manipulating regular expression terms into desired @@ -617,7 +678,14 @@ on regular expressions as many times as we want, if we have at least one simplification applied to it, and apply it wherever we would like to: \begin{corollary}\label{headOneMoreSimp} - $\map \; \rsimp{(r :: rs)} = \map \; \rsimp{} \; (\rsimp{r} :: rs)$ + The following properties hold, directly from \ref{rsimpIdem}: + + \begin{itemize} + \item + $\map \; \rsimp{(r :: rs)} = \map \; \rsimp{} \; (\rsimp{r} :: rs)$ + \item + $\rsimp{(\RALTS{rs})} = \rsimp{(\RALTS{\map \; \rsimp{} \; rs})}$ + \end{itemize} \end{corollary} \noindent This will be useful in later closed form proof's rewriting steps. @@ -645,42 +713,152 @@ We use $r_1 \sequal r_2 $ here to denote $\rsimp{r_1} = \rsimp{r_2}$. \begin{lemma} \begin{itemize} + The following equivalence hold: \item $\rsimpalts \; (\RZERO :: rs) \sequal \rsimpalts\; rs$ \item $\rsimpalts \; rs \sequal \rsimpalts (\map \; \rsimp{} \; rs)$ \item $\RALTS{\RALTS{rs}} \sequal \RALTS{rs}$ + \item + $\sum ((\sum rs_a) :: rs_b) \sequal \sum rs_a @ rs_b$ + \item + $\RALTS{rs} = \RALTS{\map \; \rsimp{} \; rs}$ \end{itemize} \end{lemma} +\begin{proof} + By induction on the lists involved. +\end{proof} +\noindent +Similarly, +we introduce the equality for $\sum$ when certain child regular expressions +are $\sum$ themselves: +\begin{lemma}\label{simpFlatten3} + One can flatten the inside $\sum$ of a $\sum$ if it is being + simplified. Concretely, + \begin{itemize} + \item + If for all $r \in rs, rs', rs''$, we have $\good \; r $ + or $r = \RZERO$, then $\sum (rs' @ rs @ rs'') \sequal + \sum (rs' @ [\sum rs] @ rs'')$ holds. As a corollary, + \item + $\sum (rs' @ [\sum rs] @ rs'') \sequal \sum (rs' @ rs @ rs'')$ + \end{itemize} +\end{lemma} +\begin{proof} + By rewriting steps involving the use of \ref{test} and \ref{rdistinctConcatGeneral}. + The second sub-lemma is a corollary of the previous. +\end{proof} +%Rewriting steps not put in--too long and complicated------------------------------- +\begin{comment} + \begin{center} + $\rsimp{\sum (rs' @ rs @ rs'')} \stackrel{def of bsimp}{=}$ \\ + $\rsimpalts \; (\rdistinct{\rflts \; ((\map \; \rsimp{}\; rs') @ (\map \; \rsimp{} \; rs ) @ (\map \; \rsimp{} \; rs''))}{\varnothing})$ \\ + $\stackrel{by \ref{test}}{=} + \rsimpalts \; (\rdistinct{(\rflts \; rs' @ \rflts \; rs @ \rflts \; rs'')}{ + \varnothing})$\\ + $\stackrel{by \ref{rdistinctConcatGeneral}}{=} + \rsimpalts \; (\rdistinct{\rflts \; rs'}{\varnothing} @ \rdistinct{( + \rflts\; rs @ \rflts \; rs'')}{\rflts \; rs'})$\\ + + \end{center} +\end{comment} +%Rewriting steps not put in--too long and complicated------------------------------- \noindent We need more equalities like the above to enable a closed form, but to proceed we need to introduce two rewrite relations, to make things smoother. -\subsubsection{The rewrite relation $\hrewrite$ and $\grewrite$} +\subsubsection{The rewrite relation $\hrewrite$, $\frewrite$ and $\grewrite$} Insired by the success we had in the correctness proof in \ref{Bitcoded2}, where we invented -a term rewriting system to capture the similarity between terms -and prove equivalence, we follow suit here defining simplification -steps as rewriting steps. +a term rewriting system to capture the similarity between terms, +we follow suit here defining simplification +steps as rewriting steps. This allows capturing +similarities between terms that would be otherwise +hard to express. + +We use $\hrewrite$ for one-step atomic rewrite of regular expression simplification, +$\frewrite$ for rewrite of list of regular expressions that +include all operations carried out in $\rflts$, and $\grewrite$ for +rewriting a list of regular expressions possible in both $\rflts$ and $\rdistinct{}{}$. +Their reflexive transitive closures are used to denote zero or many steps, +as was the case in the previous chapter. The presentation will be more concise than that in \ref{Bitcoded2}. To differentiate between the rewriting steps for annotated regular expressions and $\rrexp$s, we add characters $h$ and $g$ below the squig arrow symbol to mean atomic simplification transitions of $\rrexp$s and $\rrexp$ lists, respectively. + + + List of one-step rewrite rules for $\rrexp$ ($\hrewrite$): + + \begin{center} +\begin{mathpar} + \inferrule[RSEQ0L]{}{\RZERO \cdot r_2 \hrewrite \RZERO\\} + + \inferrule[RSEQ0R]{}{r_1 \cdot \RZERO \hrewrite \RZERO\\} + + \inferrule[RSEQ1]{}{(\RONE \cdot r) \hrewrite r\\}\\ + + \inferrule[RSEQL]{ r_1 \hrewrite r_2}{r_1 \cdot r_3 \hrewrite r_2 \cdot r_3\\} + + \inferrule[RSEQR]{ r_3 \hrewrite r_4}{r_1 \cdot r_3 \hrewrite r_1 \cdot r_4\\}\\ + + \inferrule[RALTSChild]{r \hrewrite r'}{\sum (rs_1 @ [r] @ rs_2) \hrewrite \sum (rs_1 @ [r'] @ rs_2)\\} + + \inferrule[RALTS0]{}{\sum (rs_a @ [\RZERO] @ rs_b) \hrewrite \sum (rs_a @ rs_b)} + + \inferrule[RALTSNested]{}{\sum (rs_a @ [\sum rs_1] @ rs_b) \hrewrite \sum (rs_a @ rs_1 @ rs_b)} + + \inferrule[RALTSNil]{}{ \sum [] \hrewrite \RZERO\\} + + \inferrule[RALTSSingle]{}{ \sum [r] \hrewrite r\\} + + \inferrule[RALTSDelete]{\\ r_1 = r_2}{\sum rs_a @ [r_1] @ rs_b @ [r_2] @ rsc \hrewrite \sum rs_a @ [r_1] @ rs_b @ rs_c} + +\end{mathpar} +\end{center} -List of 1-step rewrite rules for regular expressions simplification without bitcodes: -\begin{figure} -\caption{the "h-rewrite" rules} -\[ -r_1 \cdot \ZERO \rightarrow_h \ZERO \] +%frewrite + List of one-step rewrite rules for flattening + a list of regular expressions($\frewrite$): +\begin{center} +\begin{mathpar} + \inferrule{}{\RZERO :: rs \frewrite rs \\} + + \inferrule{}{(\sum rs) :: rs_a \frewrite rs @ rs_a \\} + + \inferrule{rs_1 \frewrite rs_2}{r :: rs_1 \frewrite r :: rs_2} +\end{mathpar} +\end{center} + + Lists of one-step rewrite rules for flattening and de-duplicating + a list of regular expressions ($\grewrite$): +\begin{center} +\begin{mathpar} + \inferrule{}{\RZERO :: rs \frewrite rs \\} -\[ -\ZERO \cdot r_2 \rightarrow \ZERO -\] -\end{figure} + \inferrule{}{(\sum rs) :: rs_a \frewrite rs @ rs_a \\} + + \inferrule{rs_1 \frewrite rs_2}{r :: rs_1 \frewrite r :: rs_2} + + \inferrule[dB]{}{rs_a @ [a] @ rs_b @[a] @ rs_c \grewrite rs_a @ [a] @ rsb @ rsc} +\end{mathpar} +\end{center} + +\noindent +The reason why we take the trouble of defining +two separate list rewriting definitions $\frewrite$ and $\grewrite$ +is that sometimes $\grewrites$ is slightly too powerful +that it renders certain equivalence to break after derivative: + +$\rsimp{(\rsimpalts \; (\map \; (\_ \backslash x) \; (\rdistinct{(\rflts \; (\map \; (\rsimp{} \; \circ \; (\lambda r. \rderssimp{r}{xs}))))}{\varnothing})))} \neq + \rsimp{(\rsimpalts \; (\rdistinct{(\map \; (\_ \backslash x) \; (\rflts \; (\map \; (\rsimp{} \; \circ \; (\lambda r. \rderssimp{r}{xs})))) ) }{\varnothing})} $ + + + And we define an "grewrite" relation that works on lists: \begin{center} \begin{tabular}{lcl}