ChengsongTanPhdThesis/Chapters/Finite.tex
changeset 553 0f00d440f484
parent 543 b2bea5968b89
child 554 15d182ffbc76
equal deleted inserted replaced
552:3ea82d8f0aa4 553:0f00d440f484
   204 (in file BasicIdentities.thy) for the exact definition.
   204 (in file BasicIdentities.thy) for the exact definition.
   205 With $\rrexp$ the size caclulation of annotated regular expressions'
   205 With $\rrexp$ the size caclulation of annotated regular expressions'
   206 simplification and derivatives can be done by the size of their unlifted 
   206 simplification and derivatives can be done by the size of their unlifted 
   207 counterpart with the unlifted version of simplification and derivatives applied.
   207 counterpart with the unlifted version of simplification and derivatives applied.
   208 \begin{lemma}
   208 \begin{lemma}
       
   209 	The following equalities hold:
   209 	\begin{itemize}
   210 	\begin{itemize}
   210 		\item
   211 		\item
   211 $\asize{\bsimp{a}} = \rsize{\rsimp{\rerase{a}}}$
   212 $\asize{\bsimp{a}} = \rsize{\rsimp{\rerase{a}}}$
   212 \item
   213 \item
   213 $\asize{\bderssimp{r}{s}} =  \rsize{\rderssimp{\rerase{r}}{s}}$
   214 $\asize{\bderssimp{r}{s}} =  \rsize{\rderssimp{\rerase{r}}{s}}$
   226 
   227 
   227 
   228 
   228 %-----------------------------------
   229 %-----------------------------------
   229 %	SECTION ?
   230 %	SECTION ?
   230 %-----------------------------------
   231 %-----------------------------------
   231  \section{Roadmap to a Finiteness Proof}
   232  \subsection{Finiteness Proof Using $\rrexp$s}
   232  Now that we have defined the $\rrexp$ datatype, and proven that its size changes
   233  Now that we have defined the $\rrexp$ datatype, and proven that its size changes
   233  w.r.t derivatives and simplifications mirrors precisely those of annotated regular expressions,
   234  w.r.t derivatives and simplifications mirrors precisely those of annotated regular expressions,
   234  we aim to bound the size of $r \backslash s$ for any $\rrexp$  $r$.
   235  we aim to bound the size of $r \backslash s$ for any $\rrexp$  $r$.
   235 The way we do it is by two steps:
   236  Once we have a bound like: 
       
   237  \[
       
   238 	 \llbracket r \backslash_{rsimp} s \rrbracket_r \leq N_r
       
   239  \]
       
   240  \noindent
       
   241  we could easily extend that to 
       
   242  \[
       
   243 	 \llbracket a \backslash_{bsimps} s \rrbracket \leq N_r.
       
   244  \]
       
   245 
       
   246  \subsection{Roadmap to a Bound for $\textit{Rrexp}$}
       
   247 
       
   248 The way we obtain the bound for $\rrexp$s is by two steps:
   236 \begin{itemize}
   249 \begin{itemize}
   237 	\item
   250 	\item
   238 		First, we rewrite $r\backslash s$ into something else that is easier
   251 		First, we rewrite $r\backslash s$ into something else that is easier
   239 		to bound. This step is especially important for the inductive case 
   252 		to bound. This step is especially important for the inductive case 
   240 		$r_1 \cdot r_2$ and $r^*$, where the derivative can grow and bloat in a wild way,
   253 		$r_1 \cdot r_2$ and $r^*$, where the derivative can grow and bloat in a wild way,
   241 		but after simplification they will always be equal or smaller to a form consisting of an alternative
   254 		but after simplification they will always be equal or smaller to a form consisting of an alternative
   242 		list of regular expressions $f \; (g\; (\sum rs))$ with some functions applied to it, where each element will be distinct after the function application.
   255 		list of regular expressions $f \; (g\; (\sum rs))$ with some functions applied to it, where each element will be distinct after the function application.
   243 		The functions $f$ and $g$ are usually from $\flts$ $\distinctBy$ and other appearing in $\bsimp{_}$.
       
   244 		This way we get a different but equivalent way of expressing : $r\backslash s = f \; (g\; (\sum rs))$, we call the
       
   245 		right hand side the "closed form" of $r\backslash s$.
       
   246 	\item
   256 	\item
   247 		Then, for such a sum  list of regular expressions $f\; (g\; (\sum rs))$, we can control its size
   257 		Then, for such a sum  list of regular expressions $f\; (g\; (\sum rs))$, we can control its size
   248 		by estimation, since $\distinctBy$ and $\flts$ are well-behaved and working together would only 
   258 		by estimation, since $\distinctBy$ and $\flts$ are well-behaved and working together would only 
   249 		reduce the size of a regular expression, not adding to it. The closed form $f\; (g\; (\sum rs)) $ is controlled
   259 		reduce the size of a regular expression, not adding to it.
   250 		by the size of $rs'$, where every element in $rs'$ is distinct, and each element can be described by som e inductive sub-structures (for example when $r = r_1 \cdot r_2$ then $rs'$ will be solely comprised of $r_1 \backslash s'$ and $r_2 \backslash s''$, $s'$ and $s''$ being sub-strings of $s$), which will have a numeric uppder bound according to inductive hypothesis, which controls $r \backslash s$.
       
   251 \end{itemize}
   260 \end{itemize}
   252 
   261 
   253 \section{Closed Forms}
   262 \section{Step One: Closed Forms}
   254 
   263 		We transform the function application $\rderssimp{r}{s}$
       
   264 		into an equivalent form $f\; (g \; (\sum rs))$.
       
   265 		The functions $f$ and $g$ can be anything from $\flts$, $\distinctBy$ and other helper functions from $\bsimp{\_}$.
       
   266 		This way we get a different but equivalent way of expressing : $r\backslash s = f \; (g\; (\sum rs))$, we call the
       
   267 		right hand side the "closed form" of $r\backslash s$.
   255 \subsection{Basic Properties needed for Closed Forms}
   268 \subsection{Basic Properties needed for Closed Forms}
   256 
   269 
   257 \subsubsection{$\textit{rdistinct}$'s Deduplicates Successfully}
   270 \subsubsection{$\textit{rdistinct}$'s Deduplicates Successfully}
   258 The $\textit{rdistinct}$ function, as its name suggests, will
   271 The $\textit{rdistinct}$ function, as its name suggests, will
   259 remove duplicates according to the accumulator
   272 remove duplicates in an \emph{r}$\textit{rexp}$ list, 
       
   273 according to the accumulator
   260 and leave only one of each different element in a list:
   274 and leave only one of each different element in a list:
   261 \begin{lemma}
   275 \begin{lemma}
   262 	The function $\textit{rdistinct}$ satisfies the following
   276 	The function $\textit{rdistinct}$ satisfies the following
   263 	properties:
   277 	properties:
   264 	\begin{itemize}
   278 	\begin{itemize}
   312 for any prefix of an input list, in other words, when can 
   326 for any prefix of an input list, in other words, when can 
   313 we ``push out" the arguments of $\rdistinct{\_}{\_}$:
   327 we ``push out" the arguments of $\rdistinct{\_}{\_}$:
   314 \begin{lemma}
   328 \begin{lemma}
   315 	If $rs_1$'s elements are all unique, and not appearing in the accumulator $acc$, 
   329 	If $rs_1$'s elements are all unique, and not appearing in the accumulator $acc$, 
   316 	then it can be taken out and left untouched in the output:
   330 	then it can be taken out and left untouched in the output:
   317 	\[\textit{rdistinct}\;  rs_1 @ rsa\;\, acc
   331 	\[\textit{rdistinct}\;  (rs_1 @ rsa)\;\, acc
   318 	= rs_1@(\textit{rdistinct} rsa \; (acc \cup rs_1)\]
   332 	= rs_1@(\textit{rdistinct} rsa \; (acc \cup rs_1))\]
   319 \end{lemma}
   333 \end{lemma}
   320 
   334 
   321 \begin{proof}
   335 \begin{proof}
   322 By an induction on $rs_1$, where $rsa$ and $acc$ are allowed to be arbitrary.
   336 By an induction on $rs_1$, where $rsa$ and $acc$ are allowed to be arbitrary.
   323 \end{proof}
   337 \end{proof}
   324 \subsubsection{The properties of $\backslash_r$, $\backslash_{rsimp}$, $\textit{Rflts}$ and $\textit{Rsimp}_{ALTS}$} 
   338 \subsubsection{The Properties of $\backslash_r$, $\backslash_{rsimp}$, $\textit{Rflts}$ and $\textit{Rsimp}_{ALTS}$} 
   325 We give in this subsection some properties of how $\backslash_r$, $\backslash_{rsimp}$, $\textit{Rflts}$ and $\textit{Rsimp}_{ALTS} $ interact with each other and  with $@$, the concatenation operator.
   339 We give in this subsection some properties of how $\backslash_r$, $\backslash_{rsimp}$, $\textit{Rflts}$ and $\textit{Rsimp}_{ALTS} $ interact with each other and with $@$, the concatenation operator.
   326 These will be helpful in later closed form proofs, when
   340 These will be helpful in later closed form proofs, when
   327 we want to transform the ways in which multiple functions involving
   341 we want to transform the ways in which multiple functions involving
   328 those are composed together
   342 those are composed together
   329 in interleaving derivative and  simplification steps.
   343 in interleaving derivative and  simplification steps.
   330 
   344 
   404 \end{lemma}
   418 \end{lemma}
   405 
   419 
   406 Fortunately this is provable by induction on the list $rs$
   420 Fortunately this is provable by induction on the list $rs$
   407 
   421 
   408 
   422 
   409 
   423 \section{Estimating the Closed Forms' sizes}
       
   424 
       
   425 		The closed form $f\; (g\; (\sum rs)) $ is controlled
       
   426 		by the size of $rs'$, where every element in $rs'$ is distinct, and each element can be described by som e inductive sub-structures (for example when $r = r_1 \cdot r_2$ then $rs'$ will be solely comprised of $r_1 \backslash s'$ and $r_2 \backslash s''$, $s'$ and $s''$ being sub-strings of $s$), which will have a numeric uppder bound according to inductive hypothesis, which controls $r \backslash s$.
   410 %-----------------------------------
   427 %-----------------------------------
   411 %	SECTION 2
   428 %	SECTION 2
   412 %-----------------------------------
   429 %-----------------------------------
   413 
   430 
   414 \begin{theorem}
   431 \begin{theorem}