ChengsongTanPhdThesis/Chapters/Finite.tex
changeset 663 0d1e68268d0f
parent 662 fa7552396c71
child 668 3831621d7b14
equal deleted inserted replaced
662:fa7552396c71 663:0d1e68268d0f
   202 
   202 
   203 
   203 
   204 \subsection{Overview of the Proof}
   204 \subsection{Overview of the Proof}
   205 \marginpar{trying to make it more intuitive
   205 \marginpar{trying to make it more intuitive
   206 and provide more insights into proof}
   206 and provide more insights into proof}
   207 The most important intuition is what we call the "closed forms" of
   207 The most important idea in this chapter %intuition 
       
   208 is what we call the "closed forms" of
   208 regular expression derivatives with respect to strings.
   209 regular expression derivatives with respect to strings.
       
   210 In short it allows us to express $r \backslash_{rsimps} s$
       
   211 as a different recursive function so induction on the size bound can go through.
       
   212 A simple induction on $s$ or $r$ fails for $r\backslash_{rsimps} s$, but 
       
   213 works for $\textit{ClosedForm}(r,s)$.
       
   214 
       
   215 
       
   216 
   209 Assume we have a regular expression $r$, be it an alternative,
   217 Assume we have a regular expression $r$, be it an alternative,
   210 a sequence or a star, the idea is if we try to take several derivatives
   218 a sequence or a star, the idea is if we try to take several derivatives
   211 of it on paper, we end up getting a list of subexpressions,
   219 of it on paper, we end up getting a list of subexpressions,
   212 something like
   220 something like
   213 %omitting certain
   221 %omitting certain
   216 	r\backslash s = r_1 + r_2 + r_3 + \ldots + r_n,
   224 	r\backslash s = r_1 + r_2 + r_3 + \ldots + r_n,
   217 \]
   225 \]
   218 if we omit the way these regular expressions need to be nested.
   226 if we omit the way these regular expressions need to be nested.
   219 where each $r_i$ ($i \in \{1, \ldots, n\}$) is related to some fragments
   227 where each $r_i$ ($i \in \{1, \ldots, n\}$) is related to some fragments
   220 of $r$ and $s$.
   228 of $r$ and $s$.
   221 We call the precise formalisation for the shape of 
       
   222 \[
       
   223  r_1 + r_2 + r_3 + \ldots + r_n
       
   224 \]
       
   225 "closed form".
       
   226 The second important observation is that the list %of regular expressions
   229 The second important observation is that the list %of regular expressions
   227 $[r_1, \ldots, r_n]$ %is not
   230 $[r_1, \ldots, r_n]$ %is not
   228 cannot grow indefinitely because they all come from $r$, and derivatives
   231 cannot grow indefinitely because they all come from $r$, and derivatives
   229 of the same regular expression are finite up to some isomorphisms.
   232 of the same regular expression are finite up to some isomorphisms.
   230 We prove that the simplifications of $\blexersimp$ %make use of 
   233 We prove that the simplifications of $\blexersimp$ %make use of 
   231 is powerful enough to counteract the effect of nested structure of alternatives
   234 is powerful enough to counteract the effect of nested structure of alternatives
   232 and eliminate duplicates
   235 and eliminate duplicates
   233 such that indeed the list in $a\backslash s$ does not grow unbounded.
   236 such that indeed the list in $a\backslash s$ does not grow unbounded.
       
   237 We call the precise formalisation for the shape of 
       
   238 \[
       
   239  r_1 + r_2 + r_3 + \ldots + r_n
       
   240 \]
       
   241 "closed form".
       
   242 The name was chosen because turning the recursive relation
       
   243 \[
       
   244 	a \backslash_{bsimps} (c\!::\!s) \dn (\textit{bsimp} \; (a\backslash c)) \backslash_{bsimps} s
       
   245 \]
       
   246 into some easier-to-estimate forms 
       
   247 like
       
   248 \[
       
   249 	\sum (a_1\backslash s \cdot a_2) :: (\map \; (a_2\backslash\_) \;
       
   250 	(\textit{Suffix} \; s \; a_1))
       
   251 	%\backslash_{bsimp
       
   252 \]
       
   253 was reminiscent of 
       
   254 %similar to t
       
   255 solving recurrence relations like 
       
   256 $T \; n = 2 (T \frac{1}{2} n) + n$ to obtain
       
   257 their closed forms.
       
   258 %$T \; n = n \ln n + (s \; n)$ ($s \; n$ is 
       
   259 %some higher-order terms).
       
   260 %(for example we know $T$ is $\Theta (n \ln n)$).
       
   261 Just like a closed form of a recursive definition makes estimating 
       
   262 their growth possible, the closed 
       
   263 form of $a \backslash_{bsimps} s$
       
   264 allows us to prove the existence of a size bound.
       
   265 Note that \ref{eq:approx} is only an approximate
       
   266 term to show our point.
       
   267 The precise formalised formula (\ref{seqClosedForm})
       
   268 needs to wait until all $\textit{rrexp}$-related
       
   269 definitions are given, 
       
   270 %but for now we can think of the above as "the sequence
       
   271 %regular expression $a_1 \cdot a_2$ after derivatives and simplifications
       
   272 %w.r.t string $s$ looks like
       
   273 %an alternative of giant list of sub-expressions, where each 
       
   274 
       
   275 
   234 
   276 
   235 A high-level overview of the main components of the finiteness proof
   277 A high-level overview of the main components of the finiteness proof
   236 is as follows:
   278 is as follows:
   237 \begin{figure}[H]
   279 \begin{figure}[H]
   238 	\begin{tikzpicture}[scale=1,font=\bf,
   280 	\begin{tikzpicture}[scale=1,font=\bf,