226 |
227 |
227 |
228 |
228 %----------------------------------- |
229 %----------------------------------- |
229 % SECTION ? |
230 % SECTION ? |
230 %----------------------------------- |
231 %----------------------------------- |
231 \section{Roadmap to a Finiteness Proof} |
232 \subsection{Finiteness Proof Using $\rrexp$s} |
232 Now that we have defined the $\rrexp$ datatype, and proven that its size changes |
233 Now that we have defined the $\rrexp$ datatype, and proven that its size changes |
233 w.r.t derivatives and simplifications mirrors precisely those of annotated regular expressions, |
234 w.r.t derivatives and simplifications mirrors precisely those of annotated regular expressions, |
234 we aim to bound the size of $r \backslash s$ for any $\rrexp$ $r$. |
235 we aim to bound the size of $r \backslash s$ for any $\rrexp$ $r$. |
235 The way we do it is by two steps: |
236 Once we have a bound like: |
|
237 \[ |
|
238 \llbracket r \backslash_{rsimp} s \rrbracket_r \leq N_r |
|
239 \] |
|
240 \noindent |
|
241 we could easily extend that to |
|
242 \[ |
|
243 \llbracket a \backslash_{bsimps} s \rrbracket \leq N_r. |
|
244 \] |
|
245 |
|
246 \subsection{Roadmap to a Bound for $\textit{Rrexp}$} |
|
247 |
|
248 The way we obtain the bound for $\rrexp$s is by two steps: |
236 \begin{itemize} |
249 \begin{itemize} |
237 \item |
250 \item |
238 First, we rewrite $r\backslash s$ into something else that is easier |
251 First, we rewrite $r\backslash s$ into something else that is easier |
239 to bound. This step is especially important for the inductive case |
252 to bound. This step is especially important for the inductive case |
240 $r_1 \cdot r_2$ and $r^*$, where the derivative can grow and bloat in a wild way, |
253 $r_1 \cdot r_2$ and $r^*$, where the derivative can grow and bloat in a wild way, |
241 but after simplification they will always be equal or smaller to a form consisting of an alternative |
254 but after simplification they will always be equal or smaller to a form consisting of an alternative |
242 list of regular expressions $f \; (g\; (\sum rs))$ with some functions applied to it, where each element will be distinct after the function application. |
255 list of regular expressions $f \; (g\; (\sum rs))$ with some functions applied to it, where each element will be distinct after the function application. |
243 The functions $f$ and $g$ are usually from $\flts$ $\distinctBy$ and other appearing in $\bsimp{_}$. |
|
244 This way we get a different but equivalent way of expressing : $r\backslash s = f \; (g\; (\sum rs))$, we call the |
|
245 right hand side the "closed form" of $r\backslash s$. |
|
246 \item |
256 \item |
247 Then, for such a sum list of regular expressions $f\; (g\; (\sum rs))$, we can control its size |
257 Then, for such a sum list of regular expressions $f\; (g\; (\sum rs))$, we can control its size |
248 by estimation, since $\distinctBy$ and $\flts$ are well-behaved and working together would only |
258 by estimation, since $\distinctBy$ and $\flts$ are well-behaved and working together would only |
249 reduce the size of a regular expression, not adding to it. The closed form $f\; (g\; (\sum rs)) $ is controlled |
259 reduce the size of a regular expression, not adding to it. |
250 by the size of $rs'$, where every element in $rs'$ is distinct, and each element can be described by som e inductive sub-structures (for example when $r = r_1 \cdot r_2$ then $rs'$ will be solely comprised of $r_1 \backslash s'$ and $r_2 \backslash s''$, $s'$ and $s''$ being sub-strings of $s$), which will have a numeric uppder bound according to inductive hypothesis, which controls $r \backslash s$. |
|
251 \end{itemize} |
260 \end{itemize} |
252 |
261 |
253 \section{Closed Forms} |
262 \section{Step One: Closed Forms} |
254 |
263 We transform the function application $\rderssimp{r}{s}$ |
|
264 into an equivalent form $f\; (g \; (\sum rs))$. |
|
265 The functions $f$ and $g$ can be anything from $\flts$, $\distinctBy$ and other helper functions from $\bsimp{\_}$. |
|
266 This way we get a different but equivalent way of expressing : $r\backslash s = f \; (g\; (\sum rs))$, we call the |
|
267 right hand side the "closed form" of $r\backslash s$. |
255 \subsection{Basic Properties needed for Closed Forms} |
268 \subsection{Basic Properties needed for Closed Forms} |
256 |
269 |
257 \subsubsection{$\textit{rdistinct}$'s Deduplicates Successfully} |
270 \subsubsection{$\textit{rdistinct}$'s Deduplicates Successfully} |
258 The $\textit{rdistinct}$ function, as its name suggests, will |
271 The $\textit{rdistinct}$ function, as its name suggests, will |
259 remove duplicates according to the accumulator |
272 remove duplicates in an \emph{r}$\textit{rexp}$ list, |
|
273 according to the accumulator |
260 and leave only one of each different element in a list: |
274 and leave only one of each different element in a list: |
261 \begin{lemma} |
275 \begin{lemma} |
262 The function $\textit{rdistinct}$ satisfies the following |
276 The function $\textit{rdistinct}$ satisfies the following |
263 properties: |
277 properties: |
264 \begin{itemize} |
278 \begin{itemize} |
312 for any prefix of an input list, in other words, when can |
326 for any prefix of an input list, in other words, when can |
313 we ``push out" the arguments of $\rdistinct{\_}{\_}$: |
327 we ``push out" the arguments of $\rdistinct{\_}{\_}$: |
314 \begin{lemma} |
328 \begin{lemma} |
315 If $rs_1$'s elements are all unique, and not appearing in the accumulator $acc$, |
329 If $rs_1$'s elements are all unique, and not appearing in the accumulator $acc$, |
316 then it can be taken out and left untouched in the output: |
330 then it can be taken out and left untouched in the output: |
317 \[\textit{rdistinct}\; rs_1 @ rsa\;\, acc |
331 \[\textit{rdistinct}\; (rs_1 @ rsa)\;\, acc |
318 = rs_1@(\textit{rdistinct} rsa \; (acc \cup rs_1)\] |
332 = rs_1@(\textit{rdistinct} rsa \; (acc \cup rs_1))\] |
319 \end{lemma} |
333 \end{lemma} |
320 |
334 |
321 \begin{proof} |
335 \begin{proof} |
322 By an induction on $rs_1$, where $rsa$ and $acc$ are allowed to be arbitrary. |
336 By an induction on $rs_1$, where $rsa$ and $acc$ are allowed to be arbitrary. |
323 \end{proof} |
337 \end{proof} |
324 \subsubsection{The properties of $\backslash_r$, $\backslash_{rsimp}$, $\textit{Rflts}$ and $\textit{Rsimp}_{ALTS}$} |
338 \subsubsection{The Properties of $\backslash_r$, $\backslash_{rsimp}$, $\textit{Rflts}$ and $\textit{Rsimp}_{ALTS}$} |
325 We give in this subsection some properties of how $\backslash_r$, $\backslash_{rsimp}$, $\textit{Rflts}$ and $\textit{Rsimp}_{ALTS} $ interact with each other and with $@$, the concatenation operator. |
339 We give in this subsection some properties of how $\backslash_r$, $\backslash_{rsimp}$, $\textit{Rflts}$ and $\textit{Rsimp}_{ALTS} $ interact with each other and with $@$, the concatenation operator. |
326 These will be helpful in later closed form proofs, when |
340 These will be helpful in later closed form proofs, when |
327 we want to transform the ways in which multiple functions involving |
341 we want to transform the ways in which multiple functions involving |
328 those are composed together |
342 those are composed together |
329 in interleaving derivative and simplification steps. |
343 in interleaving derivative and simplification steps. |
330 |
344 |