ChengsongTanPhdThesis/Chapters/Cubic.tex
changeset 625 b797c9a709d9
parent 621 17c7611fb0a9
child 628 7af4e2420a8c
equal deleted inserted replaced
624:8ffa28fce271 625:b797c9a709d9
     6 %in Chapter 4 to a polynomial one, and demonstrate how one can extend the
     6 %in Chapter 4 to a polynomial one, and demonstrate how one can extend the
     7 %algorithm to include constructs such as bounded repetitions and negations.
     7 %algorithm to include constructs such as bounded repetitions and negations.
     8 \lstset{style=myScalastyle}
     8 \lstset{style=myScalastyle}
     9 
     9 
    10 
    10 
    11 This chapter is a ``miscellaneous''
    11 This chapter is a ``work-in-progress''
    12 chapter which records various
    12 chapter which records
    13 extensions to our $\blexersimp$'s formalisations.\\
    13 extensions to our $\blexersimp$.
    14 Firstly we present further improvements
    14 We intend to formalise this part, which
       
    15 we have not been able to finish due to time constraints of the PhD.
       
    16 Nevertheless, we outline the ideas we intend to use for the proof.
       
    17 
       
    18 We present further improvements
    15 made to our lexer algorithm $\blexersimp$.
    19 made to our lexer algorithm $\blexersimp$.
    16 We devise a stronger simplification algorithm,
    20 We devise a stronger simplification algorithm,
    17 called $\bsimpStrong$, which can prune away
    21 called $\bsimpStrong$, which can prune away
    18 similar components in two regular expressions at the same 
    22 similar components in two regular expressions at the same 
    19 alternative level,
    23 alternative level,
    33 We give reasons why the correctness and cubic size bound proofs
    37 We give reasons why the correctness and cubic size bound proofs
    34 can be achieved,
    38 can be achieved,
    35 by exploring the connection between the internal
    39 by exploring the connection between the internal
    36 data structure of our $\blexerStrong$ and
    40 data structure of our $\blexerStrong$ and
    37 Animirov's partial derivatives.\\
    41 Animirov's partial derivatives.\\
    38 Secondly, we extend our $\blexersimp$
       
    39 to support bounded repetitions ($r^{\{n\}}$).
       
    40 We update our formalisation of 
       
    41 the correctness and finiteness properties to
       
    42 include this new construct. 
       
    43 we can out-compete other verified lexers such as
       
    44 Verbatim++ on bounded regular expressions.
       
    45 %We also present the idempotency property proof
    42 %We also present the idempotency property proof
    46 %of $\bsimp$, which leverages the idempotency proof of $\rsimp$.
    43 %of $\bsimp$, which leverages the idempotency proof of $\rsimp$.
    47 %This reinforces our claim that the fixpoint construction
    44 %This reinforces our claim that the fixpoint construction
    48 %originally required by Sulzmann and Lu can be removed in $\blexersimp$.
    45 %originally required by Sulzmann and Lu can be removed in $\blexersimp$.
    49 
    46 
   592 
   589 
   593 
   590 
   594 %----------------------------------------------------------------------------------------
   591 %----------------------------------------------------------------------------------------
   595 %	SECTION 2
   592 %	SECTION 2
   596 %----------------------------------------------------------------------------------------
   593 %----------------------------------------------------------------------------------------
   597 
       
   598 \section{Bounded Repetitions}
       
   599 We have promised in chapter \ref{Introduction}
       
   600 that our lexing algorithm can potentially be extended
       
   601 to handle bounded repetitions
       
   602 in natural and elegant ways.
       
   603 Now we fulfill our promise by adding support for 
       
   604 the ``exactly-$n$-times'' bounded regular expression $r^{\{n\}}$.
       
   605 We add clauses in our derivatives-based lexing algorithms (with simplifications)
       
   606 introduced in chapter \ref{Bitcoded2}.
       
   607 
       
   608 \subsection{Augmented Definitions}
       
   609 There are a number of definitions that need to be augmented.
       
   610 The most notable one would be the POSIX rules for $r^{\{n\}}$:
       
   611 \begin{center}
       
   612 	\begin{mathpar}
       
   613 		\inferrule{\forall v \in vs_1. \vdash v:r \land 
       
   614 		|v| \neq []\\ \forall v \in vs_2. \vdash v:r \land |v| = []\\
       
   615 		\textit{length} \; (vs_1 @ vs_2) = n}{\textit{Stars} \;
       
   616 		(vs_1 @ vs_2) : r^{\{n\}} }
       
   617 	\end{mathpar}
       
   618 \end{center}
       
   619 As Ausaf had pointed out \cite{Ausaf},
       
   620 sometimes empty iterations have to be taken to get
       
   621 a match with exactly $n$ repetitions,
       
   622 and hence the $vs_2$ part.
       
   623 
       
   624 Another important definition would be the size:
       
   625 \begin{center}
       
   626 	\begin{tabular}{lcl}
       
   627 		$\llbracket r^{\{n\}} \rrbracket_r$ & $\dn$ & 
       
   628 		$\llbracket r \rrbracket_r + n$\\
       
   629 	\end{tabular}
       
   630 \end{center}
       
   631 \noindent
       
   632 Arguably we should use $\log \; n$ for the size because
       
   633 the number of digits increase logarithmically w.r.t $n$.
       
   634 For simplicity we choose to add the counter directly to the size.
       
   635 
       
   636 The derivative w.r.t a bounded regular expression
       
   637 is given as 
       
   638 \begin{center}
       
   639 	\begin{tabular}{lcl}
       
   640 		$r^{\{n\}} \backslash_r c$ & $\dn$ & 
       
   641 		$r\backslash_r c \cdot r^{\{n-1\}} \;\; \textit{if} \; n \geq 1$\\
       
   642 					   & & $\RZERO \;\quad \quad\quad \quad
       
   643 					   \textit{otherwise}$\\
       
   644 	\end{tabular}
       
   645 \end{center}
       
   646 \noindent
       
   647 For brevity, we sometimes use NTIMES to refer to bounded 
       
   648 regular expressions.
       
   649 The $\mkeps$ function clause for NTIMES would be
       
   650 \begin{center}
       
   651 	\begin{tabular}{lcl}
       
   652 		$\mkeps \; r^{\{n\}} $ & $\dn$ & $\Stars \;
       
   653 		(\textit{replicate} \; n\; (\mkeps \; r))$
       
   654 	\end{tabular}
       
   655 \end{center}
       
   656 \noindent
       
   657 The injection looks like
       
   658 \begin{center}
       
   659 	\begin{tabular}{lcl}
       
   660 		$\inj \; r^{\{n\}} \; c\; (\Seq \;v \; (\Stars \; vs)) $ & 
       
   661 		$\dn$ & $\Stars \;
       
   662 		((\inj \; r \;c \;v ) :: vs)$
       
   663 	\end{tabular}
       
   664 \end{center}
       
   665 \noindent
       
   666 
       
   667 
       
   668 \subsection{Proofs for the Augmented Lexing Algorithm}
       
   669 We need to maintain two proofs with the additional $r^{\{n\}}$
       
   670 construct: the 
       
   671 correctness proof in chapter \ref{Bitcoded2},
       
   672 and the finiteness proof in chapter \ref{Finite}.
       
   673 
       
   674 \subsubsection{Correctness Proof Augmentation}
       
   675 The correctness of $\textit{lexer}$ and $\textit{blexer}$ with bounded repetitions
       
   676 have been proven by Ausaf and Urban\cite{AusafDyckhoffUrban2016}.
       
   677 As they have commented, once the definitions are in place,
       
   678 the proofs given for the basic regular expressions will extend to
       
   679 bounded regular expressions, and there are no ``surprises''.
       
   680 We confirm this point because the correctness theorem would also
       
   681 extend without surprise to $\blexersimp$.
       
   682 The rewrite rules such as $\rightsquigarrow$, $\stackrel{s}{\rightsquigarrow}$ and so on
       
   683 do not need to be changed,
       
   684 and only a few lemmas such as lemma \ref{fltsPreserves} need to be adjusted to 
       
   685 add one more line which can be solved by sledgehammer 
       
   686 to solve the $r^{\{n\}}$ inductive case.
       
   687 
       
   688 
       
   689 \subsubsection{Finiteness Proof Augmentation}
       
   690 The bounded repetitions are
       
   691 very similar to stars, and therefore the treatment
       
   692 is similar, with minor changes to handle some slight complications
       
   693 when the counter reaches 0.
       
   694 The exponential growth is similar:
       
   695 \begin{center}
       
   696 	\begin{tabular}{ll}
       
   697 		$r^{\{n\}} $ & $\longrightarrow_{\backslash c}$\\
       
   698 		$(r\backslash c)  \cdot  
       
   699 		r^{\{n - 1\}}*$ & $\longrightarrow_{\backslash c'}$\\
       
   700 		\\
       
   701 		$r \backslash cc'  \cdot r^{\{n - 2\}}* + 
       
   702 		r \backslash c' \cdot r^{\{n - 1\}}*$ &
       
   703 		$\longrightarrow_{\backslash c''}$\\
       
   704 		\\
       
   705 		$(r_1 \backslash cc'c'' \cdot r^{\{n-3\}}* + 
       
   706 		r \backslash c''\cdot r^{\{n-1\}}) + 
       
   707 		(r \backslash c'c'' \cdot r^{\{n-2\}}* + 
       
   708 		r \backslash c'' \cdot r^{\{n-1\}}*)$ & 
       
   709 		$\longrightarrow_{\backslash c'''}$ \\
       
   710 		\\
       
   711 		$\ldots$\\
       
   712 	\end{tabular}
       
   713 \end{center}
       
   714 Again, we assume that $r\backslash c$, $r \backslash cc'$ and so on
       
   715 are all nullable.
       
   716 The flattened list of terms for $r^{\{n\}} \backslash_{rs} s$
       
   717 \begin{center}
       
   718 	$[r_1 \backslash cc'c'' \cdot r^{\{n-3\}}*,\;
       
   719 	r \backslash c''\cdot r^{\{n-1\}}, \; 
       
   720 	r \backslash c'c'' \cdot r^{\{n-2\}}*, \;
       
   721 	r \backslash c'' \cdot r^{\{n-1\}}*,\; \ldots ]$  
       
   722 \end{center}
       
   723 that comes from 
       
   724 \begin{center}
       
   725 		$(r_1 \backslash cc'c'' \cdot r^{\{n-3\}}* + 
       
   726 		r \backslash c''\cdot r^{\{n-1\}}) + 
       
   727 		(r \backslash c'c'' \cdot r^{\{n-2\}}* + 
       
   728 		r \backslash c'' \cdot r^{\{n-1\}}*)+ \ldots$ 
       
   729 \end{center}
       
   730 are made of sequences with different tails, where the counters
       
   731 might differ.
       
   732 The observation for maintaining the bound is that
       
   733 these counters never exceed $n$, the original
       
   734 counter. With the number of counters staying finite,
       
   735 $\rDistinct$ will deduplicate and keep the list finite.
       
   736 We introduce this idea as a lemma once we describe all
       
   737 the necessary helper functions.
       
   738 
       
   739 Similar to the star case, we want
       
   740 \begin{center}
       
   741 	$\rderssimp{r^{\{n\}}}{s} = \rsimp{\sum rs}$.
       
   742 \end{center}
       
   743 where $rs$
       
   744 shall be in the form of 
       
   745 $\map \; f \; Ss$, where $f$ is a function and
       
   746 $Ss$ a list of objects to act on.
       
   747 For star, the object's datatype is string.
       
   748 The list of strings $Ss$
       
   749 is generated using functions 
       
   750 $\starupdate$ and $\starupdates$.
       
   751 The function that takes a string and returns a regular expression
       
   752 is the anonymous function $
       
   753 (\lambda s'. \; r\backslash s' \cdot r^{\{m\}})$.
       
   754 In the NTIMES setting,
       
   755 the $\starupdate$ and $\starupdates$ functions are replaced by 
       
   756 $\textit{nupdate}$ and $\textit{nupdates}$:
       
   757 \begin{center}
       
   758 	\begin{tabular}{lcl}
       
   759 		$\nupdate \; c \; r \; [] $ & $\dn$ & $[]$\\
       
   760 		$\nupdate \; c \; r \; 
       
   761 		(\Some \; (s, \; n + 1) \; :: \; Ss)$ & $\dn$ & %\\
       
   762 						     $\textit{if} \; 
       
   763 						     (\rnullable \; (r \backslash_{rs} s))$ \\
       
   764 						     & & $\;\;\textit{then} 
       
   765 						     \;\; \Some \; (s @ [c], n + 1) :: \Some \; ([c], n) :: (
       
   766 						     \nupdate \; c \; r \; Ss)$ \\
       
   767 						     & & $\textit{else} \;\; \Some \; (s @ [c], n+1) :: (
       
   768 						     \nupdate \; c \; r \; Ss)$\\
       
   769 		$\nupdate \; c \; r \; (\textit{None} :: Ss)$ & $\dn$ & 
       
   770 		$(\None :: \nupdate  \; c \; r \; Ss)$\\
       
   771 							      & & \\
       
   772 	%\end{tabular}
       
   773 %\end{center}
       
   774 %\begin{center}
       
   775 	%\begin{tabular}{lcl}
       
   776 		$\nupdates \; [] \; r \; Ss$ & $\dn$ & $Ss$\\
       
   777 		$\nupdates \; (c :: cs) \; r \; Ss$ &  $\dn$ &  $\nupdates \; cs \; r \; (
       
   778 		\nupdate \; c \; r \; Ss)$
       
   779 	\end{tabular}
       
   780 \end{center}
       
   781 \noindent
       
   782 which take into account when a subterm
       
   783 \begin{center}
       
   784 	$r \backslash_s s \cdot r^{\{n\}}$
       
   785 \end{center}
       
   786 counter $n$
       
   787 is 0, and therefore expands to 
       
   788 \begin{center}
       
   789 $r \backslash_s (s@[c]) \cdot r^{\{n\}} \;+
       
   790 \; \ZERO$ 
       
   791 \end{center}
       
   792 after taking a derivative.
       
   793 The object now has type 
       
   794 \begin{center}
       
   795 $\textit{option} \;(\textit{string}, \textit{nat})$
       
   796 \end{center}
       
   797 and therefore the function for converting such an option into
       
   798 a regular expression term is called $\opterm$:
       
   799 
       
   800 \begin{center}
       
   801 	\begin{tabular}{lcl}
       
   802 	$\opterm \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\
       
   803 				 & & $\;\;\Some \; (s, n) \Rightarrow 
       
   804 				 (r\backslash_{rs} s)\cdot r^{\{n\}}$\\
       
   805 				 & & $\;\;\None  \Rightarrow 
       
   806 				 \ZERO$\\
       
   807 	\end{tabular}
       
   808 \end{center}
       
   809 \noindent
       
   810 Put together, the list $\map \; f \; Ss$ is instantiated as
       
   811 \begin{center}
       
   812 	$\map \; (\opterm \; r) \; (\nupdates \; s \; r \;
       
   813 	[\Some \; ([c], n)])$.
       
   814 \end{center}
       
   815 For the closed form to be bounded, we would like
       
   816 simplification to be applied to each term in the list.
       
   817 Therefore we introduce some variants of $\opterm$,
       
   818 which help conveniently express the rewriting steps 
       
   819 needed in the closed form proof.
       
   820 \begin{center}
       
   821 	\begin{tabular}{lcl}
       
   822 	$\optermOsimp \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\
       
   823 				 & & $\;\;\Some \; (s, n) \Rightarrow 
       
   824 				 \textit{rsimp} \; ((r\backslash_{rs} s)\cdot r^{\{n\}})$\\
       
   825 				 & & $\;\;\None  \Rightarrow 
       
   826 				 \ZERO$\\
       
   827 				 \\
       
   828 	$\optermosimp \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\
       
   829 				 & & $\;\;\Some \; (s, n) \Rightarrow 
       
   830 				 (\textit{rsimp} \; (r\backslash_{rs} s)) 
       
   831 				 \cdot r^{\{n\}}$\\
       
   832 				 & & $\;\;\None  \Rightarrow 
       
   833 				 \ZERO$\\
       
   834 				 \\
       
   835 	$\optermsimp \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\
       
   836 				 & & $\;\;\Some \; (s, n) \Rightarrow 
       
   837 				 (r\backslash_{rsimps} s)\cdot r^{\{n\}}$\\
       
   838 				 & & $\;\;\None  \Rightarrow 
       
   839 				 \ZERO$\\
       
   840 	\end{tabular}
       
   841 \end{center}
       
   842 
       
   843 
       
   844 For a list of 
       
   845 $\textit{option} \;(\textit{string}, \textit{nat})$ elements,
       
   846 we define the highest power for it recursively:
       
   847 \begin{center}
       
   848 	\begin{tabular}{lcl}
       
   849 		$\hpa \; [] \; n $ & $\dn$ & $n$\\
       
   850 		$\hpa \; (\None :: os) \; n $ &  $\dn$ &  $\hpa \; os \; n$\\
       
   851 		$\hpa \; (\Some \; (s, n) :: os) \; m$ & $\dn$ & 
       
   852 		$\hpa \;os \; (\textit{max} \; n\; m)$\\
       
   853 		\\
       
   854 		$\hpower \; rs $ & $\dn$ & $\hpa \; rs \; 0$\\
       
   855 	\end{tabular}
       
   856 \end{center}
       
   857 
       
   858 Now the intuition that an NTIMES regular expression's power
       
   859 does not increase can be easily expressed as
       
   860 \begin{lemma}\label{nupdatesMono2}
       
   861 	$\hpower \; (\nupdates \;s \; r \; [\Some \; ([c], n)]) \leq n$
       
   862 \end{lemma}
       
   863 \begin{proof}
       
   864 	Note that the power is non-increasing after a $\nupdate$ application:
       
   865 	\begin{center}
       
   866 		$\hpa \;\; (\nupdate \; c \; r \; Ss)\;\; m \leq 
       
   867 		 \hpa\; \; Ss \; m$.
       
   868 	 \end{center}
       
   869 	 This is also the case for $\nupdates$:
       
   870 	\begin{center}
       
   871 		$\hpa \;\; (\nupdates \; s \; r \; Ss)\;\; m \leq 
       
   872 		 \hpa\; \; Ss \; m$.
       
   873 	 \end{center}
       
   874 	 Therefore we have that
       
   875 	 \begin{center}
       
   876 		 $\hpower \;\; (\nupdates \; s \; r \; Ss) \leq
       
   877 		  \hpower \;\; Ss$
       
   878 	 \end{center}
       
   879 	 which leads to the lemma being proven.
       
   880 
       
   881  \end{proof}
       
   882 
       
   883 
       
   884 We also define the inductive rules for
       
   885 the shape of derivatives of the NTIMES regular expressions:\\[-3em]
       
   886 \begin{center}
       
   887 	\begin{mathpar}
       
   888 		\inferrule{\mbox{}}{\cbn \;\ZERO}
       
   889 
       
   890 		\inferrule{\mbox{}}{\cbn \; \; r_a \cdot (r^{\{n\}})}
       
   891 
       
   892 		\inferrule{\cbn \; r_1 \;\; \; \cbn \; r_2}{\cbn \; r_1 + r_2}
       
   893 
       
   894 		\inferrule{\cbn \; r}{\cbn \; r + \ZERO}
       
   895 	\end{mathpar}
       
   896 \end{center}
       
   897 \noindent
       
   898 A derivative of NTIMES fits into the shape described by $\cbn$:
       
   899 \begin{lemma}\label{ntimesDersCbn}
       
   900 	$\cbn \; ((r' \cdot r^{\{n\}}) \backslash_{rs} s)$ holds.
       
   901 \end{lemma}
       
   902 \begin{proof}
       
   903 	By a reverse induction on $s$.
       
   904 	For the inductive case, note that if $\cbn \; r$ holds,
       
   905 	then $\cbn \; (r\backslash_r c)$ holds.
       
   906 \end{proof}
       
   907 \noindent
       
   908 In addition, for $\cbn$-shaped regular expressioins, one can flatten
       
   909 them:
       
   910 \begin{lemma}\label{ntimesHfauPushin}
       
   911 	If $\cbn \; r$ holds, then $\hflataux{r \backslash_r c} = 
       
   912 	\textit{concat} \; (\map \; \hflataux{\map \; (\_\backslash_r c) \;
       
   913 	(\hflataux{r})})$
       
   914 \end{lemma}
       
   915 \begin{proof}
       
   916 	By an induction on the inductive cases of $\cbn$.
       
   917 \end{proof}
       
   918 \noindent
       
   919 This time we do not need to define the flattening functions for NTIMES only,
       
   920 because $\hflat{\_}$ and $\hflataux{\_}$ work on NTIMES already.
       
   921 \begin{lemma}\label{ntimesHfauInduct}
       
   922 $\hflataux{( (r\backslash_r c) \cdot r^{\{n\}}) \backslash_{rsimps} s} = 
       
   923  \map \; (\opterm \; r) \; (\nupdates \; s \; r \; [\Some \; ([c], n)])$
       
   924 \end{lemma}
       
   925 \begin{proof}
       
   926 	By a reverse induction on $s$.
       
   927 	The lemmas \ref{ntimesHfauPushin} and \ref{ntimesDersCbn} are used.
       
   928 \end{proof}
       
   929 \noindent
       
   930 We have a recursive property for NTIMES with $\nupdate$ 
       
   931 similar to that for STAR,
       
   932 and one for $\nupdates $ as well:
       
   933 \begin{lemma}\label{nupdateInduct1}
       
   934 	\mbox{}
       
   935 	\begin{itemize}
       
   936 		\item
       
   937 			\begin{center}
       
   938 	 $\textit{concat} \; (\map \; (\hflataux{\_} \circ (
       
   939 	\opterm \; r)) \; Ss) = \map \; (\opterm \; r) \; (\nupdate \;
       
   940 	c \; r \; Ss)$\\
       
   941 	\end{center}
       
   942 	holds.
       
   943 \item
       
   944 	\begin{center}
       
   945 	 $\textit{concat} \; (\map \; \hflataux{\_}\; 
       
   946 	\map \; (\_\backslash_r x) \;
       
   947 		(\map \; (\opterm \; r) \; (\nupdates \; xs \; r \; Ss)))$\\
       
   948 		$=$\\
       
   949 	$\map \; (\opterm \; r) \; (\nupdates \;(xs@[x]) \; r\;Ss)$ 
       
   950 	\end{center}
       
   951 	holds.
       
   952 	\end{itemize}
       
   953 \end{lemma}
       
   954 \begin{proof}
       
   955 	(i) is by an induction on $Ss$.
       
   956 	(ii) is by an induction on $xs$.
       
   957 \end{proof}
       
   958 \noindent
       
   959 The $\nString$ predicate is defined for conveniently
       
   960 expressing that there are no empty strings in the
       
   961 $\Some \;(s, n)$ elements generated by $\nupdate$:
       
   962 \begin{center}
       
   963 	\begin{tabular}{lcl}
       
   964 		$\nString \; \None$  & $\dn$ & $ \textit{true}$\\
       
   965 		$\nString \; (\Some \; ([], n))$ & $\dn$ & $ \textit{false}$\\
       
   966 		$\nString \; (\Some \; (c::s, n))$  & $\dn$ & $ \textit{true}$\\
       
   967 	\end{tabular}
       
   968 \end{center}
       
   969 \begin{lemma}\label{nupdatesNonempty}
       
   970 	If for all elements $o \in \textit{set} \; Ss$,
       
   971 	$\nString \; o$ holds, the we have that
       
   972 	for all elements $o' \in \textit{set} \; (\nupdates \; s \; r \; Ss)$,
       
   973 	$\nString \; o'$ holds.
       
   974 \end{lemma}
       
   975 \begin{proof}
       
   976 	By an induction on $s$, where $Ss$ is set to vary over all possible values.
       
   977 \end{proof}
       
   978 
       
   979 \noindent
       
   980 
       
   981 \begin{lemma}\label{ntimesClosedFormsSteps}
       
   982 	The following list of equalities or rewriting relations hold:\\
       
   983 	(i) $r^{\{n+1\}} \backslash_{rsimps} (c::s) = 
       
   984 	\textit{rsimp} \; (\sum (\map \; (\opterm \;r \;\_) \; (\nupdates \;
       
   985 	s \; r \; [\Some \; ([c], n)])))$\\
       
   986 	(ii)
       
   987 	\begin{center}
       
   988 	$\sum (\map \; (\opterm \; r) \; (\nupdates \; s \; r \; [
       
   989 	\Some \; ([c], n)]))$ \\ $ \sequal$\\
       
   990 	 $\sum (\map \; (\textit{rsimp} \circ (\opterm \; r))\; (\nupdates \;
       
   991 	 s\;r \; [\Some \; ([c], n)]))$\\
       
   992  	\end{center}
       
   993 	(iii)
       
   994 	\begin{center}
       
   995 	$\sum \;(\map \; (\optermosimp \; r) \; (\nupdates \; s \; r\; [\Some \;
       
   996 	([c], n)]))$\\ 
       
   997 	$\sequal$\\
       
   998 	 $\sum \;(\map \; (\optermsimp r) \; (\nupdates \; s \; r \; [\Some \;
       
   999 	([c], n)])) $\\
       
  1000 	\end{center}
       
  1001 	(iv)
       
  1002 	\begin{center}
       
  1003 	$\sum \;(\map \; (\optermosimp \; r) \; (\nupdates \; s \; r\; [\Some \;
       
  1004 	([c], n)])) $ \\ $\sequal$\\
       
  1005 	 $\sum \;(\map \; (\optermOsimp r) \; (\nupdates \; s \; r \; [\Some \;
       
  1006 	([c], n)])) $\\
       
  1007 	\end{center}
       
  1008 	(v)
       
  1009 	\begin{center}
       
  1010 	 $\sum \;(\map \; (\optermOsimp r) \; (\nupdates \; s \; r \; [\Some \;
       
  1011 	 ([c], n)])) $ \\ $\sequal$\\
       
  1012 	  $\sum \; (\map \; (\textit{rsimp} \circ (\opterm \; r)) \;
       
  1013 	  (\nupdates \; s \; r \; [\Some \; ([c], n)]))$
       
  1014   	\end{center}
       
  1015 \end{lemma}
       
  1016 \begin{proof}
       
  1017 	Routine.
       
  1018 	(iii) and (iv) make use of the fact that all the strings $s$
       
  1019 	inside $\Some \; (s, m)$ which are elements of the list
       
  1020 	$\nupdates \; s\;r\;[\Some\; ([c], n)]$ are non-empty,
       
  1021 	which is from lemma \ref{nupdatesNonempty}.
       
  1022 	Once the string in $o = \Some \; (s, n)$ is 
       
  1023 	nonempty, $\optermsimp \; r \;o$,
       
  1024 	$\optermosimp \; r \; o$ and $\optermosimp \; \; o$ are guaranteed
       
  1025 	to be equal.
       
  1026 	(v) uses \ref{nupdateInduct1}.
       
  1027 \end{proof}
       
  1028 \noindent
       
  1029 Now we are ready to present the closed form for NTIMES:
       
  1030 \begin{theorem}\label{ntimesClosedForm}
       
  1031 	The derivative of $r^{\{n+1\}}$ can be described as an alternative
       
  1032 	containing a list
       
  1033 	of terms:\\
       
  1034 	$r^{\{n+1\}} \backslash_{rsimps} (c::s) = \textit{rsimp} \; (
       
  1035 	\sum (\map \; (\optermsimp \; r) \; (\nupdates \; s \; r \;
       
  1036 	[\Some \; ([c], n)])))$
       
  1037 \end{theorem}
       
  1038 \begin{proof}
       
  1039 	By the rewriting steps described in lemma \ref{ntimesClosedFormsSteps}.
       
  1040 \end{proof}
       
  1041 \noindent
       
  1042 The key observation for bounding this closed form
       
  1043 is that the counter on $r^{\{n\}}$ will 
       
  1044 only decrement during derivatives:
       
  1045 \begin{lemma}\label{nupdatesNLeqN}
       
  1046 	For an element $o$ in $\textit{set} \; (\nupdates \; s \; r \;
       
  1047 	[\Some \; ([c], n)])$, either $o = \None$, or $o = \Some
       
  1048 	\; (s', m)$ for some string $s'$ and number $m \leq n$.
       
  1049 \end{lemma}
       
  1050 \noindent
       
  1051 The proof is routine and therefore omitted.
       
  1052 This allows us to say what kind of terms
       
  1053 are in the list $\textit{set} \; (\map \; (\optermsimp \; r) \; (
       
  1054 \nupdates \; s \; r \; [\Some \; ([c], n)]))$:
       
  1055 only $\ZERO_r$s or a sequence with the tail an $r^{\{m\}}$
       
  1056 with a small $m$:
       
  1057 \begin{lemma}\label{ntimesClosedFormListElemShape}
       
  1058 	For any element $r'$ in $\textit{set} \; (\map \; (\optermsimp \; r) \; (
       
  1059 	\nupdates \; s \; r \; [\Some \; ([c], n)]))$,
       
  1060 	we have that $r'$ is either $\ZERO$ or $r \backslash_{rsimps} s' \cdot
       
  1061 	r^{\{m\}}$ for some string $s'$ and number $m \leq n$.
       
  1062 \end{lemma}
       
  1063 \begin{proof}
       
  1064 	Using lemma \ref{nupdatesNLeqN}.
       
  1065 \end{proof}
       
  1066 
       
  1067 \begin{theorem}\label{ntimesClosedFormBounded}
       
  1068 	Assuming that for any string $s$, $\llbracket r \backslash_{rsimps} s
       
  1069 	\rrbracket_r \leq N$ holds, then we have that\\
       
  1070 	$\llbracket r^{\{n+1\}} \backslash_{rsimps} s \rrbracket_r \leq
       
  1071 	\textit{max} \; (c_N+1)* (N + \llbracket r^{\{n\}} \rrbracket+1)$,
       
  1072 	where $c_N = \textit{card} \; (\textit{sizeNregex} \; (
       
  1073 	N + \llbracket r^{\{n\}} \rrbracket_r+1))$.
       
  1074 \end{theorem}
       
  1075 \begin{proof}
       
  1076 We have that for all regular expressions $r'$ in $\textit{set} \; (\map \; (\optermsimp \; r) \; (
       
  1077 	\nupdates \; s \; r \; [\Some \; ([c], n)]))$,
       
  1078 	$r'$'s size is less than or equal to $N + \llbracket r^{\{n\}} 
       
  1079 	\rrbracket_r + 1$
       
  1080 because $r'$ can only be either a $\ZERO$ or $r \backslash_{rsimps} s' \cdot
       
  1081 r^{\{m\}}$ for some string $s'$ and number 
       
  1082 $m \leq n$ (lemma \ref{ntimesClosedFormListElemShape}).
       
  1083 In addition, we know that the list 
       
  1084 $\map \; (\optermsimp \; r) \; (
       
  1085 \nupdates \; s \; r \; [\Some \; ([c], n)])$'s size is at most
       
  1086 $c_N = \textit{card} \; 
       
  1087 (\sizeNregex \; ((N + \llbracket r^{\{n\}} \rrbracket) + 1))$.
       
  1088 This gives us $\llbracket r \backslash_{rsimps} \;s \rrbracket_r
       
  1089 \leq N * c_N$.
       
  1090 \end{proof}
       
  1091 
       
  1092 We aim to formalise the correctness and size bound
       
  1093 for constructs like $r^{\{\ldots n\}}$, $r^{\{n \ldots\}}$
       
  1094 and so on, which is still work in progress.
       
  1095 They should more or less follow the same recipe described in this section.
       
  1096 Once we know about how to deal with them recursively using suitable auxiliary
       
  1097 definitions, we are able to routinely establish the proofs.
       
  1098 
   594 
  1099 
   595 
  1100 %The closed form for them looks like:
   596 %The closed form for them looks like:
  1101 %%\begin{center}
   597 %%\begin{center}
  1102 %%	\begin{tabular}{llrclll}
   598 %%	\begin{tabular}{llrclll}