ChengsongTanPhdThesis/Chapters/Finite.tex
changeset 625 b797c9a709d9
parent 624 8ffa28fce271
child 638 dd9dde2d902b
equal deleted inserted replaced
624:8ffa28fce271 625:b797c9a709d9
    43 		The bound is universal for a given regular expression, 
    43 		The bound is universal for a given regular expression, 
    44 		which is an advantage over work which 
    44 		which is an advantage over work which 
    45 		only gives empirical evidence on 
    45 		only gives empirical evidence on 
    46 		some test cases (see Verbatim++ work\cite{Verbatimpp}).
    46 		some test cases (see Verbatim++ work\cite{Verbatimpp}).
    47 \end{itemize}
    47 \end{itemize}
       
    48 \noindent
       
    49 We then extend our $\blexersimp$
       
    50 to support bounded repetitions ($r^{\{n\}}$).
       
    51 We update our formalisation of 
       
    52 the correctness and finiteness properties to
       
    53 include this new construct. 
       
    54 we can out-compete other verified lexers such as
       
    55 Verbatim++ on bounded regular expressions.
       
    56 
    48 In the next section we describe in more detail
    57 In the next section we describe in more detail
    49 what the finite bound means in our algorithm
    58 what the finite bound means in our algorithm
    50 and why the size of the internal data structures of
    59 and why the size of the internal data structures of
    51 a typical derivative-based lexer such as
    60 a typical derivative-based lexer such as
    52 Sulzmann and Lu's need formal treatment.
    61 Sulzmann and Lu's need formal treatment.
       
    62 
       
    63 
       
    64 
    53 
    65 
    54 \section{Formalising About Size}
    66 \section{Formalising About Size}
    55 \noindent
    67 \noindent
    56 In our lexer ($\blexersimp$),
    68 In our lexer ($\blexersimp$),
    57 we take an annotated regular expression as input,
    69 we take an annotated regular expression as input,
  2589 
  2601 
  2590 
  2602 
  2591 %-----------------------------------
  2603 %-----------------------------------
  2592 %	SECTION 2
  2604 %	SECTION 2
  2593 %-----------------------------------
  2605 %-----------------------------------
       
  2606 
       
  2607 \section{Bounded Repetitions}
       
  2608 We have promised in chapter \ref{Introduction}
       
  2609 that our lexing algorithm can potentially be extended
       
  2610 to handle bounded repetitions
       
  2611 in natural and elegant ways.
       
  2612 Now we fulfill our promise by adding support for 
       
  2613 the ``exactly-$n$-times'' bounded regular expression $r^{\{n\}}$.
       
  2614 We add clauses in our derivatives-based lexing algorithms (with simplifications)
       
  2615 introduced in chapter \ref{Bitcoded2}.
       
  2616 
       
  2617 \subsection{Augmented Definitions}
       
  2618 There are a number of definitions that need to be augmented.
       
  2619 The most notable one would be the POSIX rules for $r^{\{n\}}$:
       
  2620 \begin{center}
       
  2621 	\begin{mathpar}
       
  2622 		\inferrule{\forall v \in vs_1. \vdash v:r \land 
       
  2623 		|v| \neq []\\ \forall v \in vs_2. \vdash v:r \land |v| = []\\
       
  2624 		\textit{length} \; (vs_1 @ vs_2) = n}{\textit{Stars} \;
       
  2625 		(vs_1 @ vs_2) : r^{\{n\}} }
       
  2626 	\end{mathpar}
       
  2627 \end{center}
       
  2628 As Ausaf had pointed out \cite{Ausaf},
       
  2629 sometimes empty iterations have to be taken to get
       
  2630 a match with exactly $n$ repetitions,
       
  2631 and hence the $vs_2$ part.
       
  2632 
       
  2633 Another important definition would be the size:
       
  2634 \begin{center}
       
  2635 	\begin{tabular}{lcl}
       
  2636 		$\llbracket r^{\{n\}} \rrbracket_r$ & $\dn$ & 
       
  2637 		$\llbracket r \rrbracket_r + n$\\
       
  2638 	\end{tabular}
       
  2639 \end{center}
       
  2640 \noindent
       
  2641 Arguably we should use $\log \; n$ for the size because
       
  2642 the number of digits increase logarithmically w.r.t $n$.
       
  2643 For simplicity we choose to add the counter directly to the size.
       
  2644 
       
  2645 The derivative w.r.t a bounded regular expression
       
  2646 is given as 
       
  2647 \begin{center}
       
  2648 	\begin{tabular}{lcl}
       
  2649 		$r^{\{n\}} \backslash_r c$ & $\dn$ & 
       
  2650 		$r\backslash_r c \cdot r^{\{n-1\}} \;\; \textit{if} \; n \geq 1$\\
       
  2651 					   & & $\RZERO \;\quad \quad\quad \quad
       
  2652 					   \textit{otherwise}$\\
       
  2653 	\end{tabular}
       
  2654 \end{center}
       
  2655 \noindent
       
  2656 For brevity, we sometimes use NTIMES to refer to bounded 
       
  2657 regular expressions.
       
  2658 The $\mkeps$ function clause for NTIMES would be
       
  2659 \begin{center}
       
  2660 	\begin{tabular}{lcl}
       
  2661 		$\mkeps \; r^{\{n\}} $ & $\dn$ & $\Stars \;
       
  2662 		(\textit{replicate} \; n\; (\mkeps \; r))$
       
  2663 	\end{tabular}
       
  2664 \end{center}
       
  2665 \noindent
       
  2666 The injection looks like
       
  2667 \begin{center}
       
  2668 	\begin{tabular}{lcl}
       
  2669 		$\inj \; r^{\{n\}} \; c\; (\Seq \;v \; (\Stars \; vs)) $ & 
       
  2670 		$\dn$ & $\Stars \;
       
  2671 		((\inj \; r \;c \;v ) :: vs)$
       
  2672 	\end{tabular}
       
  2673 \end{center}
       
  2674 \noindent
       
  2675 
       
  2676 
       
  2677 \subsection{Proofs for the Augmented Lexing Algorithm}
       
  2678 We need to maintain two proofs with the additional $r^{\{n\}}$
       
  2679 construct: the 
       
  2680 correctness proof in chapter \ref{Bitcoded2},
       
  2681 and the finiteness proof in chapter \ref{Finite}.
       
  2682 
       
  2683 \subsubsection{Correctness Proof Augmentation}
       
  2684 The correctness of $\textit{lexer}$ and $\textit{blexer}$ with bounded repetitions
       
  2685 have been proven by Ausaf and Urban\cite{AusafDyckhoffUrban2016}.
       
  2686 As they have commented, once the definitions are in place,
       
  2687 the proofs given for the basic regular expressions will extend to
       
  2688 bounded regular expressions, and there are no ``surprises''.
       
  2689 We confirm this point because the correctness theorem would also
       
  2690 extend without surprise to $\blexersimp$.
       
  2691 The rewrite rules such as $\rightsquigarrow$, $\stackrel{s}{\rightsquigarrow}$ and so on
       
  2692 do not need to be changed,
       
  2693 and only a few lemmas such as lemma \ref{fltsPreserves} need to be adjusted to 
       
  2694 add one more line which can be solved by sledgehammer 
       
  2695 to solve the $r^{\{n\}}$ inductive case.
       
  2696 
       
  2697 
       
  2698 \subsubsection{Finiteness Proof Augmentation}
       
  2699 The bounded repetitions are
       
  2700 very similar to stars, and therefore the treatment
       
  2701 is similar, with minor changes to handle some slight complications
       
  2702 when the counter reaches 0.
       
  2703 The exponential growth is similar:
       
  2704 \begin{center}
       
  2705 	\begin{tabular}{ll}
       
  2706 		$r^{\{n\}} $ & $\longrightarrow_{\backslash c}$\\
       
  2707 		$(r\backslash c)  \cdot  
       
  2708 		r^{\{n - 1\}}*$ & $\longrightarrow_{\backslash c'}$\\
       
  2709 		\\
       
  2710 		$r \backslash cc'  \cdot r^{\{n - 2\}}* + 
       
  2711 		r \backslash c' \cdot r^{\{n - 1\}}*$ &
       
  2712 		$\longrightarrow_{\backslash c''}$\\
       
  2713 		\\
       
  2714 		$(r_1 \backslash cc'c'' \cdot r^{\{n-3\}}* + 
       
  2715 		r \backslash c''\cdot r^{\{n-1\}}) + 
       
  2716 		(r \backslash c'c'' \cdot r^{\{n-2\}}* + 
       
  2717 		r \backslash c'' \cdot r^{\{n-1\}}*)$ & 
       
  2718 		$\longrightarrow_{\backslash c'''}$ \\
       
  2719 		\\
       
  2720 		$\ldots$\\
       
  2721 	\end{tabular}
       
  2722 \end{center}
       
  2723 Again, we assume that $r\backslash c$, $r \backslash cc'$ and so on
       
  2724 are all nullable.
       
  2725 The flattened list of terms for $r^{\{n\}} \backslash_{rs} s$
       
  2726 \begin{center}
       
  2727 	$[r_1 \backslash cc'c'' \cdot r^{\{n-3\}}*,\;
       
  2728 	r \backslash c''\cdot r^{\{n-1\}}, \; 
       
  2729 	r \backslash c'c'' \cdot r^{\{n-2\}}*, \;
       
  2730 	r \backslash c'' \cdot r^{\{n-1\}}*,\; \ldots ]$  
       
  2731 \end{center}
       
  2732 that comes from 
       
  2733 \begin{center}
       
  2734 		$(r_1 \backslash cc'c'' \cdot r^{\{n-3\}}* + 
       
  2735 		r \backslash c''\cdot r^{\{n-1\}}) + 
       
  2736 		(r \backslash c'c'' \cdot r^{\{n-2\}}* + 
       
  2737 		r \backslash c'' \cdot r^{\{n-1\}}*)+ \ldots$ 
       
  2738 \end{center}
       
  2739 are made of sequences with different tails, where the counters
       
  2740 might differ.
       
  2741 The observation for maintaining the bound is that
       
  2742 these counters never exceed $n$, the original
       
  2743 counter. With the number of counters staying finite,
       
  2744 $\rDistinct$ will deduplicate and keep the list finite.
       
  2745 We introduce this idea as a lemma once we describe all
       
  2746 the necessary helper functions.
       
  2747 
       
  2748 Similar to the star case, we want
       
  2749 \begin{center}
       
  2750 	$\rderssimp{r^{\{n\}}}{s} = \rsimp{\sum rs}$.
       
  2751 \end{center}
       
  2752 where $rs$
       
  2753 shall be in the form of 
       
  2754 $\map \; f \; Ss$, where $f$ is a function and
       
  2755 $Ss$ a list of objects to act on.
       
  2756 For star, the object's datatype is string.
       
  2757 The list of strings $Ss$
       
  2758 is generated using functions 
       
  2759 $\starupdate$ and $\starupdates$.
       
  2760 The function that takes a string and returns a regular expression
       
  2761 is the anonymous function $
       
  2762 (\lambda s'. \; r\backslash s' \cdot r^{\{m\}})$.
       
  2763 In the NTIMES setting,
       
  2764 the $\starupdate$ and $\starupdates$ functions are replaced by 
       
  2765 $\textit{nupdate}$ and $\textit{nupdates}$:
       
  2766 \begin{center}
       
  2767 	\begin{tabular}{lcl}
       
  2768 		$\nupdate \; c \; r \; [] $ & $\dn$ & $[]$\\
       
  2769 		$\nupdate \; c \; r \; 
       
  2770 		(\Some \; (s, \; n + 1) \; :: \; Ss)$ & $\dn$ & %\\
       
  2771 						     $\textit{if} \; 
       
  2772 						     (\rnullable \; (r \backslash_{rs} s))$ \\
       
  2773 						     & & $\;\;\textit{then} 
       
  2774 						     \;\; \Some \; (s @ [c], n + 1) :: \Some \; ([c], n) :: (
       
  2775 						     \nupdate \; c \; r \; Ss)$ \\
       
  2776 						     & & $\textit{else} \;\; \Some \; (s @ [c], n+1) :: (
       
  2777 						     \nupdate \; c \; r \; Ss)$\\
       
  2778 		$\nupdate \; c \; r \; (\textit{None} :: Ss)$ & $\dn$ & 
       
  2779 		$(\None :: \nupdate  \; c \; r \; Ss)$\\
       
  2780 							      & & \\
       
  2781 	%\end{tabular}
       
  2782 %\end{center}
       
  2783 %\begin{center}
       
  2784 	%\begin{tabular}{lcl}
       
  2785 		$\nupdates \; [] \; r \; Ss$ & $\dn$ & $Ss$\\
       
  2786 		$\nupdates \; (c :: cs) \; r \; Ss$ &  $\dn$ &  $\nupdates \; cs \; r \; (
       
  2787 		\nupdate \; c \; r \; Ss)$
       
  2788 	\end{tabular}
       
  2789 \end{center}
       
  2790 \noindent
       
  2791 which take into account when a subterm
       
  2792 \begin{center}
       
  2793 	$r \backslash_s s \cdot r^{\{n\}}$
       
  2794 \end{center}
       
  2795 counter $n$
       
  2796 is 0, and therefore expands to 
       
  2797 \begin{center}
       
  2798 $r \backslash_s (s@[c]) \cdot r^{\{n\}} \;+
       
  2799 \; \ZERO$ 
       
  2800 \end{center}
       
  2801 after taking a derivative.
       
  2802 The object now has type 
       
  2803 \begin{center}
       
  2804 $\textit{option} \;(\textit{string}, \textit{nat})$
       
  2805 \end{center}
       
  2806 and therefore the function for converting such an option into
       
  2807 a regular expression term is called $\opterm$:
       
  2808 
       
  2809 \begin{center}
       
  2810 	\begin{tabular}{lcl}
       
  2811 	$\opterm \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\
       
  2812 				 & & $\;\;\Some \; (s, n) \Rightarrow 
       
  2813 				 (r\backslash_{rs} s)\cdot r^{\{n\}}$\\
       
  2814 				 & & $\;\;\None  \Rightarrow 
       
  2815 				 \ZERO$\\
       
  2816 	\end{tabular}
       
  2817 \end{center}
       
  2818 \noindent
       
  2819 Put together, the list $\map \; f \; Ss$ is instantiated as
       
  2820 \begin{center}
       
  2821 	$\map \; (\opterm \; r) \; (\nupdates \; s \; r \;
       
  2822 	[\Some \; ([c], n)])$.
       
  2823 \end{center}
       
  2824 For the closed form to be bounded, we would like
       
  2825 simplification to be applied to each term in the list.
       
  2826 Therefore we introduce some variants of $\opterm$,
       
  2827 which help conveniently express the rewriting steps 
       
  2828 needed in the closed form proof.
       
  2829 \begin{center}
       
  2830 	\begin{tabular}{lcl}
       
  2831 	$\optermOsimp \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\
       
  2832 				 & & $\;\;\Some \; (s, n) \Rightarrow 
       
  2833 				 \textit{rsimp} \; ((r\backslash_{rs} s)\cdot r^{\{n\}})$\\
       
  2834 				 & & $\;\;\None  \Rightarrow 
       
  2835 				 \ZERO$\\
       
  2836 				 \\
       
  2837 	$\optermosimp \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\
       
  2838 				 & & $\;\;\Some \; (s, n) \Rightarrow 
       
  2839 				 (\textit{rsimp} \; (r\backslash_{rs} s)) 
       
  2840 				 \cdot r^{\{n\}}$\\
       
  2841 				 & & $\;\;\None  \Rightarrow 
       
  2842 				 \ZERO$\\
       
  2843 				 \\
       
  2844 	$\optermsimp \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\
       
  2845 				 & & $\;\;\Some \; (s, n) \Rightarrow 
       
  2846 				 (r\backslash_{rsimps} s)\cdot r^{\{n\}}$\\
       
  2847 				 & & $\;\;\None  \Rightarrow 
       
  2848 				 \ZERO$\\
       
  2849 	\end{tabular}
       
  2850 \end{center}
       
  2851 
       
  2852 
       
  2853 For a list of 
       
  2854 $\textit{option} \;(\textit{string}, \textit{nat})$ elements,
       
  2855 we define the highest power for it recursively:
       
  2856 \begin{center}
       
  2857 	\begin{tabular}{lcl}
       
  2858 		$\hpa \; [] \; n $ & $\dn$ & $n$\\
       
  2859 		$\hpa \; (\None :: os) \; n $ &  $\dn$ &  $\hpa \; os \; n$\\
       
  2860 		$\hpa \; (\Some \; (s, n) :: os) \; m$ & $\dn$ & 
       
  2861 		$\hpa \;os \; (\textit{max} \; n\; m)$\\
       
  2862 		\\
       
  2863 		$\hpower \; rs $ & $\dn$ & $\hpa \; rs \; 0$\\
       
  2864 	\end{tabular}
       
  2865 \end{center}
       
  2866 
       
  2867 Now the intuition that an NTIMES regular expression's power
       
  2868 does not increase can be easily expressed as
       
  2869 \begin{lemma}\label{nupdatesMono2}
       
  2870 	$\hpower \; (\nupdates \;s \; r \; [\Some \; ([c], n)]) \leq n$
       
  2871 \end{lemma}
       
  2872 \begin{proof}
       
  2873 	Note that the power is non-increasing after a $\nupdate$ application:
       
  2874 	\begin{center}
       
  2875 		$\hpa \;\; (\nupdate \; c \; r \; Ss)\;\; m \leq 
       
  2876 		 \hpa\; \; Ss \; m$.
       
  2877 	 \end{center}
       
  2878 	 This is also the case for $\nupdates$:
       
  2879 	\begin{center}
       
  2880 		$\hpa \;\; (\nupdates \; s \; r \; Ss)\;\; m \leq 
       
  2881 		 \hpa\; \; Ss \; m$.
       
  2882 	 \end{center}
       
  2883 	 Therefore we have that
       
  2884 	 \begin{center}
       
  2885 		 $\hpower \;\; (\nupdates \; s \; r \; Ss) \leq
       
  2886 		  \hpower \;\; Ss$
       
  2887 	 \end{center}
       
  2888 	 which leads to the lemma being proven.
       
  2889 
       
  2890  \end{proof}
       
  2891 
       
  2892 
       
  2893 We also define the inductive rules for
       
  2894 the shape of derivatives of the NTIMES regular expressions:\\[-3em]
       
  2895 \begin{center}
       
  2896 	\begin{mathpar}
       
  2897 		\inferrule{\mbox{}}{\cbn \;\ZERO}
       
  2898 
       
  2899 		\inferrule{\mbox{}}{\cbn \; \; r_a \cdot (r^{\{n\}})}
       
  2900 
       
  2901 		\inferrule{\cbn \; r_1 \;\; \; \cbn \; r_2}{\cbn \; r_1 + r_2}
       
  2902 
       
  2903 		\inferrule{\cbn \; r}{\cbn \; r + \ZERO}
       
  2904 	\end{mathpar}
       
  2905 \end{center}
       
  2906 \noindent
       
  2907 A derivative of NTIMES fits into the shape described by $\cbn$:
       
  2908 \begin{lemma}\label{ntimesDersCbn}
       
  2909 	$\cbn \; ((r' \cdot r^{\{n\}}) \backslash_{rs} s)$ holds.
       
  2910 \end{lemma}
       
  2911 \begin{proof}
       
  2912 	By a reverse induction on $s$.
       
  2913 	For the inductive case, note that if $\cbn \; r$ holds,
       
  2914 	then $\cbn \; (r\backslash_r c)$ holds.
       
  2915 \end{proof}
       
  2916 \noindent
       
  2917 In addition, for $\cbn$-shaped regular expressioins, one can flatten
       
  2918 them:
       
  2919 \begin{lemma}\label{ntimesHfauPushin}
       
  2920 	If $\cbn \; r$ holds, then $\hflataux{r \backslash_r c} = 
       
  2921 	\textit{concat} \; (\map \; \hflataux{\map \; (\_\backslash_r c) \;
       
  2922 	(\hflataux{r})})$
       
  2923 \end{lemma}
       
  2924 \begin{proof}
       
  2925 	By an induction on the inductive cases of $\cbn$.
       
  2926 \end{proof}
       
  2927 \noindent
       
  2928 This time we do not need to define the flattening functions for NTIMES only,
       
  2929 because $\hflat{\_}$ and $\hflataux{\_}$ work on NTIMES already.
       
  2930 \begin{lemma}\label{ntimesHfauInduct}
       
  2931 $\hflataux{( (r\backslash_r c) \cdot r^{\{n\}}) \backslash_{rsimps} s} = 
       
  2932  \map \; (\opterm \; r) \; (\nupdates \; s \; r \; [\Some \; ([c], n)])$
       
  2933 \end{lemma}
       
  2934 \begin{proof}
       
  2935 	By a reverse induction on $s$.
       
  2936 	The lemmas \ref{ntimesHfauPushin} and \ref{ntimesDersCbn} are used.
       
  2937 \end{proof}
       
  2938 \noindent
       
  2939 We have a recursive property for NTIMES with $\nupdate$ 
       
  2940 similar to that for STAR,
       
  2941 and one for $\nupdates $ as well:
       
  2942 \begin{lemma}\label{nupdateInduct1}
       
  2943 	\mbox{}
       
  2944 	\begin{itemize}
       
  2945 		\item
       
  2946 			\begin{center}
       
  2947 	 $\textit{concat} \; (\map \; (\hflataux{\_} \circ (
       
  2948 	\opterm \; r)) \; Ss) = \map \; (\opterm \; r) \; (\nupdate \;
       
  2949 	c \; r \; Ss)$\\
       
  2950 	\end{center}
       
  2951 	holds.
       
  2952 \item
       
  2953 	\begin{center}
       
  2954 	 $\textit{concat} \; (\map \; \hflataux{\_}\; 
       
  2955 	\map \; (\_\backslash_r x) \;
       
  2956 		(\map \; (\opterm \; r) \; (\nupdates \; xs \; r \; Ss)))$\\
       
  2957 		$=$\\
       
  2958 	$\map \; (\opterm \; r) \; (\nupdates \;(xs@[x]) \; r\;Ss)$ 
       
  2959 	\end{center}
       
  2960 	holds.
       
  2961 	\end{itemize}
       
  2962 \end{lemma}
       
  2963 \begin{proof}
       
  2964 	(i) is by an induction on $Ss$.
       
  2965 	(ii) is by an induction on $xs$.
       
  2966 \end{proof}
       
  2967 \noindent
       
  2968 The $\nString$ predicate is defined for conveniently
       
  2969 expressing that there are no empty strings in the
       
  2970 $\Some \;(s, n)$ elements generated by $\nupdate$:
       
  2971 \begin{center}
       
  2972 	\begin{tabular}{lcl}
       
  2973 		$\nString \; \None$  & $\dn$ & $ \textit{true}$\\
       
  2974 		$\nString \; (\Some \; ([], n))$ & $\dn$ & $ \textit{false}$\\
       
  2975 		$\nString \; (\Some \; (c::s, n))$  & $\dn$ & $ \textit{true}$\\
       
  2976 	\end{tabular}
       
  2977 \end{center}
       
  2978 \begin{lemma}\label{nupdatesNonempty}
       
  2979 	If for all elements $o \in \textit{set} \; Ss$,
       
  2980 	$\nString \; o$ holds, the we have that
       
  2981 	for all elements $o' \in \textit{set} \; (\nupdates \; s \; r \; Ss)$,
       
  2982 	$\nString \; o'$ holds.
       
  2983 \end{lemma}
       
  2984 \begin{proof}
       
  2985 	By an induction on $s$, where $Ss$ is set to vary over all possible values.
       
  2986 \end{proof}
       
  2987 
       
  2988 \noindent
       
  2989 
       
  2990 \begin{lemma}\label{ntimesClosedFormsSteps}
       
  2991 	The following list of equalities or rewriting relations hold:\\
       
  2992 	(i) $r^{\{n+1\}} \backslash_{rsimps} (c::s) = 
       
  2993 	\textit{rsimp} \; (\sum (\map \; (\opterm \;r \;\_) \; (\nupdates \;
       
  2994 	s \; r \; [\Some \; ([c], n)])))$\\
       
  2995 	(ii)
       
  2996 	\begin{center}
       
  2997 	$\sum (\map \; (\opterm \; r) \; (\nupdates \; s \; r \; [
       
  2998 	\Some \; ([c], n)]))$ \\ $ \sequal$\\
       
  2999 	 $\sum (\map \; (\textit{rsimp} \circ (\opterm \; r))\; (\nupdates \;
       
  3000 	 s\;r \; [\Some \; ([c], n)]))$\\
       
  3001  	\end{center}
       
  3002 	(iii)
       
  3003 	\begin{center}
       
  3004 	$\sum \;(\map \; (\optermosimp \; r) \; (\nupdates \; s \; r\; [\Some \;
       
  3005 	([c], n)]))$\\ 
       
  3006 	$\sequal$\\
       
  3007 	 $\sum \;(\map \; (\optermsimp r) \; (\nupdates \; s \; r \; [\Some \;
       
  3008 	([c], n)])) $\\
       
  3009 	\end{center}
       
  3010 	(iv)
       
  3011 	\begin{center}
       
  3012 	$\sum \;(\map \; (\optermosimp \; r) \; (\nupdates \; s \; r\; [\Some \;
       
  3013 	([c], n)])) $ \\ $\sequal$\\
       
  3014 	 $\sum \;(\map \; (\optermOsimp r) \; (\nupdates \; s \; r \; [\Some \;
       
  3015 	([c], n)])) $\\
       
  3016 	\end{center}
       
  3017 	(v)
       
  3018 	\begin{center}
       
  3019 	 $\sum \;(\map \; (\optermOsimp r) \; (\nupdates \; s \; r \; [\Some \;
       
  3020 	 ([c], n)])) $ \\ $\sequal$\\
       
  3021 	  $\sum \; (\map \; (\textit{rsimp} \circ (\opterm \; r)) \;
       
  3022 	  (\nupdates \; s \; r \; [\Some \; ([c], n)]))$
       
  3023   	\end{center}
       
  3024 \end{lemma}
       
  3025 \begin{proof}
       
  3026 	Routine.
       
  3027 	(iii) and (iv) make use of the fact that all the strings $s$
       
  3028 	inside $\Some \; (s, m)$ which are elements of the list
       
  3029 	$\nupdates \; s\;r\;[\Some\; ([c], n)]$ are non-empty,
       
  3030 	which is from lemma \ref{nupdatesNonempty}.
       
  3031 	Once the string in $o = \Some \; (s, n)$ is 
       
  3032 	nonempty, $\optermsimp \; r \;o$,
       
  3033 	$\optermosimp \; r \; o$ and $\optermosimp \; \; o$ are guaranteed
       
  3034 	to be equal.
       
  3035 	(v) uses \ref{nupdateInduct1}.
       
  3036 \end{proof}
       
  3037 \noindent
       
  3038 Now we are ready to present the closed form for NTIMES:
       
  3039 \begin{theorem}\label{ntimesClosedForm}
       
  3040 	The derivative of $r^{\{n+1\}}$ can be described as an alternative
       
  3041 	containing a list
       
  3042 	of terms:\\
       
  3043 	$r^{\{n+1\}} \backslash_{rsimps} (c::s) = \textit{rsimp} \; (
       
  3044 	\sum (\map \; (\optermsimp \; r) \; (\nupdates \; s \; r \;
       
  3045 	[\Some \; ([c], n)])))$
       
  3046 \end{theorem}
       
  3047 \begin{proof}
       
  3048 	By the rewriting steps described in lemma \ref{ntimesClosedFormsSteps}.
       
  3049 \end{proof}
       
  3050 \noindent
       
  3051 The key observation for bounding this closed form
       
  3052 is that the counter on $r^{\{n\}}$ will 
       
  3053 only decrement during derivatives:
       
  3054 \begin{lemma}\label{nupdatesNLeqN}
       
  3055 	For an element $o$ in $\textit{set} \; (\nupdates \; s \; r \;
       
  3056 	[\Some \; ([c], n)])$, either $o = \None$, or $o = \Some
       
  3057 	\; (s', m)$ for some string $s'$ and number $m \leq n$.
       
  3058 \end{lemma}
       
  3059 \noindent
       
  3060 The proof is routine and therefore omitted.
       
  3061 This allows us to say what kind of terms
       
  3062 are in the list $\textit{set} \; (\map \; (\optermsimp \; r) \; (
       
  3063 \nupdates \; s \; r \; [\Some \; ([c], n)]))$:
       
  3064 only $\ZERO_r$s or a sequence with the tail an $r^{\{m\}}$
       
  3065 with a small $m$:
       
  3066 \begin{lemma}\label{ntimesClosedFormListElemShape}
       
  3067 	For any element $r'$ in $\textit{set} \; (\map \; (\optermsimp \; r) \; (
       
  3068 	\nupdates \; s \; r \; [\Some \; ([c], n)]))$,
       
  3069 	we have that $r'$ is either $\ZERO$ or $r \backslash_{rsimps} s' \cdot
       
  3070 	r^{\{m\}}$ for some string $s'$ and number $m \leq n$.
       
  3071 \end{lemma}
       
  3072 \begin{proof}
       
  3073 	Using lemma \ref{nupdatesNLeqN}.
       
  3074 \end{proof}
       
  3075 
       
  3076 \begin{theorem}\label{ntimesClosedFormBounded}
       
  3077 	Assuming that for any string $s$, $\llbracket r \backslash_{rsimps} s
       
  3078 	\rrbracket_r \leq N$ holds, then we have that\\
       
  3079 	$\llbracket r^{\{n+1\}} \backslash_{rsimps} s \rrbracket_r \leq
       
  3080 	\textit{max} \; (c_N+1)* (N + \llbracket r^{\{n\}} \rrbracket+1)$,
       
  3081 	where $c_N = \textit{card} \; (\textit{sizeNregex} \; (
       
  3082 	N + \llbracket r^{\{n\}} \rrbracket_r+1))$.
       
  3083 \end{theorem}
       
  3084 \begin{proof}
       
  3085 We have that for all regular expressions $r'$ in $\textit{set} \; (\map \; (\optermsimp \; r) \; (
       
  3086 	\nupdates \; s \; r \; [\Some \; ([c], n)]))$,
       
  3087 	$r'$'s size is less than or equal to $N + \llbracket r^{\{n\}} 
       
  3088 	\rrbracket_r + 1$
       
  3089 because $r'$ can only be either a $\ZERO$ or $r \backslash_{rsimps} s' \cdot
       
  3090 r^{\{m\}}$ for some string $s'$ and number 
       
  3091 $m \leq n$ (lemma \ref{ntimesClosedFormListElemShape}).
       
  3092 In addition, we know that the list 
       
  3093 $\map \; (\optermsimp \; r) \; (
       
  3094 \nupdates \; s \; r \; [\Some \; ([c], n)])$'s size is at most
       
  3095 $c_N = \textit{card} \; 
       
  3096 (\sizeNregex \; ((N + \llbracket r^{\{n\}} \rrbracket) + 1))$.
       
  3097 This gives us $\llbracket r \backslash_{rsimps} \;s \rrbracket_r
       
  3098 \leq N * c_N$.
       
  3099 \end{proof}
       
  3100 
       
  3101 We aim to formalise the correctness and size bound
       
  3102 for constructs like $r^{\{\ldots n\}}$, $r^{\{n \ldots\}}$
       
  3103 and so on, which is still work in progress.
       
  3104 They should more or less follow the same recipe described in this section.
       
  3105 Once we know about how to deal with them recursively using suitable auxiliary
       
  3106 definitions, we are able to routinely establish the proofs.
  2594 
  3107 
  2595 
  3108 
  2596 %----------------------------------------------------------------------------------------
  3109 %----------------------------------------------------------------------------------------
  2597 %	SECTION 3
  3110 %	SECTION 3
  2598 %----------------------------------------------------------------------------------------
  3111 %----------------------------------------------------------------------------------------