2589 |
2601 |
2590 |
2602 |
2591 %----------------------------------- |
2603 %----------------------------------- |
2592 % SECTION 2 |
2604 % SECTION 2 |
2593 %----------------------------------- |
2605 %----------------------------------- |
|
2606 |
|
2607 \section{Bounded Repetitions} |
|
2608 We have promised in chapter \ref{Introduction} |
|
2609 that our lexing algorithm can potentially be extended |
|
2610 to handle bounded repetitions |
|
2611 in natural and elegant ways. |
|
2612 Now we fulfill our promise by adding support for |
|
2613 the ``exactly-$n$-times'' bounded regular expression $r^{\{n\}}$. |
|
2614 We add clauses in our derivatives-based lexing algorithms (with simplifications) |
|
2615 introduced in chapter \ref{Bitcoded2}. |
|
2616 |
|
2617 \subsection{Augmented Definitions} |
|
2618 There are a number of definitions that need to be augmented. |
|
2619 The most notable one would be the POSIX rules for $r^{\{n\}}$: |
|
2620 \begin{center} |
|
2621 \begin{mathpar} |
|
2622 \inferrule{\forall v \in vs_1. \vdash v:r \land |
|
2623 |v| \neq []\\ \forall v \in vs_2. \vdash v:r \land |v| = []\\ |
|
2624 \textit{length} \; (vs_1 @ vs_2) = n}{\textit{Stars} \; |
|
2625 (vs_1 @ vs_2) : r^{\{n\}} } |
|
2626 \end{mathpar} |
|
2627 \end{center} |
|
2628 As Ausaf had pointed out \cite{Ausaf}, |
|
2629 sometimes empty iterations have to be taken to get |
|
2630 a match with exactly $n$ repetitions, |
|
2631 and hence the $vs_2$ part. |
|
2632 |
|
2633 Another important definition would be the size: |
|
2634 \begin{center} |
|
2635 \begin{tabular}{lcl} |
|
2636 $\llbracket r^{\{n\}} \rrbracket_r$ & $\dn$ & |
|
2637 $\llbracket r \rrbracket_r + n$\\ |
|
2638 \end{tabular} |
|
2639 \end{center} |
|
2640 \noindent |
|
2641 Arguably we should use $\log \; n$ for the size because |
|
2642 the number of digits increase logarithmically w.r.t $n$. |
|
2643 For simplicity we choose to add the counter directly to the size. |
|
2644 |
|
2645 The derivative w.r.t a bounded regular expression |
|
2646 is given as |
|
2647 \begin{center} |
|
2648 \begin{tabular}{lcl} |
|
2649 $r^{\{n\}} \backslash_r c$ & $\dn$ & |
|
2650 $r\backslash_r c \cdot r^{\{n-1\}} \;\; \textit{if} \; n \geq 1$\\ |
|
2651 & & $\RZERO \;\quad \quad\quad \quad |
|
2652 \textit{otherwise}$\\ |
|
2653 \end{tabular} |
|
2654 \end{center} |
|
2655 \noindent |
|
2656 For brevity, we sometimes use NTIMES to refer to bounded |
|
2657 regular expressions. |
|
2658 The $\mkeps$ function clause for NTIMES would be |
|
2659 \begin{center} |
|
2660 \begin{tabular}{lcl} |
|
2661 $\mkeps \; r^{\{n\}} $ & $\dn$ & $\Stars \; |
|
2662 (\textit{replicate} \; n\; (\mkeps \; r))$ |
|
2663 \end{tabular} |
|
2664 \end{center} |
|
2665 \noindent |
|
2666 The injection looks like |
|
2667 \begin{center} |
|
2668 \begin{tabular}{lcl} |
|
2669 $\inj \; r^{\{n\}} \; c\; (\Seq \;v \; (\Stars \; vs)) $ & |
|
2670 $\dn$ & $\Stars \; |
|
2671 ((\inj \; r \;c \;v ) :: vs)$ |
|
2672 \end{tabular} |
|
2673 \end{center} |
|
2674 \noindent |
|
2675 |
|
2676 |
|
2677 \subsection{Proofs for the Augmented Lexing Algorithm} |
|
2678 We need to maintain two proofs with the additional $r^{\{n\}}$ |
|
2679 construct: the |
|
2680 correctness proof in chapter \ref{Bitcoded2}, |
|
2681 and the finiteness proof in chapter \ref{Finite}. |
|
2682 |
|
2683 \subsubsection{Correctness Proof Augmentation} |
|
2684 The correctness of $\textit{lexer}$ and $\textit{blexer}$ with bounded repetitions |
|
2685 have been proven by Ausaf and Urban\cite{AusafDyckhoffUrban2016}. |
|
2686 As they have commented, once the definitions are in place, |
|
2687 the proofs given for the basic regular expressions will extend to |
|
2688 bounded regular expressions, and there are no ``surprises''. |
|
2689 We confirm this point because the correctness theorem would also |
|
2690 extend without surprise to $\blexersimp$. |
|
2691 The rewrite rules such as $\rightsquigarrow$, $\stackrel{s}{\rightsquigarrow}$ and so on |
|
2692 do not need to be changed, |
|
2693 and only a few lemmas such as lemma \ref{fltsPreserves} need to be adjusted to |
|
2694 add one more line which can be solved by sledgehammer |
|
2695 to solve the $r^{\{n\}}$ inductive case. |
|
2696 |
|
2697 |
|
2698 \subsubsection{Finiteness Proof Augmentation} |
|
2699 The bounded repetitions are |
|
2700 very similar to stars, and therefore the treatment |
|
2701 is similar, with minor changes to handle some slight complications |
|
2702 when the counter reaches 0. |
|
2703 The exponential growth is similar: |
|
2704 \begin{center} |
|
2705 \begin{tabular}{ll} |
|
2706 $r^{\{n\}} $ & $\longrightarrow_{\backslash c}$\\ |
|
2707 $(r\backslash c) \cdot |
|
2708 r^{\{n - 1\}}*$ & $\longrightarrow_{\backslash c'}$\\ |
|
2709 \\ |
|
2710 $r \backslash cc' \cdot r^{\{n - 2\}}* + |
|
2711 r \backslash c' \cdot r^{\{n - 1\}}*$ & |
|
2712 $\longrightarrow_{\backslash c''}$\\ |
|
2713 \\ |
|
2714 $(r_1 \backslash cc'c'' \cdot r^{\{n-3\}}* + |
|
2715 r \backslash c''\cdot r^{\{n-1\}}) + |
|
2716 (r \backslash c'c'' \cdot r^{\{n-2\}}* + |
|
2717 r \backslash c'' \cdot r^{\{n-1\}}*)$ & |
|
2718 $\longrightarrow_{\backslash c'''}$ \\ |
|
2719 \\ |
|
2720 $\ldots$\\ |
|
2721 \end{tabular} |
|
2722 \end{center} |
|
2723 Again, we assume that $r\backslash c$, $r \backslash cc'$ and so on |
|
2724 are all nullable. |
|
2725 The flattened list of terms for $r^{\{n\}} \backslash_{rs} s$ |
|
2726 \begin{center} |
|
2727 $[r_1 \backslash cc'c'' \cdot r^{\{n-3\}}*,\; |
|
2728 r \backslash c''\cdot r^{\{n-1\}}, \; |
|
2729 r \backslash c'c'' \cdot r^{\{n-2\}}*, \; |
|
2730 r \backslash c'' \cdot r^{\{n-1\}}*,\; \ldots ]$ |
|
2731 \end{center} |
|
2732 that comes from |
|
2733 \begin{center} |
|
2734 $(r_1 \backslash cc'c'' \cdot r^{\{n-3\}}* + |
|
2735 r \backslash c''\cdot r^{\{n-1\}}) + |
|
2736 (r \backslash c'c'' \cdot r^{\{n-2\}}* + |
|
2737 r \backslash c'' \cdot r^{\{n-1\}}*)+ \ldots$ |
|
2738 \end{center} |
|
2739 are made of sequences with different tails, where the counters |
|
2740 might differ. |
|
2741 The observation for maintaining the bound is that |
|
2742 these counters never exceed $n$, the original |
|
2743 counter. With the number of counters staying finite, |
|
2744 $\rDistinct$ will deduplicate and keep the list finite. |
|
2745 We introduce this idea as a lemma once we describe all |
|
2746 the necessary helper functions. |
|
2747 |
|
2748 Similar to the star case, we want |
|
2749 \begin{center} |
|
2750 $\rderssimp{r^{\{n\}}}{s} = \rsimp{\sum rs}$. |
|
2751 \end{center} |
|
2752 where $rs$ |
|
2753 shall be in the form of |
|
2754 $\map \; f \; Ss$, where $f$ is a function and |
|
2755 $Ss$ a list of objects to act on. |
|
2756 For star, the object's datatype is string. |
|
2757 The list of strings $Ss$ |
|
2758 is generated using functions |
|
2759 $\starupdate$ and $\starupdates$. |
|
2760 The function that takes a string and returns a regular expression |
|
2761 is the anonymous function $ |
|
2762 (\lambda s'. \; r\backslash s' \cdot r^{\{m\}})$. |
|
2763 In the NTIMES setting, |
|
2764 the $\starupdate$ and $\starupdates$ functions are replaced by |
|
2765 $\textit{nupdate}$ and $\textit{nupdates}$: |
|
2766 \begin{center} |
|
2767 \begin{tabular}{lcl} |
|
2768 $\nupdate \; c \; r \; [] $ & $\dn$ & $[]$\\ |
|
2769 $\nupdate \; c \; r \; |
|
2770 (\Some \; (s, \; n + 1) \; :: \; Ss)$ & $\dn$ & %\\ |
|
2771 $\textit{if} \; |
|
2772 (\rnullable \; (r \backslash_{rs} s))$ \\ |
|
2773 & & $\;\;\textit{then} |
|
2774 \;\; \Some \; (s @ [c], n + 1) :: \Some \; ([c], n) :: ( |
|
2775 \nupdate \; c \; r \; Ss)$ \\ |
|
2776 & & $\textit{else} \;\; \Some \; (s @ [c], n+1) :: ( |
|
2777 \nupdate \; c \; r \; Ss)$\\ |
|
2778 $\nupdate \; c \; r \; (\textit{None} :: Ss)$ & $\dn$ & |
|
2779 $(\None :: \nupdate \; c \; r \; Ss)$\\ |
|
2780 & & \\ |
|
2781 %\end{tabular} |
|
2782 %\end{center} |
|
2783 %\begin{center} |
|
2784 %\begin{tabular}{lcl} |
|
2785 $\nupdates \; [] \; r \; Ss$ & $\dn$ & $Ss$\\ |
|
2786 $\nupdates \; (c :: cs) \; r \; Ss$ & $\dn$ & $\nupdates \; cs \; r \; ( |
|
2787 \nupdate \; c \; r \; Ss)$ |
|
2788 \end{tabular} |
|
2789 \end{center} |
|
2790 \noindent |
|
2791 which take into account when a subterm |
|
2792 \begin{center} |
|
2793 $r \backslash_s s \cdot r^{\{n\}}$ |
|
2794 \end{center} |
|
2795 counter $n$ |
|
2796 is 0, and therefore expands to |
|
2797 \begin{center} |
|
2798 $r \backslash_s (s@[c]) \cdot r^{\{n\}} \;+ |
|
2799 \; \ZERO$ |
|
2800 \end{center} |
|
2801 after taking a derivative. |
|
2802 The object now has type |
|
2803 \begin{center} |
|
2804 $\textit{option} \;(\textit{string}, \textit{nat})$ |
|
2805 \end{center} |
|
2806 and therefore the function for converting such an option into |
|
2807 a regular expression term is called $\opterm$: |
|
2808 |
|
2809 \begin{center} |
|
2810 \begin{tabular}{lcl} |
|
2811 $\opterm \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\ |
|
2812 & & $\;\;\Some \; (s, n) \Rightarrow |
|
2813 (r\backslash_{rs} s)\cdot r^{\{n\}}$\\ |
|
2814 & & $\;\;\None \Rightarrow |
|
2815 \ZERO$\\ |
|
2816 \end{tabular} |
|
2817 \end{center} |
|
2818 \noindent |
|
2819 Put together, the list $\map \; f \; Ss$ is instantiated as |
|
2820 \begin{center} |
|
2821 $\map \; (\opterm \; r) \; (\nupdates \; s \; r \; |
|
2822 [\Some \; ([c], n)])$. |
|
2823 \end{center} |
|
2824 For the closed form to be bounded, we would like |
|
2825 simplification to be applied to each term in the list. |
|
2826 Therefore we introduce some variants of $\opterm$, |
|
2827 which help conveniently express the rewriting steps |
|
2828 needed in the closed form proof. |
|
2829 \begin{center} |
|
2830 \begin{tabular}{lcl} |
|
2831 $\optermOsimp \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\ |
|
2832 & & $\;\;\Some \; (s, n) \Rightarrow |
|
2833 \textit{rsimp} \; ((r\backslash_{rs} s)\cdot r^{\{n\}})$\\ |
|
2834 & & $\;\;\None \Rightarrow |
|
2835 \ZERO$\\ |
|
2836 \\ |
|
2837 $\optermosimp \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\ |
|
2838 & & $\;\;\Some \; (s, n) \Rightarrow |
|
2839 (\textit{rsimp} \; (r\backslash_{rs} s)) |
|
2840 \cdot r^{\{n\}}$\\ |
|
2841 & & $\;\;\None \Rightarrow |
|
2842 \ZERO$\\ |
|
2843 \\ |
|
2844 $\optermsimp \; r \; SN$ & $\dn$ & $\textit{case} \; SN\; of$\\ |
|
2845 & & $\;\;\Some \; (s, n) \Rightarrow |
|
2846 (r\backslash_{rsimps} s)\cdot r^{\{n\}}$\\ |
|
2847 & & $\;\;\None \Rightarrow |
|
2848 \ZERO$\\ |
|
2849 \end{tabular} |
|
2850 \end{center} |
|
2851 |
|
2852 |
|
2853 For a list of |
|
2854 $\textit{option} \;(\textit{string}, \textit{nat})$ elements, |
|
2855 we define the highest power for it recursively: |
|
2856 \begin{center} |
|
2857 \begin{tabular}{lcl} |
|
2858 $\hpa \; [] \; n $ & $\dn$ & $n$\\ |
|
2859 $\hpa \; (\None :: os) \; n $ & $\dn$ & $\hpa \; os \; n$\\ |
|
2860 $\hpa \; (\Some \; (s, n) :: os) \; m$ & $\dn$ & |
|
2861 $\hpa \;os \; (\textit{max} \; n\; m)$\\ |
|
2862 \\ |
|
2863 $\hpower \; rs $ & $\dn$ & $\hpa \; rs \; 0$\\ |
|
2864 \end{tabular} |
|
2865 \end{center} |
|
2866 |
|
2867 Now the intuition that an NTIMES regular expression's power |
|
2868 does not increase can be easily expressed as |
|
2869 \begin{lemma}\label{nupdatesMono2} |
|
2870 $\hpower \; (\nupdates \;s \; r \; [\Some \; ([c], n)]) \leq n$ |
|
2871 \end{lemma} |
|
2872 \begin{proof} |
|
2873 Note that the power is non-increasing after a $\nupdate$ application: |
|
2874 \begin{center} |
|
2875 $\hpa \;\; (\nupdate \; c \; r \; Ss)\;\; m \leq |
|
2876 \hpa\; \; Ss \; m$. |
|
2877 \end{center} |
|
2878 This is also the case for $\nupdates$: |
|
2879 \begin{center} |
|
2880 $\hpa \;\; (\nupdates \; s \; r \; Ss)\;\; m \leq |
|
2881 \hpa\; \; Ss \; m$. |
|
2882 \end{center} |
|
2883 Therefore we have that |
|
2884 \begin{center} |
|
2885 $\hpower \;\; (\nupdates \; s \; r \; Ss) \leq |
|
2886 \hpower \;\; Ss$ |
|
2887 \end{center} |
|
2888 which leads to the lemma being proven. |
|
2889 |
|
2890 \end{proof} |
|
2891 |
|
2892 |
|
2893 We also define the inductive rules for |
|
2894 the shape of derivatives of the NTIMES regular expressions:\\[-3em] |
|
2895 \begin{center} |
|
2896 \begin{mathpar} |
|
2897 \inferrule{\mbox{}}{\cbn \;\ZERO} |
|
2898 |
|
2899 \inferrule{\mbox{}}{\cbn \; \; r_a \cdot (r^{\{n\}})} |
|
2900 |
|
2901 \inferrule{\cbn \; r_1 \;\; \; \cbn \; r_2}{\cbn \; r_1 + r_2} |
|
2902 |
|
2903 \inferrule{\cbn \; r}{\cbn \; r + \ZERO} |
|
2904 \end{mathpar} |
|
2905 \end{center} |
|
2906 \noindent |
|
2907 A derivative of NTIMES fits into the shape described by $\cbn$: |
|
2908 \begin{lemma}\label{ntimesDersCbn} |
|
2909 $\cbn \; ((r' \cdot r^{\{n\}}) \backslash_{rs} s)$ holds. |
|
2910 \end{lemma} |
|
2911 \begin{proof} |
|
2912 By a reverse induction on $s$. |
|
2913 For the inductive case, note that if $\cbn \; r$ holds, |
|
2914 then $\cbn \; (r\backslash_r c)$ holds. |
|
2915 \end{proof} |
|
2916 \noindent |
|
2917 In addition, for $\cbn$-shaped regular expressioins, one can flatten |
|
2918 them: |
|
2919 \begin{lemma}\label{ntimesHfauPushin} |
|
2920 If $\cbn \; r$ holds, then $\hflataux{r \backslash_r c} = |
|
2921 \textit{concat} \; (\map \; \hflataux{\map \; (\_\backslash_r c) \; |
|
2922 (\hflataux{r})})$ |
|
2923 \end{lemma} |
|
2924 \begin{proof} |
|
2925 By an induction on the inductive cases of $\cbn$. |
|
2926 \end{proof} |
|
2927 \noindent |
|
2928 This time we do not need to define the flattening functions for NTIMES only, |
|
2929 because $\hflat{\_}$ and $\hflataux{\_}$ work on NTIMES already. |
|
2930 \begin{lemma}\label{ntimesHfauInduct} |
|
2931 $\hflataux{( (r\backslash_r c) \cdot r^{\{n\}}) \backslash_{rsimps} s} = |
|
2932 \map \; (\opterm \; r) \; (\nupdates \; s \; r \; [\Some \; ([c], n)])$ |
|
2933 \end{lemma} |
|
2934 \begin{proof} |
|
2935 By a reverse induction on $s$. |
|
2936 The lemmas \ref{ntimesHfauPushin} and \ref{ntimesDersCbn} are used. |
|
2937 \end{proof} |
|
2938 \noindent |
|
2939 We have a recursive property for NTIMES with $\nupdate$ |
|
2940 similar to that for STAR, |
|
2941 and one for $\nupdates $ as well: |
|
2942 \begin{lemma}\label{nupdateInduct1} |
|
2943 \mbox{} |
|
2944 \begin{itemize} |
|
2945 \item |
|
2946 \begin{center} |
|
2947 $\textit{concat} \; (\map \; (\hflataux{\_} \circ ( |
|
2948 \opterm \; r)) \; Ss) = \map \; (\opterm \; r) \; (\nupdate \; |
|
2949 c \; r \; Ss)$\\ |
|
2950 \end{center} |
|
2951 holds. |
|
2952 \item |
|
2953 \begin{center} |
|
2954 $\textit{concat} \; (\map \; \hflataux{\_}\; |
|
2955 \map \; (\_\backslash_r x) \; |
|
2956 (\map \; (\opterm \; r) \; (\nupdates \; xs \; r \; Ss)))$\\ |
|
2957 $=$\\ |
|
2958 $\map \; (\opterm \; r) \; (\nupdates \;(xs@[x]) \; r\;Ss)$ |
|
2959 \end{center} |
|
2960 holds. |
|
2961 \end{itemize} |
|
2962 \end{lemma} |
|
2963 \begin{proof} |
|
2964 (i) is by an induction on $Ss$. |
|
2965 (ii) is by an induction on $xs$. |
|
2966 \end{proof} |
|
2967 \noindent |
|
2968 The $\nString$ predicate is defined for conveniently |
|
2969 expressing that there are no empty strings in the |
|
2970 $\Some \;(s, n)$ elements generated by $\nupdate$: |
|
2971 \begin{center} |
|
2972 \begin{tabular}{lcl} |
|
2973 $\nString \; \None$ & $\dn$ & $ \textit{true}$\\ |
|
2974 $\nString \; (\Some \; ([], n))$ & $\dn$ & $ \textit{false}$\\ |
|
2975 $\nString \; (\Some \; (c::s, n))$ & $\dn$ & $ \textit{true}$\\ |
|
2976 \end{tabular} |
|
2977 \end{center} |
|
2978 \begin{lemma}\label{nupdatesNonempty} |
|
2979 If for all elements $o \in \textit{set} \; Ss$, |
|
2980 $\nString \; o$ holds, the we have that |
|
2981 for all elements $o' \in \textit{set} \; (\nupdates \; s \; r \; Ss)$, |
|
2982 $\nString \; o'$ holds. |
|
2983 \end{lemma} |
|
2984 \begin{proof} |
|
2985 By an induction on $s$, where $Ss$ is set to vary over all possible values. |
|
2986 \end{proof} |
|
2987 |
|
2988 \noindent |
|
2989 |
|
2990 \begin{lemma}\label{ntimesClosedFormsSteps} |
|
2991 The following list of equalities or rewriting relations hold:\\ |
|
2992 (i) $r^{\{n+1\}} \backslash_{rsimps} (c::s) = |
|
2993 \textit{rsimp} \; (\sum (\map \; (\opterm \;r \;\_) \; (\nupdates \; |
|
2994 s \; r \; [\Some \; ([c], n)])))$\\ |
|
2995 (ii) |
|
2996 \begin{center} |
|
2997 $\sum (\map \; (\opterm \; r) \; (\nupdates \; s \; r \; [ |
|
2998 \Some \; ([c], n)]))$ \\ $ \sequal$\\ |
|
2999 $\sum (\map \; (\textit{rsimp} \circ (\opterm \; r))\; (\nupdates \; |
|
3000 s\;r \; [\Some \; ([c], n)]))$\\ |
|
3001 \end{center} |
|
3002 (iii) |
|
3003 \begin{center} |
|
3004 $\sum \;(\map \; (\optermosimp \; r) \; (\nupdates \; s \; r\; [\Some \; |
|
3005 ([c], n)]))$\\ |
|
3006 $\sequal$\\ |
|
3007 $\sum \;(\map \; (\optermsimp r) \; (\nupdates \; s \; r \; [\Some \; |
|
3008 ([c], n)])) $\\ |
|
3009 \end{center} |
|
3010 (iv) |
|
3011 \begin{center} |
|
3012 $\sum \;(\map \; (\optermosimp \; r) \; (\nupdates \; s \; r\; [\Some \; |
|
3013 ([c], n)])) $ \\ $\sequal$\\ |
|
3014 $\sum \;(\map \; (\optermOsimp r) \; (\nupdates \; s \; r \; [\Some \; |
|
3015 ([c], n)])) $\\ |
|
3016 \end{center} |
|
3017 (v) |
|
3018 \begin{center} |
|
3019 $\sum \;(\map \; (\optermOsimp r) \; (\nupdates \; s \; r \; [\Some \; |
|
3020 ([c], n)])) $ \\ $\sequal$\\ |
|
3021 $\sum \; (\map \; (\textit{rsimp} \circ (\opterm \; r)) \; |
|
3022 (\nupdates \; s \; r \; [\Some \; ([c], n)]))$ |
|
3023 \end{center} |
|
3024 \end{lemma} |
|
3025 \begin{proof} |
|
3026 Routine. |
|
3027 (iii) and (iv) make use of the fact that all the strings $s$ |
|
3028 inside $\Some \; (s, m)$ which are elements of the list |
|
3029 $\nupdates \; s\;r\;[\Some\; ([c], n)]$ are non-empty, |
|
3030 which is from lemma \ref{nupdatesNonempty}. |
|
3031 Once the string in $o = \Some \; (s, n)$ is |
|
3032 nonempty, $\optermsimp \; r \;o$, |
|
3033 $\optermosimp \; r \; o$ and $\optermosimp \; \; o$ are guaranteed |
|
3034 to be equal. |
|
3035 (v) uses \ref{nupdateInduct1}. |
|
3036 \end{proof} |
|
3037 \noindent |
|
3038 Now we are ready to present the closed form for NTIMES: |
|
3039 \begin{theorem}\label{ntimesClosedForm} |
|
3040 The derivative of $r^{\{n+1\}}$ can be described as an alternative |
|
3041 containing a list |
|
3042 of terms:\\ |
|
3043 $r^{\{n+1\}} \backslash_{rsimps} (c::s) = \textit{rsimp} \; ( |
|
3044 \sum (\map \; (\optermsimp \; r) \; (\nupdates \; s \; r \; |
|
3045 [\Some \; ([c], n)])))$ |
|
3046 \end{theorem} |
|
3047 \begin{proof} |
|
3048 By the rewriting steps described in lemma \ref{ntimesClosedFormsSteps}. |
|
3049 \end{proof} |
|
3050 \noindent |
|
3051 The key observation for bounding this closed form |
|
3052 is that the counter on $r^{\{n\}}$ will |
|
3053 only decrement during derivatives: |
|
3054 \begin{lemma}\label{nupdatesNLeqN} |
|
3055 For an element $o$ in $\textit{set} \; (\nupdates \; s \; r \; |
|
3056 [\Some \; ([c], n)])$, either $o = \None$, or $o = \Some |
|
3057 \; (s', m)$ for some string $s'$ and number $m \leq n$. |
|
3058 \end{lemma} |
|
3059 \noindent |
|
3060 The proof is routine and therefore omitted. |
|
3061 This allows us to say what kind of terms |
|
3062 are in the list $\textit{set} \; (\map \; (\optermsimp \; r) \; ( |
|
3063 \nupdates \; s \; r \; [\Some \; ([c], n)]))$: |
|
3064 only $\ZERO_r$s or a sequence with the tail an $r^{\{m\}}$ |
|
3065 with a small $m$: |
|
3066 \begin{lemma}\label{ntimesClosedFormListElemShape} |
|
3067 For any element $r'$ in $\textit{set} \; (\map \; (\optermsimp \; r) \; ( |
|
3068 \nupdates \; s \; r \; [\Some \; ([c], n)]))$, |
|
3069 we have that $r'$ is either $\ZERO$ or $r \backslash_{rsimps} s' \cdot |
|
3070 r^{\{m\}}$ for some string $s'$ and number $m \leq n$. |
|
3071 \end{lemma} |
|
3072 \begin{proof} |
|
3073 Using lemma \ref{nupdatesNLeqN}. |
|
3074 \end{proof} |
|
3075 |
|
3076 \begin{theorem}\label{ntimesClosedFormBounded} |
|
3077 Assuming that for any string $s$, $\llbracket r \backslash_{rsimps} s |
|
3078 \rrbracket_r \leq N$ holds, then we have that\\ |
|
3079 $\llbracket r^{\{n+1\}} \backslash_{rsimps} s \rrbracket_r \leq |
|
3080 \textit{max} \; (c_N+1)* (N + \llbracket r^{\{n\}} \rrbracket+1)$, |
|
3081 where $c_N = \textit{card} \; (\textit{sizeNregex} \; ( |
|
3082 N + \llbracket r^{\{n\}} \rrbracket_r+1))$. |
|
3083 \end{theorem} |
|
3084 \begin{proof} |
|
3085 We have that for all regular expressions $r'$ in $\textit{set} \; (\map \; (\optermsimp \; r) \; ( |
|
3086 \nupdates \; s \; r \; [\Some \; ([c], n)]))$, |
|
3087 $r'$'s size is less than or equal to $N + \llbracket r^{\{n\}} |
|
3088 \rrbracket_r + 1$ |
|
3089 because $r'$ can only be either a $\ZERO$ or $r \backslash_{rsimps} s' \cdot |
|
3090 r^{\{m\}}$ for some string $s'$ and number |
|
3091 $m \leq n$ (lemma \ref{ntimesClosedFormListElemShape}). |
|
3092 In addition, we know that the list |
|
3093 $\map \; (\optermsimp \; r) \; ( |
|
3094 \nupdates \; s \; r \; [\Some \; ([c], n)])$'s size is at most |
|
3095 $c_N = \textit{card} \; |
|
3096 (\sizeNregex \; ((N + \llbracket r^{\{n\}} \rrbracket) + 1))$. |
|
3097 This gives us $\llbracket r \backslash_{rsimps} \;s \rrbracket_r |
|
3098 \leq N * c_N$. |
|
3099 \end{proof} |
|
3100 |
|
3101 We aim to formalise the correctness and size bound |
|
3102 for constructs like $r^{\{\ldots n\}}$, $r^{\{n \ldots\}}$ |
|
3103 and so on, which is still work in progress. |
|
3104 They should more or less follow the same recipe described in this section. |
|
3105 Once we know about how to deal with them recursively using suitable auxiliary |
|
3106 definitions, we are able to routinely establish the proofs. |
2594 |
3107 |
2595 |
3108 |
2596 %---------------------------------------------------------------------------------------- |
3109 %---------------------------------------------------------------------------------------- |
2597 % SECTION 3 |
3110 % SECTION 3 |
2598 %---------------------------------------------------------------------------------------- |
3111 %---------------------------------------------------------------------------------------- |