329 \begin{tabular}{c} |
329 \begin{tabular}{c} |
330 \\[-8mm] |
330 \\[-8mm] |
331 @{thm[mode=Axiom] Prf.intros(4)} \qquad |
331 @{thm[mode=Axiom] Prf.intros(4)} \qquad |
332 @{thm[mode=Axiom] Prf.intros(5)[of "c"]}\\[4mm] |
332 @{thm[mode=Axiom] Prf.intros(5)[of "c"]}\\[4mm] |
333 @{thm[mode=Rule] Prf.intros(2)[of "v\<^sub>1" "r\<^sub>1" "r\<^sub>2"]} \qquad |
333 @{thm[mode=Rule] Prf.intros(2)[of "v\<^sub>1" "r\<^sub>1" "r\<^sub>2"]} \qquad |
334 @{thm[mode=Rule] Prf.intros(3)[of "v\<^sub>2" "r\<^sub>1" "r\<^sub>2"]}\\[4mm] |
334 @{thm[mode=Rule] Prf.intros(3)[of "v\<^sub>2" "r\<^sub>2" "r\<^sub>1"]}\\[4mm] |
335 @{thm[mode=Rule] Prf.intros(1)[of "v\<^sub>1" "r\<^sub>1" "v\<^sub>2" "r\<^sub>2"]} \qquad |
335 @{thm[mode=Rule] Prf.intros(1)[of "v\<^sub>1" "r\<^sub>1" "v\<^sub>2" "r\<^sub>2"]} \qquad |
336 @{thm[mode=Rule] Prf.intros(6)[of "vs" "r"]} |
336 @{thm[mode=Rule] Prf.intros(6)[of "vs" "r"]} |
337 \end{tabular} |
337 \end{tabular} |
338 \end{center} |
338 \end{center} |
339 |
339 |
680 bitcoded regular expressions. |
680 bitcoded regular expressions. |
681 % |
681 % |
682 \begin{center} |
682 \begin{center} |
683 \begin{tabular}{@ {}c@ {}c@ {}} |
683 \begin{tabular}{@ {}c@ {}c@ {}} |
684 \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1mm}}l} |
684 \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1mm}}l} |
685 $\textit{bnullable}\,(\textit{ZERO})$ & $\dn$ & $\textit{false}$\\ |
685 $\textit{bnullable}\,(\textit{ZERO})$ & $\dn$ & $\textit{False}$\\ |
686 $\textit{bnullable}\,(\textit{ONE}\,bs)$ & $\dn$ & $\textit{true}$\\ |
686 $\textit{bnullable}\,(\textit{ONE}\,bs)$ & $\dn$ & $\textit{True}$\\ |
687 $\textit{bnullable}\,(\textit{CHAR}\,bs\,c)$ & $\dn$ & $\textit{false}$\\ |
687 $\textit{bnullable}\,(\textit{CHAR}\,bs\,c)$ & $\dn$ & $\textit{False}$\\ |
688 $\textit{bnullable}\,(\textit{ALTs}\,bs\,\rs)$ & $\dn$ & |
688 $\textit{bnullable}\,(\textit{ALTs}\,bs\,\rs)$ & $\dn$ & |
689 $\exists\, r \in \rs. \,\textit{bnullable}\,r$\\ |
689 $\exists\, r \in \rs. \,\textit{bnullable}\,r$\\ |
690 $\textit{bnullable}\,(\textit{SEQ}\,bs\,r_1\,r_2)$ & $\dn$ & |
690 $\textit{bnullable}\,(\textit{SEQ}\,bs\,r_1\,r_2)$ & $\dn$ & |
691 $\textit{bnullable}\,r_1\wedge \textit{bnullable}\,r_2$\\ |
691 $\textit{bnullable}\,r_1\wedge \textit{bnullable}\,r_2$\\ |
692 $\textit{bnullable}\,(\textit{STAR}\,bs\,r)$ & $\dn$ & |
692 $\textit{bnullable}\,(\textit{STAR}\,bs\,r)$ & $\dn$ & |
693 $\textit{true}$ |
693 $\textit{True}$ |
694 \end{tabular} |
694 \end{tabular} |
695 & |
695 & |
696 \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1mm}}l@ {}} |
696 \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1mm}}l@ {}} |
697 $\textit{bmkeps}\,(\textit{ONE}\,bs)$ & $\dn$ & $bs$\\ |
697 $\textit{bmkeps}\,(\textit{ONE}\,bs)$ & $\dn$ & $bs$\\ |
698 $\textit{bmkeps}\,(\textit{ALTs}\,bs\,\rs)$ & $\dn$ & |
698 $\textit{bmkeps}\,(\textit{ALTs}\,bs\,\rs)$ & $\dn$ & |
710 \end{center} |
710 \end{center} |
711 |
711 |
712 |
712 |
713 \noindent |
713 \noindent |
714 The key function in the bitcoded algorithm is the derivative of a |
714 The key function in the bitcoded algorithm is the derivative of a |
715 bitcoded regular expression. This derivative calculates the |
715 bitcoded regular expression. This derivative function calculates the |
716 derivative but at the same time also the incremental part of the bitsequences |
716 derivative but at the same time also the incremental part of the bitsequences |
717 that contribute to constructing a POSIX value. |
717 that contribute to constructing a POSIX value. |
718 |
718 |
719 \begin{center} |
719 \begin{center} |
720 \begin{tabular}{@ {}lcl@ {}} |
720 \begin{tabular}{@ {}lcl@ {}} |
798 is the same as if we first erase the bitcoded regular expression and |
798 is the same as if we first erase the bitcoded regular expression and |
799 then perform the ``standard'' derivative operation. |
799 then perform the ``standard'' derivative operation. |
800 |
800 |
801 \begin{lemma}\label{bnullable}\mbox{}\smallskip\\ |
801 \begin{lemma}\label{bnullable}\mbox{}\smallskip\\ |
802 \begin{tabular}{ll} |
802 \begin{tabular}{ll} |
803 \textit{(1)} & $(a\backslash s)^\downarrow = (a^\downarrow)\backslash s$\\ |
803 \textit{(1)} & $(r\backslash s)^\downarrow = (r^\downarrow)\backslash s$\\ |
804 \textit{(2)} & $\textit{bnullable}(a)$ iff $\textit{nullable}(a^\downarrow)$\\ |
804 \textit{(2)} & $\textit{bnullable}(r)$ iff $\textit{nullable}(r^\downarrow)$\\ |
805 \textit{(3)} & $\textit{bmkeps}(a) = \textit{retrieve}\,a\,(\textit{mkeps}\,(a^\downarrow))$ provided $\textit{nullable}(a^\downarrow)$. |
805 \textit{(3)} & $\textit{bmkeps}(r) = \textit{retrieve}\,r\,(\textit{mkeps}\,(r^\downarrow))$ provided $\textit{nullable}(r^\downarrow)$. |
806 \end{tabular} |
806 \end{tabular} |
807 \end{lemma} |
807 \end{lemma} |
808 |
808 |
809 \begin{proof} |
809 \begin{proof} |
810 All properties are by induction on annotated regular expressions. There are no |
810 All properties are by induction on annotated regular expressions. There are no |
921 For the \textit{if}-branch suppose $r_d \dn r^\uparrow\backslash s$ and |
921 For the \textit{if}-branch suppose $r_d \dn r^\uparrow\backslash s$ and |
922 $d \dn r\backslash s$. We have (*) @{text "nullable d"}. We can then show |
922 $d \dn r\backslash s$. We have (*) @{text "nullable d"}. We can then show |
923 by Lemma~\ref{bnullable}\textit{(3)} that |
923 by Lemma~\ref{bnullable}\textit{(3)} that |
924 % |
924 % |
925 \[ |
925 \[ |
926 \textit{decode}(\textit{bmkeps}\,r_d)\,r = |
926 \textit{decode}(\textit{bmkeps}\:r_d)\,r = |
927 \textit{decode}(\textit{retrieve}\,a\,(\textit{mkeps}\,d))\,r |
927 \textit{decode}(\textit{retrieve}\,r_d\,(\textit{mkeps}\,d))\,r |
928 \] |
928 \] |
929 |
929 |
930 \noindent |
930 \noindent |
931 where the right-hand side is equal to |
931 where the right-hand side is equal to |
932 $\textit{Some}\,(\textit{flex}\,r\,\textit{id}\,s\,(\textit{mkeps}\, |
932 $\textit{Some}\,(\textit{flex}\,r\,\textit{id}\,s\,(\textit{mkeps}\, |
1012 aggressive simplification rules described by Sulzmann and Lu. Recasting |
1012 aggressive simplification rules described by Sulzmann and Lu. Recasting |
1013 their definition with our syntax they define the step of removing |
1013 their definition with our syntax they define the step of removing |
1014 duplicates as |
1014 duplicates as |
1015 % |
1015 % |
1016 \[ @{text "bsimp (ALTs bs rs)"} \dn @{text "ALTs |
1016 \[ @{text "bsimp (ALTs bs rs)"} \dn @{text "ALTs |
1017 bs (nup (map bsimp rs))"} |
1017 bs (nub (map bsimp rs))"} |
1018 \] |
1018 \] |
1019 |
1019 |
1020 \noindent where they first recursively simplify the regular |
1020 \noindent where they first recursively simplify the regular |
1021 expressions in @{text rs} (using @{text map}) and then use |
1021 expressions in @{text rs} (using @{text map}) and then use |
1022 Haskell's @{text nub}-function to remove potential |
1022 Haskell's @{text nub}-function to remove potential |
1023 duplicates. While this makes sense when considering the example |
1023 duplicates. While this makes sense when considering the example |
1024 shown in \eqref{derivex}, @{text nub} is the inappropriate |
1024 shown in \eqref{derivex}, @{text nub} is the inappropriate |
1025 function in the case of bitcoded regular expressions. The reason |
1025 function in the case of bitcoded regular expressions. The reason |
1026 is that in general the elements in @{text rs} will have a |
1026 is that in general the elements in @{text rs} will have a |
1027 different annotated bitsequence and in this way @{text nub} |
1027 different annotated bitsequence and in this way @{text nub} |
1028 will never find a duplicate to be removed. The correct way to |
1028 will never find a duplicate to be removed. One correct way to |
1029 handle this situation is to first \emph{erase} the regular |
1029 handle this situation is to first \emph{erase} the regular |
1030 expressions when comparing potential duplicates. This is inspired |
1030 expressions when comparing potential duplicates. This is inspired |
1031 by Scala's list functions of the form \mbox{@{text "distinctBy rs f |
1031 by Scala's list functions of the form \mbox{@{text "distinctBy rs f |
1032 acc"}} where a function is applied first before two elements |
1032 acc"}} where a function is applied first before two elements |
1033 are compared. We define this function in Isabelle/HOL as |
1033 are compared. We define this function in Isabelle/HOL as |
1045 expressions---essentially a set of regular expressions that we have already seen |
1045 expressions---essentially a set of regular expressions that we have already seen |
1046 while scanning the list. Therefore we delete an element, say @{text x}, |
1046 while scanning the list. Therefore we delete an element, say @{text x}, |
1047 from the list provided @{text "f x"} is already in the accumulator; |
1047 from the list provided @{text "f x"} is already in the accumulator; |
1048 otherwise we keep @{text x} and scan the rest of the list but |
1048 otherwise we keep @{text x} and scan the rest of the list but |
1049 add @{text "f x"} as another ``seen'' element to @{text acc}. We will use |
1049 add @{text "f x"} as another ``seen'' element to @{text acc}. We will use |
1050 @{term distinctBy} where @{text f} is the erase functions, @{term "erase (DUMMY)"}, |
1050 @{term distinctBy} where @{text f} is the erase function, @{term "erase (DUMMY)"}, |
1051 that deletes bitsequences from bitcoded regular expressions. |
1051 that deletes bitsequences from bitcoded regular expressions. |
1052 This is clearly a computationally more expensive operation, than @{text nub}, |
1052 This is clearly a computationally more expensive operation, than @{text nub}, |
1053 but is needed in order to make the removal of unnecessary copies |
1053 but is needed in order to make the removal of unnecessary copies |
1054 to work properly. |
1054 to work properly. |
1055 |
1055 |
1065 \end{tabular} |
1065 \end{tabular} |
1066 \end{center} |
1066 \end{center} |
1067 |
1067 |
1068 \noindent |
1068 \noindent |
1069 The second clause of @{text flts} removes all instances of @{text ZERO} in alternatives and |
1069 The second clause of @{text flts} removes all instances of @{text ZERO} in alternatives and |
1070 the second ``spills'' out nested alternatives (but retaining the |
1070 the third ``spills'' out nested alternatives (but retaining the |
1071 bitsequence @{text "bs'"} accumulated in the inner alternative). There are |
1071 bitsequence @{text "bs'"} accumulated in the inner alternative). There are |
1072 some corner cases to be considered when the resulting list inside an alternative is |
1072 some corner cases to be considered when the resulting list inside an alternative is |
1073 empty or a singleton list. We take care of those cases in the |
1073 empty or a singleton list. We take care of those cases in the |
1074 @{text "bsimpALTs"} function; similarly we define a helper function that simplifies |
1074 @{text "bsimpALTs"} function; similarly we define a helper function that simplifies |
1075 sequences according to the usual rules about @{text ZERO}s and @{text ONE}s: |
1075 sequences according to the usual rules about @{text ZERO}s and @{text ONE}s: |