63 "match r s \<equiv> nullable (ders s r)" |
63 "match r s \<equiv> nullable (ders s r)" |
64 |
64 |
65 (* |
65 (* |
66 comments not implemented |
66 comments not implemented |
67 |
67 |
68 p9. "None is returned, indicating an error is raised" : there is no error |
|
69 |
|
70 p9. The condtion "not exists s3 s4..." appears often enough (in particular in |
68 p9. The condtion "not exists s3 s4..." appears often enough (in particular in |
71 the proof of Lemma 3) to warrant a definition. |
69 the proof of Lemma 3) to warrant a definition. |
72 |
|
73 p9. "The POSIX value v" : at this point the existence of a POSIX value is yet |
|
74 to be shown |
|
75 |
70 |
76 p10. (proof Lemma 3) : separating the cases with description/itemize would greatly |
71 p10. (proof Lemma 3) : separating the cases with description/itemize would greatly |
77 improve readability |
72 improve readability |
78 |
73 |
79 p11. Theorem 2(2) : Stressing the uniqueness is strange, given that it follows |
74 p11. Theorem 2(2) : Stressing the uniqueness is strange, given that it follows |
80 trivially from the fact that "lexer" is a function. Maybe state the existence of |
75 trivially from the fact that "lexer" is a function. Maybe state the existence of |
81 a unique POSIX value as corollary. |
76 a unique POSIX value as corollary. |
82 |
|
83 p14. Proposition 3 is evidently false and Proposition 4 is a conjecture. |
|
84 I find it confusing that Propositions 1 and 2, which are proven facts, are also called propositions. |
|
85 |
77 |
86 *) |
78 *) |
87 |
79 |
88 (*>*) |
80 (*>*) |
89 |
81 |
600 & & $|$ @{term "Some v"} @{text "\<Rightarrow>"} @{term "Some (injval r c v)"} |
592 & & $|$ @{term "Some v"} @{text "\<Rightarrow>"} @{term "Some (injval r c v)"} |
601 \end{tabular} |
593 \end{tabular} |
602 \end{center} |
594 \end{center} |
603 |
595 |
604 \noindent If the regular expression does not match the string, @{const None} is |
596 \noindent If the regular expression does not match the string, @{const None} is |
605 returned, indicating an error is raised. If the regular expression \emph{does} |
597 returned. If the regular expression \emph{does} |
606 match the string, then @{const Some} value is returned. One important |
598 match the string, then @{const Some} value is returned. One important |
607 virtue of this algorithm is that it can be implemented with ease in any |
599 virtue of this algorithm is that it can be implemented with ease in any |
608 functional programming language and also in Isabelle/HOL. In the remaining |
600 functional programming language and also in Isabelle/HOL. In the remaining |
609 part of this section we prove that this algorithm is correct. |
601 part of this section we prove that this algorithm is correct. |
610 |
602 |
642 @{thm (prem 4) Posix.intros(6)[of "s\<^sub>1" "r" "v" "s\<^sub>2" "vs"]}} |
634 @{thm (prem 4) Posix.intros(6)[of "s\<^sub>1" "r" "v" "s\<^sub>2" "vs"]}} |
643 {@{thm (concl) Posix.intros(6)[of "s\<^sub>1" "r" "v" "s\<^sub>2" "vs"]}}$@{text "P\<star>"} |
635 {@{thm (concl) Posix.intros(6)[of "s\<^sub>1" "r" "v" "s\<^sub>2" "vs"]}}$@{text "P\<star>"} |
644 \end{tabular} |
636 \end{tabular} |
645 \end{center} |
637 \end{center} |
646 |
638 |
647 \noindent We claim that this relation captures the idea behind the two |
639 \noindent |
|
640 We can prove that given a string @{term s} and regular expression @{term |
|
641 r}, the POSIX value @{term v} is uniquely determined by @{term "s \<in> r \<rightarrow> v"}. |
|
642 |
|
643 \begin{theorem} |
|
644 @{thm[mode=IfThen] Posix_determ(1)[of _ _ "v\<^sub>1" "v\<^sub>2"]} |
|
645 \end{theorem} |
|
646 |
|
647 \begin{proof} By induction on the definition of @{term "s \<in> r \<rightarrow> v\<^sub>1"} and |
|
648 a case analysis of @{term "s \<in> r \<rightarrow> v\<^sub>2"}. This proof requires the |
|
649 auxiliary lemma that @{thm (prem 1) Posix1(1)} implies @{thm (concl) |
|
650 Posix1(1)} and @{thm (concl) Posix1(2)}, which are both easily |
|
651 established by inductions.\qed \end{proof} |
|
652 |
|
653 |
|
654 We claim that our @{term "s \<in> r \<rightarrow> v"} relation captures the idea behind the two |
648 informal POSIX rules shown in the Introduction: Consider for example the |
655 informal POSIX rules shown in the Introduction: Consider for example the |
649 rules @{text "P+L"} and @{text "P+R"} where the POSIX value for a string |
656 rules @{text "P+L"} and @{text "P+R"} where the POSIX value for a string |
650 and an alternative regular expression, that is @{term "(s, ALT r\<^sub>1 r\<^sub>2)"}, |
657 and an alternative regular expression, that is @{term "(s, ALT r\<^sub>1 r\<^sub>2)"}, |
651 is specified---it is always a @{text "Left"}-value, \emph{except} when the |
658 is specified---it is always a @{text "Left"}-value, \emph{except} when the |
652 string to be matched is not in the language of @{term "r\<^sub>1"}; only then it |
659 string to be matched is not in the language of @{term "r\<^sub>1"}; only then it |
675 @{term v} cannot be flattened to the empty string. In effect, we require |
682 @{term v} cannot be flattened to the empty string. In effect, we require |
676 that in each ``iteration'' of the star, some non-empty substring needs to |
683 that in each ``iteration'' of the star, some non-empty substring needs to |
677 be ``chipped'' away; only in case of the empty string we accept @{term |
684 be ``chipped'' away; only in case of the empty string we accept @{term |
678 "Stars []"} as the POSIX value. |
685 "Stars []"} as the POSIX value. |
679 |
686 |
680 We can prove that given a string @{term s} and regular expression @{term |
|
681 r}, the POSIX value @{term v} is uniquely determined by @{term "s \<in> r \<rightarrow> |
|
682 v"}. |
|
683 |
|
684 \begin{theorem} |
|
685 @{thm[mode=IfThen] Posix_determ(1)[of _ _ "v\<^sub>1" "v\<^sub>2"]} |
|
686 \end{theorem} |
|
687 |
|
688 \begin{proof} By induction on the definition of @{term "s \<in> r \<rightarrow> v\<^sub>1"} and |
|
689 a case analysis of @{term "s \<in> r \<rightarrow> v\<^sub>2"}. This proof requires the |
|
690 auxiliary lemma that @{thm (prem 1) Posix1(1)} implies @{thm (concl) |
|
691 Posix1(1)} and @{thm (concl) Posix1(2)}, which are both easily |
|
692 established by inductions.\qed \end{proof} |
|
693 |
|
694 \noindent |
|
695 Next is the lemma that shows the function @{term "mkeps"} calculates |
687 Next is the lemma that shows the function @{term "mkeps"} calculates |
696 the POSIX value for the empty string and a nullable regular expression. |
688 the POSIX value for the empty string and a nullable regular expression. |
697 |
689 |
698 \begin{lemma}\label{lemmkeps} |
690 \begin{lemma}\label{lemmkeps} |
699 @{thm[mode=IfThen] Posix_mkeps} |
691 @{thm[mode=IfThen] Posix_mkeps} |
992 v\<^sub>2) \<and> (flat v\<^sub>1 = flat (v\<^sub>2::val))"}. This means that @{term "v\<^sub>1 |
984 v\<^sub>2) \<and> (flat v\<^sub>1 = flat (v\<^sub>2::val))"}. This means that @{term "v\<^sub>1 |
993 >(r::rexp) (v\<^sub>2::val)"} does not necessarily imply @{term "v\<^sub>1 \<ge>(r::rexp) |
985 >(r::rexp) (v\<^sub>2::val)"} does not necessarily imply @{term "v\<^sub>1 \<ge>(r::rexp) |
994 (v\<^sub>2::val)"}. Moreover, transitivity does not hold in the ``usual'' |
986 (v\<^sub>2::val)"}. Moreover, transitivity does not hold in the ``usual'' |
995 formulation, for example: |
987 formulation, for example: |
996 |
988 |
997 \begin{proposition} |
989 \begin{falsehood} |
998 Suppose @{term "\<turnstile> v\<^sub>1 : r"}, @{term "\<turnstile> v\<^sub>2 : r"} and @{term "\<turnstile> v\<^sub>3 : r"}. |
990 Suppose @{term "\<turnstile> v\<^sub>1 : r"}, @{term "\<turnstile> v\<^sub>2 : r"} and @{term "\<turnstile> v\<^sub>3 : r"}. |
999 If @{term "v\<^sub>1 >(r::rexp) (v\<^sub>2::val)"} and @{term "v\<^sub>2 >(r::rexp) (v\<^sub>3::val)"} |
991 If @{term "v\<^sub>1 >(r::rexp) (v\<^sub>2::val)"} and @{term "v\<^sub>2 >(r::rexp) (v\<^sub>3::val)"} |
1000 then @{term "v\<^sub>1 >(r::rexp) (v\<^sub>3::val)"}. |
992 then @{term "v\<^sub>1 >(r::rexp) (v\<^sub>3::val)"}. |
1001 \end{proposition} |
993 \end{falsehood} |
1002 |
994 |
1003 \noindent If formulated in this way, then there are various counter |
995 \noindent If formulated in this way, then there are various counter |
1004 examples: For example let @{term r} be @{text "a + ((a + a)\<cdot>(a + \<zero>))"} |
996 examples: For example let @{term r} be @{text "a + ((a + a)\<cdot>(a + \<zero>))"} |
1005 then the @{term "v\<^sub>1"}, @{term "v\<^sub>2"} and @{term "v\<^sub>3"} below are values |
997 then the @{term "v\<^sub>1"}, @{term "v\<^sub>2"} and @{term "v\<^sub>3"} below are values |
1006 of @{term r}: |
998 of @{term r}: |
1025 property. They require in addition that the underlying strings are of the |
1017 property. They require in addition that the underlying strings are of the |
1026 same length. This excludes the counter example above and any |
1018 same length. This excludes the counter example above and any |
1027 counter-example we were able to find (by hand and by machine). Thus the |
1019 counter-example we were able to find (by hand and by machine). Thus the |
1028 transitivity lemma should be formulated as: |
1020 transitivity lemma should be formulated as: |
1029 |
1021 |
1030 \begin{proposition} |
1022 \begin{conject} |
1031 Suppose @{term "\<turnstile> v\<^sub>1 : r"}, @{term "\<turnstile> v\<^sub>2 : r"} and @{term "\<turnstile> v\<^sub>3 : r"}, |
1023 Suppose @{term "\<turnstile> v\<^sub>1 : r"}, @{term "\<turnstile> v\<^sub>2 : r"} and @{term "\<turnstile> v\<^sub>3 : r"}, |
1032 and also @{text "|v\<^sub>1| = |v\<^sub>2| = |v\<^sub>3|"}.\\ |
1024 and also @{text "|v\<^sub>1| = |v\<^sub>2| = |v\<^sub>3|"}.\\ |
1033 If @{term "v\<^sub>1 >(r::rexp) (v\<^sub>2::val)"} and @{term "v\<^sub>2 >(r::rexp) (v\<^sub>3::val)"} |
1025 If @{term "v\<^sub>1 >(r::rexp) (v\<^sub>2::val)"} and @{term "v\<^sub>2 >(r::rexp) (v\<^sub>3::val)"} |
1034 then @{term "v\<^sub>1 >(r::rexp) (v\<^sub>3::val)"}. |
1026 then @{term "v\<^sub>1 >(r::rexp) (v\<^sub>3::val)"}. |
1035 \end{proposition} |
1027 \end{conject} |
1036 |
1028 |
1037 \noindent While we agree with Sulzmann and Lu that this property |
1029 \noindent While we agree with Sulzmann and Lu that this property |
1038 probably(!) holds, proving it seems not so straightforward: although one |
1030 probably(!) holds, proving it seems not so straightforward: although one |
1039 begins with the assumption that the values have the same flattening, this |
1031 begins with the assumption that the values have the same flattening, this |
1040 cannot be maintained as one descends into the induction. This is a problem |
1032 cannot be maintained as one descends into the induction. This is a problem |