thys/Paper/Paper.thy
changeset 174 4e3778f4a802
parent 173 80fe81a28a52
child 175 fc22ca36325c
equal deleted inserted replaced
173:80fe81a28a52 174:4e3778f4a802
    63   "match r s \<equiv> nullable (ders s r)"
    63   "match r s \<equiv> nullable (ders s r)"
    64 
    64 
    65 (*
    65 (*
    66 comments not implemented
    66 comments not implemented
    67 
    67 
    68 p9. "None is returned, indicating an error is raised" : there is no error
       
    69 
       
    70 p9. The condtion "not exists s3 s4..." appears often enough (in particular in
    68 p9. The condtion "not exists s3 s4..." appears often enough (in particular in
    71 the proof of Lemma 3) to warrant a definition.
    69 the proof of Lemma 3) to warrant a definition.
    72 
       
    73 p9. "The POSIX value v" : at this point the existence of a POSIX value is yet
       
    74 to be shown
       
    75 
    70 
    76 p10. (proof Lemma 3) : separating the cases with description/itemize would greatly
    71 p10. (proof Lemma 3) : separating the cases with description/itemize would greatly
    77 improve readability
    72 improve readability
    78 
    73 
    79 p11. Theorem 2(2) : Stressing the uniqueness is strange, given that it follows
    74 p11. Theorem 2(2) : Stressing the uniqueness is strange, given that it follows
    80 trivially from the fact that "lexer" is a function. Maybe state the existence of
    75 trivially from the fact that "lexer" is a function. Maybe state the existence of
    81 a unique POSIX value as corollary.
    76 a unique POSIX value as corollary.
    82 
       
    83 p14. Proposition 3 is evidently false and Proposition 4 is a conjecture.
       
    84 I find it confusing that Propositions 1 and 2, which are proven facts, are also called propositions.
       
    85 
    77 
    86 *)
    78 *)
    87 
    79 
    88 (*>*)
    80 (*>*)
    89 
    81 
   600                      & & $|$ @{term "Some v"} @{text "\<Rightarrow>"} @{term "Some (injval r c v)"}                          
   592                      & & $|$ @{term "Some v"} @{text "\<Rightarrow>"} @{term "Some (injval r c v)"}                          
   601   \end{tabular}
   593   \end{tabular}
   602   \end{center}
   594   \end{center}
   603 
   595 
   604   \noindent If the regular expression does not match the string, @{const None} is
   596   \noindent If the regular expression does not match the string, @{const None} is
   605   returned, indicating an error is raised. If the regular expression \emph{does}
   597   returned. If the regular expression \emph{does}
   606   match the string, then @{const Some} value is returned. One important
   598   match the string, then @{const Some} value is returned. One important
   607   virtue of this algorithm is that it can be implemented with ease in any
   599   virtue of this algorithm is that it can be implemented with ease in any
   608   functional programming language and also in Isabelle/HOL. In the remaining
   600   functional programming language and also in Isabelle/HOL. In the remaining
   609   part of this section we prove that this algorithm is correct.
   601   part of this section we prove that this algorithm is correct.
   610 
   602 
   642     @{thm (prem 4) Posix.intros(6)[of "s\<^sub>1" "r" "v" "s\<^sub>2" "vs"]}}
   634     @{thm (prem 4) Posix.intros(6)[of "s\<^sub>1" "r" "v" "s\<^sub>2" "vs"]}}
   643    {@{thm (concl) Posix.intros(6)[of "s\<^sub>1" "r" "v" "s\<^sub>2" "vs"]}}$@{text "P\<star>"}
   635    {@{thm (concl) Posix.intros(6)[of "s\<^sub>1" "r" "v" "s\<^sub>2" "vs"]}}$@{text "P\<star>"}
   644   \end{tabular}
   636   \end{tabular}
   645   \end{center}
   637   \end{center}
   646 
   638 
   647   \noindent We claim that this relation captures the idea behind the two
   639    \noindent
       
   640    We can prove that given a string @{term s} and regular expression @{term
       
   641    r}, the POSIX value @{term v} is uniquely determined by @{term "s \<in> r \<rightarrow> v"}.
       
   642 
       
   643   \begin{theorem}
       
   644   @{thm[mode=IfThen] Posix_determ(1)[of _ _ "v\<^sub>1" "v\<^sub>2"]}
       
   645   \end{theorem}
       
   646 
       
   647   \begin{proof} By induction on the definition of @{term "s \<in> r \<rightarrow> v\<^sub>1"} and
       
   648   a case analysis of @{term "s \<in> r \<rightarrow> v\<^sub>2"}. This proof requires the
       
   649   auxiliary lemma that @{thm (prem 1) Posix1(1)} implies @{thm (concl)
       
   650   Posix1(1)} and @{thm (concl) Posix1(2)}, which are both easily
       
   651   established by inductions.\qed \end{proof}
       
   652 
       
   653 
       
   654   We claim that our @{term "s \<in> r \<rightarrow> v"} relation captures the idea behind the two
   648   informal POSIX rules shown in the Introduction: Consider for example the
   655   informal POSIX rules shown in the Introduction: Consider for example the
   649   rules @{text "P+L"} and @{text "P+R"} where the POSIX value for a string
   656   rules @{text "P+L"} and @{text "P+R"} where the POSIX value for a string
   650   and an alternative regular expression, that is @{term "(s, ALT r\<^sub>1 r\<^sub>2)"},
   657   and an alternative regular expression, that is @{term "(s, ALT r\<^sub>1 r\<^sub>2)"},
   651   is specified---it is always a @{text "Left"}-value, \emph{except} when the
   658   is specified---it is always a @{text "Left"}-value, \emph{except} when the
   652   string to be matched is not in the language of @{term "r\<^sub>1"}; only then it
   659   string to be matched is not in the language of @{term "r\<^sub>1"}; only then it
   675   @{term v} cannot be flattened to the empty string. In effect, we require
   682   @{term v} cannot be flattened to the empty string. In effect, we require
   676   that in each ``iteration'' of the star, some non-empty substring needs to
   683   that in each ``iteration'' of the star, some non-empty substring needs to
   677   be ``chipped'' away; only in case of the empty string we accept @{term
   684   be ``chipped'' away; only in case of the empty string we accept @{term
   678   "Stars []"} as the POSIX value.
   685   "Stars []"} as the POSIX value.
   679 
   686 
   680    We can prove that given a string @{term s} and regular expression @{term
       
   681    r}, the POSIX value @{term v} is uniquely determined by @{term "s \<in> r \<rightarrow>
       
   682    v"}.
       
   683 
       
   684   \begin{theorem}
       
   685   @{thm[mode=IfThen] Posix_determ(1)[of _ _ "v\<^sub>1" "v\<^sub>2"]}
       
   686   \end{theorem}
       
   687 
       
   688   \begin{proof} By induction on the definition of @{term "s \<in> r \<rightarrow> v\<^sub>1"} and
       
   689   a case analysis of @{term "s \<in> r \<rightarrow> v\<^sub>2"}. This proof requires the
       
   690   auxiliary lemma that @{thm (prem 1) Posix1(1)} implies @{thm (concl)
       
   691   Posix1(1)} and @{thm (concl) Posix1(2)}, which are both easily
       
   692   established by inductions.\qed \end{proof}
       
   693 
       
   694   \noindent
       
   695   Next is the lemma that shows the function @{term "mkeps"} calculates
   687   Next is the lemma that shows the function @{term "mkeps"} calculates
   696   the POSIX value for the empty string and a nullable regular expression.
   688   the POSIX value for the empty string and a nullable regular expression.
   697 
   689 
   698   \begin{lemma}\label{lemmkeps}
   690   \begin{lemma}\label{lemmkeps}
   699   @{thm[mode=IfThen] Posix_mkeps}
   691   @{thm[mode=IfThen] Posix_mkeps}
   992   v\<^sub>2) \<and> (flat v\<^sub>1 = flat (v\<^sub>2::val))"}. This means that @{term "v\<^sub>1
   984   v\<^sub>2) \<and> (flat v\<^sub>1 = flat (v\<^sub>2::val))"}. This means that @{term "v\<^sub>1
   993   >(r::rexp) (v\<^sub>2::val)"} does not necessarily imply @{term "v\<^sub>1 \<ge>(r::rexp)
   985   >(r::rexp) (v\<^sub>2::val)"} does not necessarily imply @{term "v\<^sub>1 \<ge>(r::rexp)
   994   (v\<^sub>2::val)"}. Moreover, transitivity does not hold in the ``usual''
   986   (v\<^sub>2::val)"}. Moreover, transitivity does not hold in the ``usual''
   995   formulation, for example:
   987   formulation, for example:
   996 
   988 
   997   \begin{proposition}
   989   \begin{falsehood}
   998   Suppose @{term "\<turnstile> v\<^sub>1 : r"}, @{term "\<turnstile> v\<^sub>2 : r"} and @{term "\<turnstile> v\<^sub>3 : r"}.
   990   Suppose @{term "\<turnstile> v\<^sub>1 : r"}, @{term "\<turnstile> v\<^sub>2 : r"} and @{term "\<turnstile> v\<^sub>3 : r"}.
   999   If @{term "v\<^sub>1 >(r::rexp) (v\<^sub>2::val)"} and @{term "v\<^sub>2 >(r::rexp) (v\<^sub>3::val)"}
   991   If @{term "v\<^sub>1 >(r::rexp) (v\<^sub>2::val)"} and @{term "v\<^sub>2 >(r::rexp) (v\<^sub>3::val)"}
  1000   then @{term "v\<^sub>1 >(r::rexp) (v\<^sub>3::val)"}.
   992   then @{term "v\<^sub>1 >(r::rexp) (v\<^sub>3::val)"}.
  1001   \end{proposition}
   993   \end{falsehood}
  1002 
   994 
  1003   \noindent If formulated in this way, then there are various counter
   995   \noindent If formulated in this way, then there are various counter
  1004   examples: For example let @{term r} be @{text "a + ((a + a)\<cdot>(a + \<zero>))"}
   996   examples: For example let @{term r} be @{text "a + ((a + a)\<cdot>(a + \<zero>))"}
  1005   then the @{term "v\<^sub>1"}, @{term "v\<^sub>2"} and @{term "v\<^sub>3"} below are values
   997   then the @{term "v\<^sub>1"}, @{term "v\<^sub>2"} and @{term "v\<^sub>3"} below are values
  1006   of @{term r}:
   998   of @{term r}:
  1025   property. They require in addition that the underlying strings are of the
  1017   property. They require in addition that the underlying strings are of the
  1026   same length. This excludes the counter example above and any
  1018   same length. This excludes the counter example above and any
  1027   counter-example we were able to find (by hand and by machine). Thus the
  1019   counter-example we were able to find (by hand and by machine). Thus the
  1028   transitivity lemma should be formulated as:
  1020   transitivity lemma should be formulated as:
  1029 
  1021 
  1030   \begin{proposition}
  1022   \begin{conject}
  1031   Suppose @{term "\<turnstile> v\<^sub>1 : r"}, @{term "\<turnstile> v\<^sub>2 : r"} and @{term "\<turnstile> v\<^sub>3 : r"},
  1023   Suppose @{term "\<turnstile> v\<^sub>1 : r"}, @{term "\<turnstile> v\<^sub>2 : r"} and @{term "\<turnstile> v\<^sub>3 : r"},
  1032   and also @{text "|v\<^sub>1| = |v\<^sub>2| = |v\<^sub>3|"}.\\
  1024   and also @{text "|v\<^sub>1| = |v\<^sub>2| = |v\<^sub>3|"}.\\
  1033   If @{term "v\<^sub>1 >(r::rexp) (v\<^sub>2::val)"} and @{term "v\<^sub>2 >(r::rexp) (v\<^sub>3::val)"}
  1025   If @{term "v\<^sub>1 >(r::rexp) (v\<^sub>2::val)"} and @{term "v\<^sub>2 >(r::rexp) (v\<^sub>3::val)"}
  1034   then @{term "v\<^sub>1 >(r::rexp) (v\<^sub>3::val)"}.
  1026   then @{term "v\<^sub>1 >(r::rexp) (v\<^sub>3::val)"}.
  1035   \end{proposition}
  1027   \end{conject}
  1036 
  1028 
  1037   \noindent While we agree with Sulzmann and Lu that this property
  1029   \noindent While we agree with Sulzmann and Lu that this property
  1038   probably(!) holds, proving it seems not so straightforward: although one
  1030   probably(!) holds, proving it seems not so straightforward: although one
  1039   begins with the assumption that the values have the same flattening, this
  1031   begins with the assumption that the values have the same flattening, this
  1040   cannot be maintained as one descends into the induction. This is a problem
  1032   cannot be maintained as one descends into the induction. This is a problem