Paper/Paper.thy
changeset 121 1cf12a107b03
parent 120 c1f596c7f59e
child 122 ab6637008963
equal deleted inserted replaced
120:c1f596c7f59e 121:1cf12a107b03
    34   Setalt ("\<^raw:\ensuremath{\bigplus}>_" [1000] 999) and
    34   Setalt ("\<^raw:\ensuremath{\bigplus}>_" [1000] 999) and
    35   append_rexp2 ("_ \<^raw:\ensuremath{\triangleleft}> _" [100, 100] 100) and
    35   append_rexp2 ("_ \<^raw:\ensuremath{\triangleleft}> _" [100, 100] 100) and
    36   append_rhs_rexp ("_ \<^raw:\ensuremath{\triangleleft}> _" [100, 100] 50) and
    36   append_rhs_rexp ("_ \<^raw:\ensuremath{\triangleleft}> _" [100, 100] 50) and
    37   uminus ("\<^raw:\ensuremath{\overline{>_\<^raw:}}>" [100] 100) and
    37   uminus ("\<^raw:\ensuremath{\overline{>_\<^raw:}}>" [100] 100) and
    38   tag_str_ALT ("tag\<^isub>A\<^isub>L\<^isub>T _ _" [100, 100] 100) and
    38   tag_str_ALT ("tag\<^isub>A\<^isub>L\<^isub>T _ _" [100, 100] 100) and
    39   tag_str_ALT ("tag\<^isub>A\<^isub>L\<^isub>T _ _ _" [100, 100, 100] 100)
    39   tag_str_ALT ("tag\<^isub>A\<^isub>L\<^isub>T _ _ _" [100, 100, 100] 100) and
    40 
    40   tag_str_SEQ ("tag\<^isub>S\<^isub>E\<^isub>Q _ _" [100, 100] 100) and
       
    41   tag_str_SEQ ("tag\<^isub>S\<^isub>E\<^isub>Q _ _ _" [100, 100, 100] 100) and
       
    42   tag_str_STAR ("tag\<^isub>S\<^isub>T\<^isub>A\<^isub>R _" [100] 100) and
       
    43   tag_str_STAR ("tag\<^isub>S\<^isub>T\<^isub>A\<^isub>R _ _" [100, 100] 100)
    41 lemma meta_eq_app:
    44 lemma meta_eq_app:
    42   shows "f \<equiv> \<lambda>x. g x \<Longrightarrow> f x \<equiv> g x"
    45   shows "f \<equiv> \<lambda>x. g x \<Longrightarrow> f x \<equiv> g x"
    43   by auto
    46   by auto
    44 
    47 
    45 (*>*)
    48 (*>*)
   263   \noindent
   266   \noindent
   264   In @{text "(ii)"} we use the notation @{term "length s"} for the length of a
   267   In @{text "(ii)"} we use the notation @{term "length s"} for the length of a
   265   string; this property states that if @{term "[] \<notin> A"} then the lengths of
   268   string; this property states that if @{term "[] \<notin> A"} then the lengths of
   266   the strings in @{term "A \<up> (Suc n)"} must be longer than @{text n}.  We omit
   269   the strings in @{term "A \<up> (Suc n)"} must be longer than @{text n}.  We omit
   267   the proofs for these properties, but invite the reader to consult our
   270   the proofs for these properties, but invite the reader to consult our
   268   formalisation.\footnote{Available at ???}
   271   formalisation.\footnote{Available at \url{http://www4.in.tum.de/~urbanc/cgi-bin/repos.cgi/regexp/}}
   269 
   272 
   270   The notation in Isabelle/HOL for the quotient of a language @{text A} according to an 
   273   The notation in Isabelle/HOL for the quotient of a language @{text A} according to an 
   271   equivalence relation @{term REL} is @{term "A // REL"}. We will write 
   274   equivalence relation @{term REL} is @{term "A // REL"}. We will write 
   272   @{text "\<lbrakk>x\<rbrakk>\<^isub>\<approx>"} for the equivalence class defined 
   275   @{text "\<lbrakk>x\<rbrakk>\<^isub>\<approx>"} for the equivalence class defined 
   273   as @{text "{y | y \<approx> x}"}.
   276   as @{text "{y | y \<approx> x}"}.
   997   in which set, @{text A} or @{text B}, the string @{term "x @ z"} is. Fianlly we
  1000   in which set, @{text A} or @{text B}, the string @{term "x @ z"} is. Fianlly we
   998   can discharge this case by setting @{text A} to @{term "L r\<^isub>1"} and @{text B} to @{term "L r\<^isub>2"}.\qed
  1001   can discharge this case by setting @{text A} to @{term "L r\<^isub>1"} and @{text B} to @{term "L r\<^isub>2"}.\qed
   999   \end{proof}
  1002   \end{proof}
  1000 
  1003 
  1001 
  1004 
  1002   @{thm tag_str_SEQ_def[where ?L1.0="A" and ?L2.0="B"]}
  1005   \noindent
  1003  
  1006   The pattern in \eqref{pattern} is repeated for the other two cases. Unfortunately,
  1004   @{thm tag_str_STAR_def[where ?L1.0="A"]}
  1007   they are slightly more complicated. 
       
  1008 
       
  1009 
       
  1010   \begin{proof}[@{const SEQ}-Case]
       
  1011 
       
  1012   \begin{center}
       
  1013   @{thm tag_str_SEQ_def[where ?L1.0="A" and ?L2.0="B", THEN meta_eq_app]}
       
  1014   \end{center}
       
  1015   \end{proof}
       
  1016 
       
  1017   \begin{proof}[@{const STAR}-Case]
       
  1018 
       
  1019   \begin{center}
       
  1020   @{thm tag_str_STAR_def[where ?L1.0="A", THEN meta_eq_app]}
       
  1021   \end{center}
       
  1022   \end{proof}
  1005 *}
  1023 *}
  1006 
  1024 
  1007 
  1025 
  1008 
  1026 
  1009 section {* Conclusion and Related Work *}
  1027 section {* Conclusion and Related Work *}
  1044   expression, like there is for automata. For an implementation of a simple
  1062   expression, like there is for automata. For an implementation of a simple
  1045   regular expression matcher, whose correctness has been formally
  1063   regular expression matcher, whose correctness has been formally
  1046   established, we refer the reader to Owens and Slind \cite{OwensSlind08}.
  1064   established, we refer the reader to Owens and Slind \cite{OwensSlind08}.
  1047 
  1065 
  1048 
  1066 
  1049   Our formalisation consists of ??? lines of Isabelle/Isar code for the first
  1067   Our formalisation consists of 790 lines of Isabelle/Isar code for the first
  1050   direction and ??? for the second. While this might be seen as too large to
  1068   direction and 475 for the second, plus standard material about regular languages. 
       
  1069   While this might be seen as too large to
  1051   count as a concise proof pearl, this should be seen in the context of the
  1070   count as a concise proof pearl, this should be seen in the context of the
  1052   work done by Constable at al \cite{Constable00} who formalised the
  1071   work done by Constable at al \cite{Constable00} who formalised the
  1053   Myhill-Nerode theorem in Nuprl using automata. 
  1072   Myhill-Nerode theorem in Nuprl using automata. 
  1054   They write that their
  1073   They write that their
  1055   four-member team needed something on the magnitute of 18 months for their
  1074   four-member team needed something on the magnitute of 18 months for their
  1058   arguments. Unlike Constable et al, who were able to follow the proofs from
  1077   arguments. Unlike Constable et al, who were able to follow the proofs from
  1059   \cite{HopcroftUllman69}, we had to find our own arguments.  So for us the formalisation
  1078   \cite{HopcroftUllman69}, we had to find our own arguments.  So for us the formalisation
  1060   was not the bottleneck. It is hard to gauge the size of a formalisation in
  1079   was not the bottleneck. It is hard to gauge the size of a formalisation in
  1061   Nurpl, but from what is shown in the Nuprl Math Library about their
  1080   Nurpl, but from what is shown in the Nuprl Math Library about their
  1062   development it seems substantially larger than ours. The code of
  1081   development it seems substantially larger than ours. The code of
  1063   ours can be found under
  1082   ours can be found in the Mercurial Repository at
  1064 
  1083 
  1065   \begin{center}
  1084   \begin{center}
  1066   ???
  1085   \url{http://www4.in.tum.de/~urbanc/cgi-bin/repos.cgi/regexp/}
  1067   \end{center}
  1086   \end{center}
  1068 
  1087 
  1069 
  1088 
  1070   Our proof of the first direction is very much inspired by \emph{Brzozowski's
  1089   Our proof of the first direction is very much inspired by \emph{Brzozowski's
  1071   algebraic mehod} used to convert a finite automaton to a regular
  1090   algebraic mehod} used to convert a finite automaton to a regular
  1093   calculate the derivatives and then to count them. Therefore we preferred our
  1112   calculate the derivatives and then to count them. Therefore we preferred our
  1094   direct method of using tagging-functions involving equivalence classes. This
  1113   direct method of using tagging-functions involving equivalence classes. This
  1095   is also where our method shines, because we can completely side-step the
  1114   is also where our method shines, because we can completely side-step the
  1096   standard argument \cite{Kozen97} where automata need to be composed, which
  1115   standard argument \cite{Kozen97} where automata need to be composed, which
  1097   as stated in the Introduction is not so convenient to formalise in a 
  1116   as stated in the Introduction is not so convenient to formalise in a 
  1098   HOL-based theorem prover.
  1117   HOL-based theorem prover. However, it is also the direction where we had to 
       
  1118   spend most of the `conceptual' time, as our proof-argument based on tagging functions
       
  1119   is new for establishing the Myhill-Nerode theorem.
  1099 
  1120 
  1100 *}
  1121 *}
  1101 
  1122 
  1102 
  1123 
  1103 (*<*)
  1124 (*<*)