Paper/Paper.thy
changeset 114 c5eb5f3065ae
parent 113 ec774952190c
child 115 c5f138b5fc88
equal deleted inserted replaced
113:ec774952190c 114:c5eb5f3065ae
   181 
   181 
   182   Because of these problems to do with representing automata, there seems
   182   Because of these problems to do with representing automata, there seems
   183   to be no substantial formalisation of automata theory and regular languages 
   183   to be no substantial formalisation of automata theory and regular languages 
   184   carried out in HOL-based theorem provers. Nipkow establishes in 
   184   carried out in HOL-based theorem provers. Nipkow establishes in 
   185   \cite{Nipkow98} the link between regular expressions and automata in
   185   \cite{Nipkow98} the link between regular expressions and automata in
   186   the context of lexing. The only larger formalisations of automata theory 
   186   the context of lexing. Berghofer and Reiter formalise automata working over 
       
   187   bit strings in the context of Presburger arithmetic \cite{BerghoferReiter09}.
       
   188   The only larger formalisations of automata theory 
   187   are carried out in Nuprl \cite{Constable00} and in Coq (for example 
   189   are carried out in Nuprl \cite{Constable00} and in Coq (for example 
   188   \cite{Filliatre97}).
   190   \cite{Filliatre97}).
   189   
   191   
   190   In this paper, we will not attempt to formalise automata theory in
   192   In this paper, we will not attempt to formalise automata theory in
   191   Isabelle/HOL, but take a completely different approach to regular
   193   Isabelle/HOL, but take a completely different approach to regular
   844 
   846 
   845 
   847 
   846 section {* Myhill-Nerode, Second Part *}
   848 section {* Myhill-Nerode, Second Part *}
   847 
   849 
   848 text {*
   850 text {*
   849   TO BE DONE
       
   850 
       
   851 
   851 
   852   \begin{theorem}
   852   \begin{theorem}
   853   Given @{text "r"} is a regular expressions, then @{thm Myhill_Nerode2}.
   853   Given @{text "r"} is a regular expressions, then @{thm Myhill_Nerode2}.
   854   \end{theorem}  
   854   \end{theorem}  
   855 
   855 
   856 %  \begin{proof}
   856   \begin{proof}
   857 %  By induction on the structure of @{text r}. The cases for @{const NULL}, @{const EMPTY}
   857   By induction on the structure of @{text r}. The cases for @{const NULL}, @{const EMPTY}
   858 %  and @{const CHAR} are straightforward, because we can easily establish
   858   and @{const CHAR} are straightforward, because we can easily establish
   859 
   859 
   860 %  \begin{center}
   860   \begin{center}
   861 %  \begin{tabular}{l}
   861   \begin{tabular}{l}
   862 %  @{thm quot_null_eq}\\
   862   @{thm quot_null_eq}\\
   863 %  @{thm quot_empty_subset}\\
   863   @{thm quot_empty_subset}\\
   864 %  @{thm quot_char_subset}
   864   @{thm quot_char_subset}
   865 %  \end{tabular}
   865   \end{tabular}
   866 %  \end{center}
   866   \end{center}
   867 %
   867 
   868 %  \end{proof}
   868   \end{proof}
   869 
   869 
   870 
   870 
   871 %  @{thm tag_str_ALT_def[where ?L1.0="A" and ?L2.0="B"]}
   871   @{thm tag_str_ALT_def[where ?L1.0="A" and ?L2.0="B"]}
   872 
   872 
   873 %  @{thm tag_str_SEQ_def[where ?L1.0="A" and ?L2.0="B"]}
   873   @{thm tag_str_SEQ_def[where ?L1.0="A" and ?L2.0="B"]}
   874  
   874  
   875 %  @{thm tag_str_STAR_def[where ?L1.0="A"]}
   875   @{thm tag_str_STAR_def[where ?L1.0="A"]}
   876 *}
   876 *}
   877 
   877 
   878 
   878 
   879 section {* Conclusion and Related Work *}
   879 section {* Conclusion and Related Work *}
   880 
   880 
   900   @{term "s\<^isub>1 \<approx>A s\<^isub>2"} if and only if @{term "s\<^isub>1 \<approx>(-A) s\<^isub>2"}
   900   @{term "s\<^isub>1 \<approx>A s\<^isub>2"} if and only if @{term "s\<^isub>1 \<approx>(-A) s\<^isub>2"}
   901   \end{center}
   901   \end{center}
   902  
   902  
   903   \noindent
   903   \noindent
   904   holds for any strings @{text "s\<^isub>1"} and @{text
   904   holds for any strings @{text "s\<^isub>1"} and @{text
   905   "s\<^isub>2"}. Therefore @{text A} and @{term "-A"} give rise to the same
   905   "s\<^isub>2"}. Therefore @{text A} and the complement language @{term "-A"} give rise to the same
   906   partitions. 
   906   partitions.  Proving the existence of such a regular expression via automata would 
   907   Proving the same result via automata would be quite involved. It includes the
   907   be quite involved. It includes the
   908   steps: regular expression @{text "\<Rightarrow>"} non-deterministic automaton @{text
   908   steps: regular expression @{text "\<Rightarrow>"} non-deterministic automaton @{text
   909   "\<Rightarrow>"} deterministic automaton @{text "\<Rightarrow>"} complement automaton @{text "\<Rightarrow>"}
   909   "\<Rightarrow>"} deterministic automaton @{text "\<Rightarrow>"} complement automaton @{text "\<Rightarrow>"}
   910   regular expression.
   910   regular expression.
   911 
   911 
   912   Our formalisation consists of ??? lines of Isabelle/Isar code for the first
   912   Our formalisation consists of ??? lines of Isabelle/Isar code for the first
   913   direction and ??? for the second. While this might be seen as too large to
   913   direction and ??? for the second. While this might be seen as too large to
   914   count as a concise proof pearl, this should be seen in the context of the
   914   count as a concise proof pearl, this should be seen in the context of the
   915   work done by Constable at al \cite{Constable00} who formalised the
   915   work done by Constable at al \cite{Constable00} who formalised the
   916   Myhill-Nerode theorem in Nuprl using automata. This is the most 
   916   Myhill-Nerode theorem in Nuprl using automata. 
   917   They write that their
   917   They write that their
   918   four-member team needed something on the magnitute of 18 months to formalise
   918   four-member team needed something on the magnitute of 18 months for their
   919   the Myhill-Nerode theorem. The estimate for our formalisation is that we
   919   formalisation. The estimate for our formalisation is that we
   920   needed approximately 3 months and this included the time to find our proof
   920   needed approximately 3 months and this included the time to find our proof
   921   arguments. Unlike Constable et al, who were able to follow the proofs from
   921   arguments. Unlike Constable et al, who were able to follow the proofs from
   922   \cite{???}, we had to find our own arguments.  So for us the formalisation
   922   \cite{HopcroftUllman69}, we had to find our own arguments.  So for us the formalisation
   923   was not the bottleneck. It is hard to gauge the size of a formalisation in
   923   was not the bottleneck. It is hard to gauge the size of a formalisation in
   924   Nurpl, but from what is shown in the Nuprl Math Library about their
   924   Nurpl, but from what is shown in the Nuprl Math Library about their
   925   development it seems substantially larger than ours. The code of
   925   development it seems substantially larger than ours. The code of
   926   ours can be found under
   926   ours can be found under
   927 
   927 
   932 
   932 
   933   Our proof of the first direction is very much inspired by \emph{Brzozowski's
   933   Our proof of the first direction is very much inspired by \emph{Brzozowski's
   934   algebraic mehod} used to convert a finite automaton to a regular
   934   algebraic mehod} used to convert a finite automaton to a regular
   935   expression \cite{Brzozowski64}. The close connection can be seen by considering the equivalence
   935   expression \cite{Brzozowski64}. The close connection can be seen by considering the equivalence
   936   classes as the states of the minimal automaton for the regular language.
   936   classes as the states of the minimal automaton for the regular language.
   937   However there are some subtle differences. If we identify equivalence
   937   However there are some subtle differences. Since we identify equivalence
   938   classes with the states of the automaton, then the most natural choice is to
   938   classes with the states of the automaton, then the most natural choice is to
   939   characterise each state with the set of strings starting from the initial
   939   characterise each state with the set of strings starting from the initial
   940   state leading up to that state. Usually, however, the states are characterised as the
   940   state leading up to that state. Usually, however, the states are characterised as the
   941   ones starting from that state leading to the terminal states.  The first
   941   ones starting from that state leading to the terminal states.  The first
   942   choice has consequences how the initial equational system is set up. We have
   942   choice has consequences how the initial equational system is set up. We have
   946 
   946 
   947   We briefly considered using the method Brzozowski presented in the Appendix
   947   We briefly considered using the method Brzozowski presented in the Appendix
   948   of~\cite{Brzozowski64} in order to prove the second direction of the
   948   of~\cite{Brzozowski64} in order to prove the second direction of the
   949   Myhill-Nerode theorem. There he calculates the derivatives for regular
   949   Myhill-Nerode theorem. There he calculates the derivatives for regular
   950   expressions and shows that there can be only finitely many of them. We could
   950   expressions and shows that there can be only finitely many of them. We could
   951   use as the tag of a string @{text s} the derivative of a regular expression
   951   have used as the tag of a string @{text s} the derivative of a regular expression
   952   generated with respect to @{text s}.  Using the fact that two strings are
   952   generated with respect to @{text s}.  Using the fact that two strings are
   953   Myhill-Nerode related whenever their derivative is the same together with
   953   Myhill-Nerode related whenever their derivative is the same together with
   954   the fact that there are only finitely many derivatives for a regular
   954   the fact that there are only finitely many derivatives for a regular
   955   expression would give us the same argument as ours. However it seems not so easy to
   955   expression would give us the same argument as ours. However it seems not so easy to
   956   calculate the derivatives and then to count them. Therefore we preferred our
   956   calculate the derivatives and then to count them. Therefore we preferred our