Journal/Paper.thy
changeset 175 edc642266a82
parent 174 2b414a8a7132
child 176 6969de1eb96b
--- a/Journal/Paper.thy	Tue Jul 26 18:12:07 2011 +0000
+++ b/Journal/Paper.thy	Wed Jul 27 12:32:28 2011 +0000
@@ -34,6 +34,7 @@
 abbreviation "ATOM \<equiv> Atom"
 abbreviation "PLUS \<equiv> Plus"
 abbreviation "TIMES \<equiv> Times"
+abbreviation "TIMESS \<equiv> Times_set"
 abbreviation "STAR \<equiv> Star"
 
 
@@ -130,28 +131,36 @@
   Regular languages are an important and well-understood subject in Computer
   Science, with many beautiful theorems and many useful algorithms. There is a
   wide range of textbooks on this subject, many of which are aimed at students
-  and contain very detailed `pencil-and-paper' proofs
-  (e.g.~\cite{Kozen97, HopcroftUllman69}). It seems natural to exercise theorem provers by
+  and contain very detailed `pencil-and-paper' proofs (e.g.~\cite{Kozen97,
+  HopcroftUllman69}). It seems natural to exercise theorem provers by
   formalising the theorems and by verifying formally the algorithms.  A
   popular choice for a theorem prover would be one based on Higher-Order Logic
-  (HOL), for example HOL4, HOLlight and Isabelle/HOL. For our development 
-  we will use the latter. One distinguishing feature of HOL is it's
-  type system, which is based on Church's Simple Theory of Types \cite{Church40}.  The
-  limitations of this type system are one of the underlying motivations for the 
-  work presented in this paper.
+  (HOL), for example HOL4, HOLlight or Isabelle/HOL. For the development
+  presented in this paper we will use the latter. HOL is a predicate calculus
+  that allows quantification over predicate variables. Its type system is
+  based on Church's Simple Theory of Types \cite{Church40}.  Although 
+  many mathematical concepts can be conveniently expressed in HOL, there are some
+  limitations that hurt badly, if one attempts a simple-minded formalisation
+  of regular languages in it.
+
+  The typical approach to regular languages is to introduce finite automata
+  and then define everything in terms of them \cite{Kozen97}.  For example, 
+  a regular language is normally defined as:
 
-  The typical approach to regular languages is to
-  introduce finite automata and then define everything in terms of them.  For
-  example, a regular language is normally defined as one whose strings are
-  recognised by a finite deterministic automaton. This approach has many
-  benefits. Among them is the fact that it is easy to convince oneself that
-  regular languages are closed under complementation: one just has to exchange
-  the accepting and non-accepting states in the corresponding automaton to
-  obtain an automaton for the complement language.  The problem, however, lies
-  with formalising such reasoning in a HOL-based theorem prover. Automata are
-  built up from states and transitions that need to be represented as graphs,
-  matrices or functions, none of which can be defined as an inductive
-  datatype.
+  \begin{dfntn}\label{baddef}
+  A language @{text A} is \emph{regular}, provided there is a 
+  finite deterministic automaton that recognises all strings of @{text "A"}.
+  \end{dfntn}
+
+  \noindent  
+  This approach has many benefits. Among them is the fact that it is easy to
+  convince oneself that regular languages are closed under complementation:
+  one just has to exchange the accepting and non-accepting states in the
+  corresponding automaton to obtain an automaton for the complement language.
+  The problem, however, lies with formalising such reasoning in a HOL-based
+  theorem prover. Automata are built up from states and transitions that need
+  to be represented as graphs, matrices or functions, none of which can be
+  defined as an inductive datatype.
 
   In case of graphs and matrices, this means we have to build our own
   reasoning infrastructure for them, as neither Isabelle/HOL nor HOL4 nor
@@ -231,8 +240,9 @@
   pairs. Using this definition for disjoint union means we do not have a
   single type for automata. As a result we will not be able to define a regular
   language as one for which there exists an automaton that recognises all its
-  strings, since there is no type quantification available in HOL (unlike in Coq, for
-  example).
+  strings. This is because we cannot make a definition in HOL that is polymorphic in 
+  the state type and also there is no type quantification available in HOL (unlike 
+  in Coq, for example).
 
   An alternative, which provides us with a single type for automata, is to give every 
   state node an identity, for example a natural
@@ -285,22 +295,23 @@
   larger formalisations of automata theory are carried out in Nuprl
   \cite{Constable00} and in Coq \cite{Filliatre97}.
 
-  One might also consider the Myhill-Nerode theorem as well-worn stock
-  material where everything is clear. However, paper proofs of this theorem
-  often involve subtle side-conditions which are easily overlooked, but which
-  make formal reasoning rather painful. For example Kozen's proof requires
-  that the automata do not have inaccessible states \cite{Kozen97}. Another
-  subtle side-condition is completeness of automata: 
-  automata need to have total transition functions and at most one `sink'
-  state from which there is no connection to a final state (Brozowski mentions
-  this side-condition in connection with state complexity
+  Also one might consider automata theory as well-worn stock material where
+  everything is crystal clear. However, paper proofs about automata often
+  involve subtle side-conditions which are easily overlooked, but which make
+  formal reasoning rather painful. For example Kozen's proof of the
+  Myhill-Nerode theorem requires that the automata do not have inaccessible
+  states \cite{Kozen97}. Another subtle side-condition is completeness of
+  automata: automata need to have total transition functions and at most one
+  `sink' state from which there is no connection to a final state (Brozowski
+  mentions this side-condition in connection with state complexity
   \cite{Brozowski10}). Such side-conditions mean that if we define a regular
-  language as one for which there exists \emph{any} finite automaton, then we
-  need a lemma which ensures that another equivalent can be found satisfying the
-  side-condition. Unfortunately, such `little' and `obvious' lemmas make 
-  formalisations of results in automata theory hair-pulling experiences.
+  language as one for which there exists \emph{a} finite automaton that
+  recognises all its strings (Def.~\ref{baddef}), then we need a lemma which
+  ensures that another equivalent can be found satisfying the
+  side-condition. Unfortunately, such `little' and `obvious' lemmas make
+  formalisations of automata theory hair-pulling experiences.
 
-  
+
   In this paper, we will not attempt to formalise automata theory in
   Isabelle/HOL nor will we attempt to formalise automata proofs from the
   literature, but take a different approach to regular languages than is
@@ -315,33 +326,35 @@
   
   \noindent
   The reason is that regular expressions, unlike graphs, matrices and
-  functions, can be easily defined as an inductive datatype. No side-conditions
-  will be needed for regular expressions. Moreover, a reasoning infrastructure 
-  (like induction and recursion) comes for free in HOL-based theorem provers. 
-  This has recently been exploited in HOL4 with a formalisation of
-  regular expression matching based on derivatives \cite{OwensSlind08} and
-  with an equivalence checker for regular expressions in Isabelle/HOL
-  \cite{KraussNipkow11}.  The main purpose of this paper is to show that a central
-  result about regular languages---the Myhill-Nerode theorem---can be
-  recreated by only using regular expressions. This theorem gives necessary
-  and sufficient conditions for when a language is regular. As a corollary of
-  this theorem we can easily establish the usual closure properties, including
-  complementation, for regular languages.\medskip
+  functions, can be easily defined as an inductive datatype. A reasoning
+  infrastructure (like induction and recursion) comes then for free in
+  HOL. Moreover, no side-conditions will be needed for regular expressions,
+  like we usually need for automata. This convenience of regular expressions has
+  recently been exploited in HOL4 with a formalisation of regular expression
+  matching based on derivatives \cite{OwensSlind08} and with an equivalence
+  checker for regular expressions in Isabelle/HOL \cite{KraussNipkow11}.  The
+  main purpose of this paper is to show that a central result about regular
+  languages---the Myhill-Nerode theorem---can be recreated by only using
+  regular expressions. This theorem gives necessary and sufficient conditions
+  for when a language is regular. As a corollary of this theorem we can easily
+  establish the usual closure properties, including complementation, for
+  regular languages.\medskip
   
   \noindent 
-  {\bf Contributions:} There is an extensive literature on regular
-  languages.  To our best knowledge, our proof of the Myhill-Nerode theorem is
-  the first that is based on regular expressions, only. The part of this
-  theorem stating that finitely many partitions imply regularity of the
-  language is proved by an argument about solving equational sytems.  This
-  argument appears to be folklore. For the other part, we give two proofs: one
-  direct proof using certain tagging-functions, and another indirect proof
-  using Antimirov's partial derivatives \cite{Antimirov95}. Again to our best
-  knowledge, the tagging-functions have not been used before to establish the
-  Myhill-Nerode theorem. Derivatives of regular expressions have been used
-  extensively in the literature, unlike partial derivatives. However, partial
-  derivatives are more suitable in the context of the Myhill-Nerode theorem,
-  since it is easier to establish formally their finiteness result.
+  {\bf Contributions:} There is an extensive literature on regular languages.
+  To our best knowledge, our proof of the Myhill-Nerode theorem is the first
+  that is based on regular expressions, only. The part of this theorem stating
+  that finitely many partitions imply regularity of the language is proved by
+  an argument about solving equational sytems.  This argument appears to be
+  folklore. For the other part, we give two proofs: one direct proof using
+  certain tagging-functions, and another indirect proof using Antimirov's
+  partial derivatives \cite{Antimirov95}. Again to our best knowledge, the
+  tagging-functions have not been used before to establish the Myhill-Nerode
+  theorem. Derivatives of regular expressions have been used widely in the
+  literature about regular expressions. However, partial derivatives are more
+  suitable in the context of the Myhill-Nerode theorem, since it is easier to
+  establish formally their finiteness result.
+
 *}
 
 section {* Preliminaries *}
@@ -1544,22 +1557,24 @@
   derivations (an example is given in \cite[Page~141]{Sakarovitch09}).
   Reasoning modulo ACI can be done, but it is very painful in a theorem prover.
 
+  Fortunately, there is a much simpler approach using partial
+  derivatives introduced by Antimirov \cite{Antimirov95}.
 
-  in order to prove the second
-  direction of the Myhill-Nerode theorem. There he calculates the
-  derivatives for regular expressions and shows that for every
-  language there can be only finitely many of them %derivations (if
-  regarded equal modulo ACI). We could have used as tagging-function
-  the set of derivatives of a regular expression with respect to a
-  language.  Using the fact that two strings are Myhill-Nerode related
-  whenever their derivative is the same, together with the fact that
-  there are only finitely such derivatives would give us a similar
-  argument as ours. However it seems not so easy to calculate the set
-  of derivatives modulo ACI. Therefore we preferred our direct method
-  of using tagging-functions. 
-
-  The problem of finiteness modulo ACI can be avoided by using partial
-  derivatives introduced by Antimirov \cite{Antimirov}.
+  \begin{center}
+  \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}}
+  @{thm (lhs) pder.simps(1)}  & @{text "\<equiv>"} & @{thm (rhs) pder.simps(1)}\\
+  @{thm (lhs) pder.simps(2)}  & @{text "\<equiv>"} & @{thm (rhs) pder.simps(2)}\\
+  @{thm (lhs) pder.simps(3)[where c'="d"]}  & @{text "\<equiv>"} & @{thm (rhs) pder.simps(3)[where c'="d"]}\\
+  @{thm (lhs) pder.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}  
+     & @{text "\<equiv>"} & @{thm (rhs) pder.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\
+  @{thm (lhs) pder.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}  
+     & @{text "\<equiv>"}\\ 
+  \multicolumn{3}{@ {\hspace{20mm}}l@ {}}{@{thm (rhs) pder.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}}\\
+  @{thm (lhs) pder.simps(6)}  & @{text "\<equiv>"} & @{thm (rhs) pder.simps(6)}\smallskip\\
+  @{thm (lhs) pders.simps(1)}  & @{text "\<equiv>"} & @{thm (rhs) pders.simps(1)}\\
+  @{thm (lhs) pders.simps(2)}  & @{text "\<equiv>"} & @{thm (rhs) pders.simps(2)}\\
+  \end{tabular}
+  \end{center}
 
 *}