diff -r fb08f41ca33d -r 070f543e2560 Paper/Paper.thy
--- a/Paper/Paper.thy	Thu Feb 03 05:38:47 2011 +0000
+++ b/Paper/Paper.thy	Thu Feb 03 09:54:19 2011 +0000
@@ -26,29 +26,29 @@
   Regular languages are an important and well-understood subject in Computer
   Science, with many beautiful theorems and many useful algorithms. There is a
   wide range of textbooks on this subject, many of which are aimed at
-  students containing very detailed ``pencil-and-paper'' proofs
+  students and contain very detailed ``pencil-and-paper'' proofs
   (e.g.~\cite{Kozen97}). It seems natural to exercise theorem provers by
   formalising these theorems and by verifying formally the algorithms.
 
   There is however a problem with this: the typical approach to regular
-  languages is to introduce automata and then define everything in terms of
+  languages is to introduce finite automata and then define everything in terms of
   them.  For example, a regular language is normally defined as one where
   there is a finite deterministic automaton that recognises all the strings of
-  the language. One benefit of this approach is that it is easy to convince
+  the language. This approach has many benefits. One is that it is easy to convince
   oneself from the fact that regular languages are closed under
   complementation: one just has to exchange the accepting and non-accepting
-  states in the automaton to obtain an automaton for the complement language.
-  
+  states in the corresponding automaton to obtain an automaton for the complement language.
   The problem lies with formalising such reasoning in a theorem
   prover, in our case Isabelle/HOL. Automata need to be represented as graphs 
   or matrices, neither of which can be defined as inductive datatype.\footnote{In 
   some works functions are used to represent transitions, but they are also not 
   inductive datatypes.} This means we have to build our own reasoning infrastructure
-  for them, as neither Isabelle nor HOL4 support them with libraries.
-  Even worse, reasoning about graphs in typed languages can be a real hassle
-  making proofs quite clunky. Consider for example the operation of combining 
+  for them, as neither Isabelle nor HOL4 nor HOLlight support them with libraries.
+  
+  Even worse, reasoning about graphs in typed languages can be a real hassle. 
+  Consider for example the operation of combining 
   two automata into a new automaton by connecting their
-  initial states to a new initial state (similarly with the accepting states):a
+  initial states to a new initial state (similarly with the accepting states):
   
   \begin{center}
   picture
@@ -57,18 +57,31 @@
   \noindent
   How should we implement this operation? On paper we can just
   form the disjoint union of the state nodes and add two more nodes---one for the
-  new initial state, the other for the new accepting state.
+  new initial state, the other for the new accepting state. In a theorem
+  prover based on set-theory, this operaton can be more or less
+  straightforwardly implemented. But in a HOL-based theorem prover the
+  standard definition of disjoint unions as pairs
 
+  \begin{center}
+  definition
+  \end{center}
 
-  
-
+  \noindent
+  changes the type (from sets of nodes to sets of pairs). This means we
+  cannot formulate in this represeantation any property about \emph{all} 
+  automata---since there is no type quantification available in HOL-based
+  theorem provers. A working alternative is to give every state node an 
+  identity, for example a natural number, and then be careful to rename
+  these indentities appropriately when connecting two automata together. 
+  This results in very clunky side-proofs establishing that properties
+  are invariant under renaming. We are only aware of the formalisation 
+  of automata theory in Nuprl that carries this approach trough and is
+  quite substantial. 
   
-  
-
-
-
-  Therefore instead of defining a regular language as being one where there exists an
-  automata that recognises all of its strings, we define 
+  We will take a completely different approach to formalising theorems
+  about regular languages. Instead of defining a regular language as one 
+  where there exists an automaton that recognises all of its strings, we 
+  define a regular language as
 
   \begin{definition}[A Regular Language]
   A language @{text A} is regular, if there is a regular expression that matches all
@@ -76,6 +89,16 @@
   \end{definition}
   
   \noindent
+  The reason is that regular expressinons, unlike graphs and metrices, can
+  be eaily defined as inductive datatype and this means a reasoning infrastructre
+  comes for them in Isabelle for free. The purpose of this paper is to
+  show that a central and highly non-trivisl result about regular languages, 
+  namely the Myhill-Nerode theorem,  can be recreated only using regular 
+  expressions. In our approach we do not need to formalise graps or
+  metrices.
+  
+
+  \noindent
   {\bf Contributions:} A proof of the Myhill-Nerode Theorem based on regular expressions. The 
   finiteness part of this theorem is proved using tagging-functions (which to our knowledge
   are novel in this context).