more to the intro
authorurbanc
Thu, 03 Feb 2011 09:54:19 +0000
changeset 61 070f543e2560
parent 60 fb08f41ca33d
child 62 d94209ad2880
more to the intro
Paper/Paper.thy
Paper/document/root.tex
tphols-2011/myhill.pdf
--- a/Paper/Paper.thy	Thu Feb 03 05:38:47 2011 +0000
+++ b/Paper/Paper.thy	Thu Feb 03 09:54:19 2011 +0000
@@ -26,29 +26,29 @@
   Regular languages are an important and well-understood subject in Computer
   Science, with many beautiful theorems and many useful algorithms. There is a
   wide range of textbooks on this subject, many of which are aimed at
-  students containing very detailed ``pencil-and-paper'' proofs
+  students and contain very detailed ``pencil-and-paper'' proofs
   (e.g.~\cite{Kozen97}). It seems natural to exercise theorem provers by
   formalising these theorems and by verifying formally the algorithms.
 
   There is however a problem with this: the typical approach to regular
-  languages is to introduce automata and then define everything in terms of
+  languages is to introduce finite automata and then define everything in terms of
   them.  For example, a regular language is normally defined as one where
   there is a finite deterministic automaton that recognises all the strings of
-  the language. One benefit of this approach is that it is easy to convince
+  the language. This approach has many benefits. One is that it is easy to convince
   oneself from the fact that regular languages are closed under
   complementation: one just has to exchange the accepting and non-accepting
-  states in the automaton to obtain an automaton for the complement language.
-  
+  states in the corresponding automaton to obtain an automaton for the complement language.
   The problem lies with formalising such reasoning in a theorem
   prover, in our case Isabelle/HOL. Automata need to be represented as graphs 
   or matrices, neither of which can be defined as inductive datatype.\footnote{In 
   some works functions are used to represent transitions, but they are also not 
   inductive datatypes.} This means we have to build our own reasoning infrastructure
-  for them, as neither Isabelle nor HOL4 support them with libraries.
-  Even worse, reasoning about graphs in typed languages can be a real hassle
-  making proofs quite clunky. Consider for example the operation of combining 
+  for them, as neither Isabelle nor HOL4 nor HOLlight support them with libraries.
+  
+  Even worse, reasoning about graphs in typed languages can be a real hassle. 
+  Consider for example the operation of combining 
   two automata into a new automaton by connecting their
-  initial states to a new initial state (similarly with the accepting states):a
+  initial states to a new initial state (similarly with the accepting states):
   
   \begin{center}
   picture
@@ -57,18 +57,31 @@
   \noindent
   How should we implement this operation? On paper we can just
   form the disjoint union of the state nodes and add two more nodes---one for the
-  new initial state, the other for the new accepting state.
+  new initial state, the other for the new accepting state. In a theorem
+  prover based on set-theory, this operaton can be more or less
+  straightforwardly implemented. But in a HOL-based theorem prover the
+  standard definition of disjoint unions as pairs
 
+  \begin{center}
+  definition
+  \end{center}
 
-  
-
+  \noindent
+  changes the type (from sets of nodes to sets of pairs). This means we
+  cannot formulate in this represeantation any property about \emph{all} 
+  automata---since there is no type quantification available in HOL-based
+  theorem provers. A working alternative is to give every state node an 
+  identity, for example a natural number, and then be careful to rename
+  these indentities appropriately when connecting two automata together. 
+  This results in very clunky side-proofs establishing that properties
+  are invariant under renaming. We are only aware of the formalisation 
+  of automata theory in Nuprl that carries this approach trough and is
+  quite substantial. 
   
-  
-
-
-
-  Therefore instead of defining a regular language as being one where there exists an
-  automata that recognises all of its strings, we define 
+  We will take a completely different approach to formalising theorems
+  about regular languages. Instead of defining a regular language as one 
+  where there exists an automaton that recognises all of its strings, we 
+  define a regular language as
 
   \begin{definition}[A Regular Language]
   A language @{text A} is regular, if there is a regular expression that matches all
@@ -76,6 +89,16 @@
   \end{definition}
   
   \noindent
+  The reason is that regular expressinons, unlike graphs and metrices, can
+  be eaily defined as inductive datatype and this means a reasoning infrastructre
+  comes for them in Isabelle for free. The purpose of this paper is to
+  show that a central and highly non-trivisl result about regular languages, 
+  namely the Myhill-Nerode theorem,  can be recreated only using regular 
+  expressions. In our approach we do not need to formalise graps or
+  metrices.
+  
+
+  \noindent
   {\bf Contributions:} A proof of the Myhill-Nerode Theorem based on regular expressions. The 
   finiteness part of this theorem is proved using tagging-functions (which to our knowledge
   are novel in this context).
--- a/Paper/document/root.tex	Thu Feb 03 05:38:47 2011 +0000
+++ b/Paper/document/root.tex	Thu Feb 03 09:54:19 2011 +0000
@@ -38,7 +38,7 @@
 theorem provers. The reason is that they need to be represented as graphs 
 or matrices, neither of which can be defined as inductive datatype. Also 
 operations, such as disjoint unions of graphs, are not easily formalisiable 
-in HOL. In contrast, regular expressions can be defined easily 
+in HOL. In contrast, regular expressions can be defined conveniently 
 as datatype and a corresponding reasoning infrastructure comes for 
 free. We show in this paper that a central result from formal 
 language theory---the Myhill-Nerode theorem---can be recreated 
Binary file tphols-2011/myhill.pdf has changed