diff -r fc35eb54fdc9 -r fb08f41ca33d Paper/Paper.thy --- a/Paper/Paper.thy Wed Feb 02 15:43:22 2011 +0000 +++ b/Paper/Paper.thy Thu Feb 03 05:38:47 2011 +0000 @@ -24,29 +24,42 @@ text {* Regular languages are an important and well-understood subject in Computer - Science with many beautiful theorems and many useful algorithms. There is a - wide range of textbooks about this subject. Many of which, such as - \cite{Kozen97}, are aimed at students and contain very detailed - ``pencil-and-paper'' proofs. It seems natural to exercise theorem provers by - formalising these theorems and by verifying formally the algorithms. + Science, with many beautiful theorems and many useful algorithms. There is a + wide range of textbooks on this subject, many of which are aimed at + students containing very detailed ``pencil-and-paper'' proofs + (e.g.~\cite{Kozen97}). It seems natural to exercise theorem provers by + formalising these theorems and by verifying formally the algorithms. - There is however a problem: the typical approach to the subject of regular - languages is to introduce finite automata and define everything in terms of - automata. For example, a regular language is nearly always defined as one - where there is a finite deterministic automata that recognises all the - strings of the language. One benefit of this approach is that it is easy - to prove on paper that regular languages are closed under complementation: - one just has to exchange the accepting and non-accepting states in the corresponding - automata into non-accepting. The problem with automata is that they need - to be represented as graphs (nodes represent states and edges, transitions) - or as matrices (recording the transitions between states). Both representations - cannot be defined in terms of inductive datatypes, and a reasoning infrastructure - must be established manually. To our knowledge neither - Isabelle nor HOL4 nor Coq have a readily usable reasoning library for graphs - and matrices. Moreover, the reasoning about graphs can be quite cumbersome. - Cosider for example the frequently used operation of combinding two automata - into a new compound automaton. While on paper, this can be done by just forming - the disjoint union of the nodes, this does not work in typed systems \dots + There is however a problem with this: the typical approach to regular + languages is to introduce automata and then define everything in terms of + them. For example, a regular language is normally defined as one where + there is a finite deterministic automaton that recognises all the strings of + the language. One benefit of this approach is that it is easy to convince + oneself from the fact that regular languages are closed under + complementation: one just has to exchange the accepting and non-accepting + states in the automaton to obtain an automaton for the complement language. + + The problem lies with formalising such reasoning in a theorem + prover, in our case Isabelle/HOL. Automata need to be represented as graphs + or matrices, neither of which can be defined as inductive datatype.\footnote{In + some works functions are used to represent transitions, but they are also not + inductive datatypes.} This means we have to build our own reasoning infrastructure + for them, as neither Isabelle nor HOL4 support them with libraries. + Even worse, reasoning about graphs in typed languages can be a real hassle + making proofs quite clunky. Consider for example the operation of combining + two automata into a new automaton by connecting their + initial states to a new initial state (similarly with the accepting states):a + + \begin{center} + picture + \end{center} + + \noindent + How should we implement this operation? On paper we can just + form the disjoint union of the state nodes and add two more nodes---one for the + new initial state, the other for the new accepting state. + + @@ -55,7 +68,7 @@ Therefore instead of defining a regular language as being one where there exists an - automata that regognises all of its strings, we define + automata that recognises all of its strings, we define \begin{definition}[A Regular Language] A language @{text A} is regular, if there is a regular expression that matches all @@ -63,7 +76,7 @@ \end{definition} \noindent - {\bf Contributions:} A proof of the Myhil-Nerode Theorem based on regular expressions. The + {\bf Contributions:} A proof of the Myhill-Nerode Theorem based on regular expressions. The finiteness part of this theorem is proved using tagging-functions (which to our knowledge are novel in this context).