# HG changeset patch # User urbanc # Date 1296711527 0 # Node ID fb08f41ca33d4bfd7ca5a17364cea9393fbc72d0 # Parent fc35eb54fdc9aae00c521ccd1a86768884e3d2c0 a bit more tuning on the introduction diff -r fc35eb54fdc9 -r fb08f41ca33d Myhill_1.thy --- a/Myhill_1.thy Wed Feb 02 15:43:22 2011 +0000 +++ b/Myhill_1.thy Thu Feb 03 05:38:47 2011 +0000 @@ -32,7 +32,8 @@ Sequential composition of two languages @{text "L1"} and @{text "L2"} *} -definition Seq :: "lang \ lang \ lang" (infixr ";;" 100) +definition + Seq :: "lang \ lang \ lang" (infixr ";;" 100) where "A ;; B = {s\<^isub>1 @ s\<^isub>2 | s\<^isub>1 s\<^isub>2. s\<^isub>1 \ A \ s\<^isub>2 \ B}" diff -r fc35eb54fdc9 -r fb08f41ca33d Paper/Paper.thy --- a/Paper/Paper.thy Wed Feb 02 15:43:22 2011 +0000 +++ b/Paper/Paper.thy Thu Feb 03 05:38:47 2011 +0000 @@ -24,29 +24,42 @@ text {* Regular languages are an important and well-understood subject in Computer - Science with many beautiful theorems and many useful algorithms. There is a - wide range of textbooks about this subject. Many of which, such as - \cite{Kozen97}, are aimed at students and contain very detailed - ``pencil-and-paper'' proofs. It seems natural to exercise theorem provers by - formalising these theorems and by verifying formally the algorithms. + Science, with many beautiful theorems and many useful algorithms. There is a + wide range of textbooks on this subject, many of which are aimed at + students containing very detailed ``pencil-and-paper'' proofs + (e.g.~\cite{Kozen97}). It seems natural to exercise theorem provers by + formalising these theorems and by verifying formally the algorithms. - There is however a problem: the typical approach to the subject of regular - languages is to introduce finite automata and define everything in terms of - automata. For example, a regular language is nearly always defined as one - where there is a finite deterministic automata that recognises all the - strings of the language. One benefit of this approach is that it is easy - to prove on paper that regular languages are closed under complementation: - one just has to exchange the accepting and non-accepting states in the corresponding - automata into non-accepting. The problem with automata is that they need - to be represented as graphs (nodes represent states and edges, transitions) - or as matrices (recording the transitions between states). Both representations - cannot be defined in terms of inductive datatypes, and a reasoning infrastructure - must be established manually. To our knowledge neither - Isabelle nor HOL4 nor Coq have a readily usable reasoning library for graphs - and matrices. Moreover, the reasoning about graphs can be quite cumbersome. - Cosider for example the frequently used operation of combinding two automata - into a new compound automaton. While on paper, this can be done by just forming - the disjoint union of the nodes, this does not work in typed systems \dots + There is however a problem with this: the typical approach to regular + languages is to introduce automata and then define everything in terms of + them. For example, a regular language is normally defined as one where + there is a finite deterministic automaton that recognises all the strings of + the language. One benefit of this approach is that it is easy to convince + oneself from the fact that regular languages are closed under + complementation: one just has to exchange the accepting and non-accepting + states in the automaton to obtain an automaton for the complement language. + + The problem lies with formalising such reasoning in a theorem + prover, in our case Isabelle/HOL. Automata need to be represented as graphs + or matrices, neither of which can be defined as inductive datatype.\footnote{In + some works functions are used to represent transitions, but they are also not + inductive datatypes.} This means we have to build our own reasoning infrastructure + for them, as neither Isabelle nor HOL4 support them with libraries. + Even worse, reasoning about graphs in typed languages can be a real hassle + making proofs quite clunky. Consider for example the operation of combining + two automata into a new automaton by connecting their + initial states to a new initial state (similarly with the accepting states):a + + \begin{center} + picture + \end{center} + + \noindent + How should we implement this operation? On paper we can just + form the disjoint union of the state nodes and add two more nodes---one for the + new initial state, the other for the new accepting state. + + @@ -55,7 +68,7 @@ Therefore instead of defining a regular language as being one where there exists an - automata that regognises all of its strings, we define + automata that recognises all of its strings, we define \begin{definition}[A Regular Language] A language @{text A} is regular, if there is a regular expression that matches all @@ -63,7 +76,7 @@ \end{definition} \noindent - {\bf Contributions:} A proof of the Myhil-Nerode Theorem based on regular expressions. The + {\bf Contributions:} A proof of the Myhill-Nerode Theorem based on regular expressions. The finiteness part of this theorem is proved using tagging-functions (which to our knowledge are novel in this context). diff -r fc35eb54fdc9 -r fb08f41ca33d Paper/document/root.tex --- a/Paper/document/root.tex Wed Feb 02 15:43:22 2011 +0000 +++ b/Paper/document/root.tex Thu Feb 03 05:38:47 2011 +0000 @@ -35,9 +35,9 @@ introduce the subject by describing finite automata and only mentioning on the side a connection with regular expressions. Unfortunately, automata are a hassle for formalisations in HOL-based -theorem provers. The reason is they need to be represented as graphs -or matrices, neither of which can be easily defined as datatype. Also -operations, such as disjoint union of graphs, are not easily formalisiable +theorem provers. The reason is that they need to be represented as graphs +or matrices, neither of which can be defined as inductive datatype. Also +operations, such as disjoint unions of graphs, are not easily formalisiable in HOL. In contrast, regular expressions can be defined easily as datatype and a corresponding reasoning infrastructure comes for free. We show in this paper that a central result from formal diff -r fc35eb54fdc9 -r fb08f41ca33d tphols-2011/myhill.pdf Binary file tphols-2011/myhill.pdf has changed