diff -r 2b414a8a7132 -r edc642266a82 Journal/Paper.thy --- a/Journal/Paper.thy Tue Jul 26 18:12:07 2011 +0000 +++ b/Journal/Paper.thy Wed Jul 27 12:32:28 2011 +0000 @@ -34,6 +34,7 @@ abbreviation "ATOM \ Atom" abbreviation "PLUS \ Plus" abbreviation "TIMES \ Times" +abbreviation "TIMESS \ Times_set" abbreviation "STAR \ Star" @@ -130,28 +131,36 @@ Regular languages are an important and well-understood subject in Computer Science, with many beautiful theorems and many useful algorithms. There is a wide range of textbooks on this subject, many of which are aimed at students - and contain very detailed `pencil-and-paper' proofs - (e.g.~\cite{Kozen97, HopcroftUllman69}). It seems natural to exercise theorem provers by + and contain very detailed `pencil-and-paper' proofs (e.g.~\cite{Kozen97, + HopcroftUllman69}). It seems natural to exercise theorem provers by formalising the theorems and by verifying formally the algorithms. A popular choice for a theorem prover would be one based on Higher-Order Logic - (HOL), for example HOL4, HOLlight and Isabelle/HOL. For our development - we will use the latter. One distinguishing feature of HOL is it's - type system, which is based on Church's Simple Theory of Types \cite{Church40}. The - limitations of this type system are one of the underlying motivations for the - work presented in this paper. + (HOL), for example HOL4, HOLlight or Isabelle/HOL. For the development + presented in this paper we will use the latter. HOL is a predicate calculus + that allows quantification over predicate variables. Its type system is + based on Church's Simple Theory of Types \cite{Church40}. Although + many mathematical concepts can be conveniently expressed in HOL, there are some + limitations that hurt badly, if one attempts a simple-minded formalisation + of regular languages in it. + + The typical approach to regular languages is to introduce finite automata + and then define everything in terms of them \cite{Kozen97}. For example, + a regular language is normally defined as: - The typical approach to regular languages is to - introduce finite automata and then define everything in terms of them. For - example, a regular language is normally defined as one whose strings are - recognised by a finite deterministic automaton. This approach has many - benefits. Among them is the fact that it is easy to convince oneself that - regular languages are closed under complementation: one just has to exchange - the accepting and non-accepting states in the corresponding automaton to - obtain an automaton for the complement language. The problem, however, lies - with formalising such reasoning in a HOL-based theorem prover. Automata are - built up from states and transitions that need to be represented as graphs, - matrices or functions, none of which can be defined as an inductive - datatype. + \begin{dfntn}\label{baddef} + A language @{text A} is \emph{regular}, provided there is a + finite deterministic automaton that recognises all strings of @{text "A"}. + \end{dfntn} + + \noindent + This approach has many benefits. Among them is the fact that it is easy to + convince oneself that regular languages are closed under complementation: + one just has to exchange the accepting and non-accepting states in the + corresponding automaton to obtain an automaton for the complement language. + The problem, however, lies with formalising such reasoning in a HOL-based + theorem prover. Automata are built up from states and transitions that need + to be represented as graphs, matrices or functions, none of which can be + defined as an inductive datatype. In case of graphs and matrices, this means we have to build our own reasoning infrastructure for them, as neither Isabelle/HOL nor HOL4 nor @@ -231,8 +240,9 @@ pairs. Using this definition for disjoint union means we do not have a single type for automata. As a result we will not be able to define a regular language as one for which there exists an automaton that recognises all its - strings, since there is no type quantification available in HOL (unlike in Coq, for - example). + strings. This is because we cannot make a definition in HOL that is polymorphic in + the state type and also there is no type quantification available in HOL (unlike + in Coq, for example). An alternative, which provides us with a single type for automata, is to give every state node an identity, for example a natural @@ -285,22 +295,23 @@ larger formalisations of automata theory are carried out in Nuprl \cite{Constable00} and in Coq \cite{Filliatre97}. - One might also consider the Myhill-Nerode theorem as well-worn stock - material where everything is clear. However, paper proofs of this theorem - often involve subtle side-conditions which are easily overlooked, but which - make formal reasoning rather painful. For example Kozen's proof requires - that the automata do not have inaccessible states \cite{Kozen97}. Another - subtle side-condition is completeness of automata: - automata need to have total transition functions and at most one `sink' - state from which there is no connection to a final state (Brozowski mentions - this side-condition in connection with state complexity + Also one might consider automata theory as well-worn stock material where + everything is crystal clear. However, paper proofs about automata often + involve subtle side-conditions which are easily overlooked, but which make + formal reasoning rather painful. For example Kozen's proof of the + Myhill-Nerode theorem requires that the automata do not have inaccessible + states \cite{Kozen97}. Another subtle side-condition is completeness of + automata: automata need to have total transition functions and at most one + `sink' state from which there is no connection to a final state (Brozowski + mentions this side-condition in connection with state complexity \cite{Brozowski10}). Such side-conditions mean that if we define a regular - language as one for which there exists \emph{any} finite automaton, then we - need a lemma which ensures that another equivalent can be found satisfying the - side-condition. Unfortunately, such `little' and `obvious' lemmas make - formalisations of results in automata theory hair-pulling experiences. + language as one for which there exists \emph{a} finite automaton that + recognises all its strings (Def.~\ref{baddef}), then we need a lemma which + ensures that another equivalent can be found satisfying the + side-condition. Unfortunately, such `little' and `obvious' lemmas make + formalisations of automata theory hair-pulling experiences. - + In this paper, we will not attempt to formalise automata theory in Isabelle/HOL nor will we attempt to formalise automata proofs from the literature, but take a different approach to regular languages than is @@ -315,33 +326,35 @@ \noindent The reason is that regular expressions, unlike graphs, matrices and - functions, can be easily defined as an inductive datatype. No side-conditions - will be needed for regular expressions. Moreover, a reasoning infrastructure - (like induction and recursion) comes for free in HOL-based theorem provers. - This has recently been exploited in HOL4 with a formalisation of - regular expression matching based on derivatives \cite{OwensSlind08} and - with an equivalence checker for regular expressions in Isabelle/HOL - \cite{KraussNipkow11}. The main purpose of this paper is to show that a central - result about regular languages---the Myhill-Nerode theorem---can be - recreated by only using regular expressions. This theorem gives necessary - and sufficient conditions for when a language is regular. As a corollary of - this theorem we can easily establish the usual closure properties, including - complementation, for regular languages.\medskip + functions, can be easily defined as an inductive datatype. A reasoning + infrastructure (like induction and recursion) comes then for free in + HOL. Moreover, no side-conditions will be needed for regular expressions, + like we usually need for automata. This convenience of regular expressions has + recently been exploited in HOL4 with a formalisation of regular expression + matching based on derivatives \cite{OwensSlind08} and with an equivalence + checker for regular expressions in Isabelle/HOL \cite{KraussNipkow11}. The + main purpose of this paper is to show that a central result about regular + languages---the Myhill-Nerode theorem---can be recreated by only using + regular expressions. This theorem gives necessary and sufficient conditions + for when a language is regular. As a corollary of this theorem we can easily + establish the usual closure properties, including complementation, for + regular languages.\medskip \noindent - {\bf Contributions:} There is an extensive literature on regular - languages. To our best knowledge, our proof of the Myhill-Nerode theorem is - the first that is based on regular expressions, only. The part of this - theorem stating that finitely many partitions imply regularity of the - language is proved by an argument about solving equational sytems. This - argument appears to be folklore. For the other part, we give two proofs: one - direct proof using certain tagging-functions, and another indirect proof - using Antimirov's partial derivatives \cite{Antimirov95}. Again to our best - knowledge, the tagging-functions have not been used before to establish the - Myhill-Nerode theorem. Derivatives of regular expressions have been used - extensively in the literature, unlike partial derivatives. However, partial - derivatives are more suitable in the context of the Myhill-Nerode theorem, - since it is easier to establish formally their finiteness result. + {\bf Contributions:} There is an extensive literature on regular languages. + To our best knowledge, our proof of the Myhill-Nerode theorem is the first + that is based on regular expressions, only. The part of this theorem stating + that finitely many partitions imply regularity of the language is proved by + an argument about solving equational sytems. This argument appears to be + folklore. For the other part, we give two proofs: one direct proof using + certain tagging-functions, and another indirect proof using Antimirov's + partial derivatives \cite{Antimirov95}. Again to our best knowledge, the + tagging-functions have not been used before to establish the Myhill-Nerode + theorem. Derivatives of regular expressions have been used widely in the + literature about regular expressions. However, partial derivatives are more + suitable in the context of the Myhill-Nerode theorem, since it is easier to + establish formally their finiteness result. + *} section {* Preliminaries *} @@ -1544,22 +1557,24 @@ derivations (an example is given in \cite[Page~141]{Sakarovitch09}). Reasoning modulo ACI can be done, but it is very painful in a theorem prover. + Fortunately, there is a much simpler approach using partial + derivatives introduced by Antimirov \cite{Antimirov95}. - in order to prove the second - direction of the Myhill-Nerode theorem. There he calculates the - derivatives for regular expressions and shows that for every - language there can be only finitely many of them %derivations (if - regarded equal modulo ACI). We could have used as tagging-function - the set of derivatives of a regular expression with respect to a - language. Using the fact that two strings are Myhill-Nerode related - whenever their derivative is the same, together with the fact that - there are only finitely such derivatives would give us a similar - argument as ours. However it seems not so easy to calculate the set - of derivatives modulo ACI. Therefore we preferred our direct method - of using tagging-functions. - - The problem of finiteness modulo ACI can be avoided by using partial - derivatives introduced by Antimirov \cite{Antimirov}. + \begin{center} + \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}} + @{thm (lhs) pder.simps(1)} & @{text "\"} & @{thm (rhs) pder.simps(1)}\\ + @{thm (lhs) pder.simps(2)} & @{text "\"} & @{thm (rhs) pder.simps(2)}\\ + @{thm (lhs) pder.simps(3)[where c'="d"]} & @{text "\"} & @{thm (rhs) pder.simps(3)[where c'="d"]}\\ + @{thm (lhs) pder.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]} + & @{text "\"} & @{thm (rhs) pder.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\ + @{thm (lhs) pder.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]} + & @{text "\"}\\ + \multicolumn{3}{@ {\hspace{20mm}}l@ {}}{@{thm (rhs) pder.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}}\\ + @{thm (lhs) pder.simps(6)} & @{text "\"} & @{thm (rhs) pder.simps(6)}\smallskip\\ + @{thm (lhs) pders.simps(1)} & @{text "\"} & @{thm (rhs) pders.simps(1)}\\ + @{thm (lhs) pders.simps(2)} & @{text "\"} & @{thm (rhs) pders.simps(2)}\\ + \end{tabular} + \end{center} *}