--- a/Journal/Paper.thy Tue Jul 26 18:12:07 2011 +0000
+++ b/Journal/Paper.thy Wed Jul 27 12:32:28 2011 +0000
@@ -34,6 +34,7 @@
abbreviation "ATOM \<equiv> Atom"
abbreviation "PLUS \<equiv> Plus"
abbreviation "TIMES \<equiv> Times"
+abbreviation "TIMESS \<equiv> Times_set"
abbreviation "STAR \<equiv> Star"
@@ -130,28 +131,36 @@
Regular languages are an important and well-understood subject in Computer
Science, with many beautiful theorems and many useful algorithms. There is a
wide range of textbooks on this subject, many of which are aimed at students
- and contain very detailed `pencil-and-paper' proofs
- (e.g.~\cite{Kozen97, HopcroftUllman69}). It seems natural to exercise theorem provers by
+ and contain very detailed `pencil-and-paper' proofs (e.g.~\cite{Kozen97,
+ HopcroftUllman69}). It seems natural to exercise theorem provers by
formalising the theorems and by verifying formally the algorithms. A
popular choice for a theorem prover would be one based on Higher-Order Logic
- (HOL), for example HOL4, HOLlight and Isabelle/HOL. For our development
- we will use the latter. One distinguishing feature of HOL is it's
- type system, which is based on Church's Simple Theory of Types \cite{Church40}. The
- limitations of this type system are one of the underlying motivations for the
- work presented in this paper.
+ (HOL), for example HOL4, HOLlight or Isabelle/HOL. For the development
+ presented in this paper we will use the latter. HOL is a predicate calculus
+ that allows quantification over predicate variables. Its type system is
+ based on Church's Simple Theory of Types \cite{Church40}. Although
+ many mathematical concepts can be conveniently expressed in HOL, there are some
+ limitations that hurt badly, if one attempts a simple-minded formalisation
+ of regular languages in it.
+
+ The typical approach to regular languages is to introduce finite automata
+ and then define everything in terms of them \cite{Kozen97}. For example,
+ a regular language is normally defined as:
- The typical approach to regular languages is to
- introduce finite automata and then define everything in terms of them. For
- example, a regular language is normally defined as one whose strings are
- recognised by a finite deterministic automaton. This approach has many
- benefits. Among them is the fact that it is easy to convince oneself that
- regular languages are closed under complementation: one just has to exchange
- the accepting and non-accepting states in the corresponding automaton to
- obtain an automaton for the complement language. The problem, however, lies
- with formalising such reasoning in a HOL-based theorem prover. Automata are
- built up from states and transitions that need to be represented as graphs,
- matrices or functions, none of which can be defined as an inductive
- datatype.
+ \begin{dfntn}\label{baddef}
+ A language @{text A} is \emph{regular}, provided there is a
+ finite deterministic automaton that recognises all strings of @{text "A"}.
+ \end{dfntn}
+
+ \noindent
+ This approach has many benefits. Among them is the fact that it is easy to
+ convince oneself that regular languages are closed under complementation:
+ one just has to exchange the accepting and non-accepting states in the
+ corresponding automaton to obtain an automaton for the complement language.
+ The problem, however, lies with formalising such reasoning in a HOL-based
+ theorem prover. Automata are built up from states and transitions that need
+ to be represented as graphs, matrices or functions, none of which can be
+ defined as an inductive datatype.
In case of graphs and matrices, this means we have to build our own
reasoning infrastructure for them, as neither Isabelle/HOL nor HOL4 nor
@@ -231,8 +240,9 @@
pairs. Using this definition for disjoint union means we do not have a
single type for automata. As a result we will not be able to define a regular
language as one for which there exists an automaton that recognises all its
- strings, since there is no type quantification available in HOL (unlike in Coq, for
- example).
+ strings. This is because we cannot make a definition in HOL that is polymorphic in
+ the state type and also there is no type quantification available in HOL (unlike
+ in Coq, for example).
An alternative, which provides us with a single type for automata, is to give every
state node an identity, for example a natural
@@ -285,22 +295,23 @@
larger formalisations of automata theory are carried out in Nuprl
\cite{Constable00} and in Coq \cite{Filliatre97}.
- One might also consider the Myhill-Nerode theorem as well-worn stock
- material where everything is clear. However, paper proofs of this theorem
- often involve subtle side-conditions which are easily overlooked, but which
- make formal reasoning rather painful. For example Kozen's proof requires
- that the automata do not have inaccessible states \cite{Kozen97}. Another
- subtle side-condition is completeness of automata:
- automata need to have total transition functions and at most one `sink'
- state from which there is no connection to a final state (Brozowski mentions
- this side-condition in connection with state complexity
+ Also one might consider automata theory as well-worn stock material where
+ everything is crystal clear. However, paper proofs about automata often
+ involve subtle side-conditions which are easily overlooked, but which make
+ formal reasoning rather painful. For example Kozen's proof of the
+ Myhill-Nerode theorem requires that the automata do not have inaccessible
+ states \cite{Kozen97}. Another subtle side-condition is completeness of
+ automata: automata need to have total transition functions and at most one
+ `sink' state from which there is no connection to a final state (Brozowski
+ mentions this side-condition in connection with state complexity
\cite{Brozowski10}). Such side-conditions mean that if we define a regular
- language as one for which there exists \emph{any} finite automaton, then we
- need a lemma which ensures that another equivalent can be found satisfying the
- side-condition. Unfortunately, such `little' and `obvious' lemmas make
- formalisations of results in automata theory hair-pulling experiences.
+ language as one for which there exists \emph{a} finite automaton that
+ recognises all its strings (Def.~\ref{baddef}), then we need a lemma which
+ ensures that another equivalent can be found satisfying the
+ side-condition. Unfortunately, such `little' and `obvious' lemmas make
+ formalisations of automata theory hair-pulling experiences.
-
+
In this paper, we will not attempt to formalise automata theory in
Isabelle/HOL nor will we attempt to formalise automata proofs from the
literature, but take a different approach to regular languages than is
@@ -315,33 +326,35 @@
\noindent
The reason is that regular expressions, unlike graphs, matrices and
- functions, can be easily defined as an inductive datatype. No side-conditions
- will be needed for regular expressions. Moreover, a reasoning infrastructure
- (like induction and recursion) comes for free in HOL-based theorem provers.
- This has recently been exploited in HOL4 with a formalisation of
- regular expression matching based on derivatives \cite{OwensSlind08} and
- with an equivalence checker for regular expressions in Isabelle/HOL
- \cite{KraussNipkow11}. The main purpose of this paper is to show that a central
- result about regular languages---the Myhill-Nerode theorem---can be
- recreated by only using regular expressions. This theorem gives necessary
- and sufficient conditions for when a language is regular. As a corollary of
- this theorem we can easily establish the usual closure properties, including
- complementation, for regular languages.\medskip
+ functions, can be easily defined as an inductive datatype. A reasoning
+ infrastructure (like induction and recursion) comes then for free in
+ HOL. Moreover, no side-conditions will be needed for regular expressions,
+ like we usually need for automata. This convenience of regular expressions has
+ recently been exploited in HOL4 with a formalisation of regular expression
+ matching based on derivatives \cite{OwensSlind08} and with an equivalence
+ checker for regular expressions in Isabelle/HOL \cite{KraussNipkow11}. The
+ main purpose of this paper is to show that a central result about regular
+ languages---the Myhill-Nerode theorem---can be recreated by only using
+ regular expressions. This theorem gives necessary and sufficient conditions
+ for when a language is regular. As a corollary of this theorem we can easily
+ establish the usual closure properties, including complementation, for
+ regular languages.\medskip
\noindent
- {\bf Contributions:} There is an extensive literature on regular
- languages. To our best knowledge, our proof of the Myhill-Nerode theorem is
- the first that is based on regular expressions, only. The part of this
- theorem stating that finitely many partitions imply regularity of the
- language is proved by an argument about solving equational sytems. This
- argument appears to be folklore. For the other part, we give two proofs: one
- direct proof using certain tagging-functions, and another indirect proof
- using Antimirov's partial derivatives \cite{Antimirov95}. Again to our best
- knowledge, the tagging-functions have not been used before to establish the
- Myhill-Nerode theorem. Derivatives of regular expressions have been used
- extensively in the literature, unlike partial derivatives. However, partial
- derivatives are more suitable in the context of the Myhill-Nerode theorem,
- since it is easier to establish formally their finiteness result.
+ {\bf Contributions:} There is an extensive literature on regular languages.
+ To our best knowledge, our proof of the Myhill-Nerode theorem is the first
+ that is based on regular expressions, only. The part of this theorem stating
+ that finitely many partitions imply regularity of the language is proved by
+ an argument about solving equational sytems. This argument appears to be
+ folklore. For the other part, we give two proofs: one direct proof using
+ certain tagging-functions, and another indirect proof using Antimirov's
+ partial derivatives \cite{Antimirov95}. Again to our best knowledge, the
+ tagging-functions have not been used before to establish the Myhill-Nerode
+ theorem. Derivatives of regular expressions have been used widely in the
+ literature about regular expressions. However, partial derivatives are more
+ suitable in the context of the Myhill-Nerode theorem, since it is easier to
+ establish formally their finiteness result.
+
*}
section {* Preliminaries *}
@@ -1544,22 +1557,24 @@
derivations (an example is given in \cite[Page~141]{Sakarovitch09}).
Reasoning modulo ACI can be done, but it is very painful in a theorem prover.
+ Fortunately, there is a much simpler approach using partial
+ derivatives introduced by Antimirov \cite{Antimirov95}.
- in order to prove the second
- direction of the Myhill-Nerode theorem. There he calculates the
- derivatives for regular expressions and shows that for every
- language there can be only finitely many of them %derivations (if
- regarded equal modulo ACI). We could have used as tagging-function
- the set of derivatives of a regular expression with respect to a
- language. Using the fact that two strings are Myhill-Nerode related
- whenever their derivative is the same, together with the fact that
- there are only finitely such derivatives would give us a similar
- argument as ours. However it seems not so easy to calculate the set
- of derivatives modulo ACI. Therefore we preferred our direct method
- of using tagging-functions.
-
- The problem of finiteness modulo ACI can be avoided by using partial
- derivatives introduced by Antimirov \cite{Antimirov}.
+ \begin{center}
+ \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}}
+ @{thm (lhs) pder.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) pder.simps(1)}\\
+ @{thm (lhs) pder.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) pder.simps(2)}\\
+ @{thm (lhs) pder.simps(3)[where c'="d"]} & @{text "\<equiv>"} & @{thm (rhs) pder.simps(3)[where c'="d"]}\\
+ @{thm (lhs) pder.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}
+ & @{text "\<equiv>"} & @{thm (rhs) pder.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\
+ @{thm (lhs) pder.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}
+ & @{text "\<equiv>"}\\
+ \multicolumn{3}{@ {\hspace{20mm}}l@ {}}{@{thm (rhs) pder.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}}\\
+ @{thm (lhs) pder.simps(6)} & @{text "\<equiv>"} & @{thm (rhs) pder.simps(6)}\smallskip\\
+ @{thm (lhs) pders.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) pders.simps(1)}\\
+ @{thm (lhs) pders.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) pders.simps(2)}\\
+ \end{tabular}
+ \end{center}
*}