--- a/Journal/Paper.thy Wed Sep 07 18:17:56 2011 +0000
+++ b/Journal/Paper.thy Thu Sep 08 15:08:02 2011 +0000
@@ -320,7 +320,7 @@
changes the type---the disjoint union is not a set, but a set of
pairs. Using this definition for disjoint union means we do not have a
single type for the states of automata. As a result we will not be able to
- define in our fomalisation a regular language as one for which there exists
+ define a regular language as one for which there exists
an automaton that recognises all its strings (Definition~\ref{baddef}). This
is because we cannot make a definition in HOL that is only polymorphic in
the state type, but not in the predicate for regularity; and there is no
@@ -386,7 +386,7 @@
establishing properties about regular languages. However, paper proofs
about automata often involve subtle side-conditions which are easily
overlooked, but which make formal reasoning rather painful. For example
- Kozen's proof of the Myhill-Nerode theorem requires that automata do not
+ Kozen's proof of the Myhill-Nerode Theorem requires that automata do not
have inaccessible states \cite{Kozen97}. Another subtle side-condition is
completeness of automata, that is automata need to have total transition
functions and at most one `sink' state from which there is no connection to
@@ -423,17 +423,17 @@
matching based on derivatives \cite{OwensSlind08} and with an equivalence
checker for regular expressions in Isabelle/HOL \cite{KraussNipkow11}. The
main purpose of this paper is to show that a central result about regular
- languages---the Myhill-Nerode theorem---can be recreated by only using
+ languages---the Myhill-Nerode Theorem---can be recreated by only using
regular expressions. This theorem gives necessary and sufficient conditions
for when a language is regular. As a corollary of this theorem we can easily
establish the usual closure properties, including complementation, for
regular languages. We use the continuation lemma \cite{Rosenberg06},
- which is also a corollary of the Myhill-Nerode theorem, for establishing
+ which is also a corollary of the Myhill-Nerode Theorem, for establishing
the non-regularity of the language @{text "a\<^isup>nb\<^isup>n"}.\medskip
\noindent
{\bf Contributions:} There is an extensive literature on regular languages.
- To our best knowledge, our proof of the Myhill-Nerode theorem is the first
+ To our best knowledge, our proof of the Myhill-Nerode Theorem is the first
that is based on regular expressions, only. The part of this theorem stating
that finitely many partitions imply regularity of the language is proved by
an argument about solving equational systems. This argument appears to be
@@ -441,12 +441,12 @@
certain tagging-functions, and another indirect proof using Antimirov's
partial derivatives \cite{Antimirov95}. Again to our best knowledge, the
tagging-functions have not been used before for establishing the Myhill-Nerode
- theorem. Derivatives of regular expressions have been used recently quite
+ Theorem. Derivatives of regular expressions have been used recently quite
widely in the literature; partial derivatives, in contrast, attract much
less attention. However, partial derivatives are more suitable in the
- context of the Myhill-Nerode theorem, since it is easier to establish
+ context of the Myhill-Nerode Theorem, since it is easier to establish
formally their finiteness result. We are not aware of any proof that uses
- either of them for proving the Myhill-Nerode theorem.
+ either of them for proving the Myhill-Nerode Theorem.
*}
section {* Preliminaries *}
@@ -614,8 +614,8 @@
text {*
\noindent
- The key definition in the Myhill-Nerode theorem is the
- \emph{Myhill-Nerode relation}, which states that w.r.t.~a language two
+ The key definition in the Myhill-Nerode Theorem is the
+ \emph{Myhill-Nerode Relation}, which states that w.r.t.~a language two
strings are related, provided there is no distinguishing extension in this
language. This can be defined as a tertiary relation.
@@ -644,7 +644,7 @@
\end{tabular}
\end{center}
- One direction of the Myhill-Nerode theorem establishes
+ One direction of the Myhill-Nerode Theorem establishes
that if there are finitely many equivalence classes, like in the example above, then
the language is regular. In our setting we therefore have to show:
@@ -732,7 +732,7 @@
single `initial' state in the equational system, which is different from
the method by Brzozowski \cite{Brzozowski64}, where he marks the
`terminal' states. We are forced to set up the equational system in our
- way, because the Myhill-Nerode relation determines the `direction' of the
+ way, because the Myhill-Nerode Relation determines the `direction' of the
transitions---the successor `state' of an equivalence class @{text Y} can
be reached by adding a character to the end of @{text Y}. This is also the
reason why we have to use our reversed version of Arden's Lemma.}
@@ -1152,7 +1152,7 @@
\noindent
Lemma~\ref{every_eqcl_has_reg} allows us to finally give a proof for the first direction
- of the Myhill-Nerode theorem.
+ of the Myhill-Nerode Theorem.
\begin{proof}[Proof of Theorem~\ref{myhillnerodeone}]
By Lemma~\ref{every_eqcl_has_reg} we know that there exists a regular expression for
@@ -1183,7 +1183,7 @@
text {*
\noindent
In this section we will give a proof for establishing the second
- part of the Myhill-Nerode theorem. It can be formulated in our setting as follows:
+ part of the Myhill-Nerode Theorem. It can be formulated in our setting as follows:
\begin{thrm}\label{myhillnerodetwo}
Given @{text "r"} is a regular expression, then @{thm Myhill_Nerode2}.
@@ -1259,7 +1259,7 @@
\end{dfntn}
\noindent
- For constructing @{text R}, will rely on some \emph{tagging-functions}
+ For constructing @{text R}, we will rely on some \emph{tagging-functions}
defined over strings. Given the inductive hypothesis, it will be easy to
prove that the \emph{range} of these tagging-functions is finite. The range
of a function @{text f} is defined as
@@ -1498,7 +1498,7 @@
"z\<^isub>s"} such that @{term "x @ z\<^isub>p \<in> A"} and @{text "z\<^isub>s
\<in> B"}. By the assumption about @{term "tag_Times A B"} we have @{term "\<approx>A
`` {x} = \<approx>A `` {y}"} and thus @{term "x \<approx>A y"}. Hence by the Myhill-Nerode
- relation @{term "y @ z\<^isub>p \<in> A"} holds. Using @{text "z\<^isub>s \<in> B"},
+ Relation @{term "y @ z\<^isub>p \<in> A"} holds. Using @{text "z\<^isub>s \<in> B"},
we can conclude in this case with @{term "y @ z \<in> A \<cdot> B"} (recall @{text
"z\<^isub>p @ z\<^isub>s = z"}).
@@ -1519,7 +1519,7 @@
\noindent
This means there must be a partition @{text "y\<^isub>p"} and @{text "y\<^isub>s"}
such that @{term "y\<^isub>p \<in> A"} and @{term "\<approx>B `` {x\<^isub>s} = \<approx>B ``
- {y\<^isub>s}"}. Unfolding the Myhill-Nerode relation and together with the
+ {y\<^isub>s}"}. Unfolding the Myhill-Nerode Relation and together with the
facts that @{text "x\<^isub>p \<in> A"} and \mbox{@{text "x\<^isub>s @ z \<in> B"}}, we
obtain @{term "y\<^isub>p \<in> A"} and @{text "y\<^isub>s @ z \<in> B"}, as needed in
this case. We again can complete the @{const TIMES}-case by setting @{text
@@ -1634,7 +1634,7 @@
\noindent
From this we know there exist a partition @{text "y\<^isub>p"} and @{text
"y\<^isub>s"} with @{term "y\<^isub>p \<in> A\<star>"} and also @{term "x\<^isub>s \<approx>A
- y\<^isub>s"}. Unfolding the Myhill-Nerode relation we know @{term
+ y\<^isub>s"}. Unfolding the Myhill-Nerode Relation we know @{term
"y\<^isub>s @ z\<^isub>a \<in> A"}. We also know that @{term "z\<^isub>b \<in> A\<star>"}.
Therefore @{term "y\<^isub>p @ (y\<^isub>s @ z\<^isub>a) @ z\<^isub>b \<in>
A\<star>"}, which means @{term "y @ z \<in> A\<star>"}. The last step is to set
@@ -1647,7 +1647,7 @@
text {*
\noindent
As we have seen in the previous section, in order to establish
- the second direction of the Myhill-Nerode theorem, it is sufficient to find
+ the second direction of the Myhill-Nerode Theorem, it is sufficient to find
a more refined relation than @{term "\<approx>(lang r)"} for which we can
show that there are only finitely many equivalence classes. So far we
showed this directly by induction on @{text "r"} using tagging-functions.
@@ -1674,7 +1674,7 @@
\noindent
where we apply the left-quotient to a set of languages and then combine the results.
- Clearly we have the following equivalence between the Myhill-Nerode relation
+ Clearly we have the following equivalence between the Myhill-Nerode Relation
(Definition~\ref{myhillneroderel}) and left-quotients
\begin{equation}\label{mhders}
@@ -1764,7 +1764,7 @@
\end{equation}
\noindent
- The importance of this fact in the context of the Myhill-Nerode theorem is that
+ The importance of this fact in the context of the Myhill-Nerode Theorem is that
we can use \eqref{mhders} and \eqref{Dersders} in order to
establish that
@@ -1783,10 +1783,10 @@
\noindent
This means the right-hand side (seen as a relation) refines the Myhill-Nerode
- relation. Consequently, we can use @{text
+ Relation. Consequently, we can use @{text
"\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"} as a
tagging-relation. However, in order to be useful for the second part of the
- Myhill-Nerode theorem, we have to be able to establish that for the
+ Myhill-Nerode Theorem, we have to be able to establish that for the
corresponding language there are only finitely many derivatives---thus
ensuring that there are only finitely many equivalence
classes. Unfortunately, this is not true in general. Sakarovitch gives an
@@ -1978,7 +1978,7 @@
the details.
Let us now return to our proof for the second direction in the Myhill-Nerode
- theorem. The point of the above calculations is to use
+ Theorem. The point of the above calculations is to use
@{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. pders x r)\<^esub>"} as tagging-relation.
@@ -2007,7 +2007,7 @@
which we know is finite by Theorem~\ref{antimirov}. Consequently there
are only finitely many equivalence classes of @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. pders x r)\<^esub>"}.
This relation refines @{term "\<approx>(lang r)"}, and therefore we can again conclude the
- second part of the Myhill-Nerode theorem.
+ second part of the Myhill-Nerode Theorem.
\end{proof}
*}
@@ -2021,7 +2021,7 @@
More interesting in our setting is the closure under complement, because it seems difficult
to construct a regular expression for the complement language by direct
means. However the existence of such a regular expression can now be easily
- proved using both parts of the Myhill-Nerode theorem, since
+ proved using both parts of the Myhill-Nerode Theorem, since
\begin{center}
@{term "s\<^isub>1 \<approx>A s\<^isub>2"} if and only if @{term "s\<^isub>1 \<approx>(-A) s\<^isub>2"}
@@ -2150,7 +2150,7 @@
\noindent
Our proof follows the one given in \cite[Pages 92--95]{Shallit08}, except that we use
Higman's Lemma, which is already proved in the Isabelle/HOL library
- \cite{Berghofer03}.\footnote{Unfortunately, Berghofer's formalisation of Higman's lemma
+ \cite{Berghofer03}.\footnote{Unfortunately, Berghofer's formalisation of Higman's Lemma
is restricted to 2-letter alphabets,
which means also our formalisation of Theorem~\ref{subseqreg} is `tainted' with
this constraint. However our methodology is applicable to any alphabet of finite size.}
@@ -2270,7 +2270,7 @@
must be regular.
\end{proof}
- Finally we like to show that the Myhill-Nerode theorem is also convenient for establishing
+ Finally we like to show that the Myhill-Nerode Theorem is also convenient for establishing
the non-regularity of languages. For this we use the following version of the Continuation
Lemma (see for example~\cite{Rosenberg06}).
@@ -2281,11 +2281,11 @@
\end{lmm}
\noindent
- This lemma can be easily deduced from the Myhill-Nerode theorem and the Pigeonhole
+ This lemma can be easily deduced from the Myhill-Nerode Theorem and the Pigeonhole
Principle: Since @{text A} is regular, there can be only finitely many
equivalence classes. Hence an infinite set must contain
at least two strings that are in the same equivalence class, that is
- they need to be related by the Myhill-Nerode relation.
+ they need to be related by the Myhill-Nerode Relation.
Using this lemma, it is straightforward to establish that the language
\mbox{@{text "A \<equiv> \<Union>\<^isub>n a\<^sup>n @ b\<^sup>n"}} is not regular (@{text "a\<^sup>n"} stands
@@ -2305,8 +2305,10 @@
\end{proof}
\noindent
- To conclude, this lemma and the Continuation Lemma leads to a contradiction assuming @{text A}
- is regular. Therefore the language @{text A} is not regular, as we wanted to show.
+ To conclude the proof on non-regularity of language @{text A}, the
+ Continuation Lemma and the lemma above lead to a contradiction assuming
+ @{text A} is regular. Therefore the language @{text A} is not regular, as we
+ wanted to show.
*}
@@ -2320,7 +2322,7 @@
expressions can conveniently be defined as a datatype in HOL-based theorem
provers. For us it was therefore interesting to find out how far we can push
this point of view. We have established in Isabelle/HOL both directions
- of the Myhill-Nerode theorem.
+ of the Myhill-Nerode Theorem.
%
\begin{thrm}[The Myhill-Nerode Theorem]\mbox{}\\
A language @{text A} is regular if and only if @{thm (rhs) Myhill_Nerode}.
@@ -2330,16 +2332,16 @@
Having formalised this theorem means we pushed our point of view quite
far. Using this theorem we can obviously prove when a language is \emph{not}
regular---by establishing that it has infinitely many equivalence classes
- generated by the Myhill-Nerode relation (this is usually the purpose of the
+ generated by the Myhill-Nerode Relation (this is usually the purpose of the
Pumping Lemma \cite{Kozen97}). We can also use it to establish the standard
textbook results about closure properties of regular languages. Interesting
is the case of closure under complement, because it seems difficult to
construct a regular expression for the complement language by direct
means. However the existence of such a regular expression can be easily
- proved using the Myhill-Nerode theorem.
+ proved using the Myhill-Nerode Theorem.
- Our insistence on regular expressions for proving the Myhill-Nerode theorem
- arose from the limitations of HOL, used in the popular theorem provers HOL4,
+ Our insistence on regular expressions for proving the Myhill-Nerode Theorem
+ arose from the limitations of HOL, which is the logic underlying the popular theorem provers HOL4,
HOLlight and Isabelle/HOL. In order to guarantee consistency,
formalisations in HOL can only extend the logic with definitions that introduce a new concept in
terms of already existing notions. A convenient definition for automata
@@ -2347,7 +2349,7 @@
us to use the standard operation for disjoint union whenever we need to compose two
automata. Unfortunately, we cannot use such a polymorphic definition
in HOL as part of the definition for regularity of a language (a predicate
- over set of strings). Consider for example the following attempt:
+ over sets of strings). Consider for example the following attempt:
\begin{center}
@{text "is_regular A \<equiv> \<exists>M(\<alpha>). is_dfa (M) \<and> \<calL>(M) = A"}
@@ -2355,8 +2357,8 @@
\noindent
In this definifion, the definiens is polymorphic in the type of the automata
- @{text "M"} (indicated by dependency on @{text "\<alpha>"}), but the definiendum
- @{text "is_regular"} is not. Such definitions are excluded in HOL, because
+ @{text "M"} (indicated by dependency on the type-variable @{text "\<alpha>"}), but the definiendum
+ @{text "is_regular"} is not. Such definitions are excluded from HOL, because
they can lead easily to inconsistencies (see \cite{PittsHOL4} for a simple
example). Also HOL does not contain type-quantifiers which would allow us to
get rid of the polymorphism by quantifying over the type-variable @{text
@@ -2370,7 +2372,7 @@
infrastructure comes for free. The definition of regularity in terms of
regular expressions poses no problem at all for HOL. We showed in this
paper that they can be used for establishing the central result in regular
- language theory---the Myhill-Nerode theorem.
+ language theory---the Myhill-Nerode Theorem.
While regular expressions are convenient, they have some limitations. One is
that there seems to be no method of calculating a minimal regular expression
@@ -2388,7 +2390,7 @@
\cite{Brzozowski64}. The close connection can be seen by considering the
equivalence classes as the states of the minimal automaton for the regular
language. However there are some subtle differences. Because our equivalence
- classes (or correspondingly states) arise from the Myhil-Nerode Relation, the most natural
+ classes (or correspondingly states) arise from the Myhill-Nerode Relation, the most natural
choice is to characterise each state with the set of strings starting from
the initial state leading up to that state. Usually, however, the states are
characterised as the strings starting from that state leading to the
@@ -2397,16 +2399,16 @@
state', while Brzozowski has it on the terminal states. This means we also
need to reverse the direction of Arden's Lemma. We have not found anything
in the literature about our way of proving the first direction of the
- Myhill-Nerode theorem, but it appears to be folklore.
+ Myhill-Nerode Theorem, but it appears to be folklore.
We presented two proofs for the second direction of the Myhill-Nerode
- theorem. One direct proof using tagging-functions and another using partial
+ Theorem. One direct proof using tagging-functions and another using partial
derivatives. This part of our work is where our method using regular
expressions shines, because we can completely side-step the standard
argument \cite{Kozen97} where automata need to be composed. However, it is
also the direction where we had to spend most of the `conceptual' time, as
our first proof based on tagging-functions is new for establishing the
- Myhill-Nerode theorem. All standard proofs of this direction proceed by
+ Myhill-Nerode Theorem. All standard proofs of this direction proceed by
arguments over automata.
The indirect proof for the second direction arose from our interest in
@@ -2429,7 +2431,7 @@
algorithm is still executable. Given the existing infrastructure for
executable sets in Isabelle/HOL \cite{Haftmann09}, it should.
- Our formalisation of the Myhill-Nerode theorem consists of 780 lines of
+ Our formalisation of the Myhill-Nerode Theorem consists of 780 lines of
Isabelle/Isar code for the first direction and 460 for the second (the one
based on tagging-functions), plus around 300 lines of standard material
about regular languages. The formalisation of derivatives and partial
@@ -2443,14 +2445,15 @@
use of sets over which the algorithm operates means it is not as easy to
formalise as one might hope. However, it seems sets cannot be avoided since
the `input' of the algorithm consists of equivalence classes and we cannot
- see how to reformulate the theory so that we can use lists. Lists would be
+ see how to reformulate the theory so that we can use lists or matrices. Lists would be
much easier to reason about, since we can define functions over them by
recursion. For sets we have to use set-comprehensions, which is slightly
- unwieldy.
+ unwieldy. Matrices would allow us to use the slick formalisation by
+ Nipkow of the Gauss-Jordan algorithm \cite{Nipkow11}.
While our formalisation might appear large, it should be seen
in the context of the work done by Constable at al \cite{Constable00} who
- formalised the Myhill-Nerode theorem in Nuprl using automata. They write
+ formalised the Myhill-Nerode Theorem in Nuprl using automata. They write
that their four-member team needed something on the magnitude of 18 months
for their formalisation. It is hard to gauge the size of a
formalisation in Nurpl, but from what is shown in the Nuprl Math Library