--- a/Journal/Paper.thy Tue Aug 09 22:14:41 2011 +0000
+++ b/Journal/Paper.thy Tue Aug 09 22:15:11 2011 +0000
@@ -249,7 +249,7 @@
pairs. Using this definition for disjoint union means we do not have a
single type for automata. As a result we will not be able to define a regular
language as one for which there exists an automaton that recognises all its
- strings. This is because we cannot make a definition in HOL that is polymorphic in
+ strings (Definition~\ref{baddef}). This is because we cannot make a definition in HOL that is polymorphic in
the state type and there is no type quantification available in HOL (unlike
in Coq, for example).\footnote{Slind already pointed out this problem in an email
to the HOL4 mailing list on 21st April 2005.}
@@ -305,21 +305,21 @@
larger formalisations of automata theory are carried out in Nuprl
\cite{Constable00} and in Coq \cite{Filliatre97}.
- Also one might consider automata theory as a well-worn stock subject where
- everything is crystal clear. However, paper proofs about automata often
- involve subtle side-conditions which are easily overlooked, but which make
- formal reasoning rather painful. For example Kozen's proof of the
- Myhill-Nerode theorem requires that automata do not have inaccessible
+ Also one might consider automata theory and regular languages as a well-worn
+ stock subject where everything is crystal clear. However, paper proofs about
+ automata often involve subtle side-conditions which are easily overlooked,
+ but which make formal reasoning rather painful. For example Kozen's proof of
+ the Myhill-Nerode theorem requires that automata do not have inaccessible
states \cite{Kozen97}. Another subtle side-condition is completeness of
- automata, that is automata need to have total transition functions and at most one
- `sink' state from which there is no connection to a final state (Brzozowski
- mentions this side-condition in the context of state complexity
- of automata \cite{Brzozowski10}). Such side-conditions mean that if we define a regular
- language as one for which there exists \emph{a} finite automaton that
- recognises all its strings (see Def.~\ref{baddef}), then we need a lemma which
- ensures that another equivalent one can be found satisfying the
- side-condition. Unfortunately, such `little' and `obvious' lemmas make
- a formalisation of automata theory a hair-pulling experience.
+ automata, that is, automata need to have total transition functions and at
+ most one `sink' state from which there is no connection to a final state
+ (Brzozowski mentions this side-condition in the context of state complexity
+ of automata \cite{Brzozowski10}). Such side-conditions mean that if we
+ define a regular language as one for which there exists \emph{a} finite
+ automaton that recognises all its strings (see Def.~\ref{baddef}), then we
+ need a lemma which ensures that another equivalent one can be found
+ satisfying the side-condition. Unfortunately, such `little' and `obvious'
+ lemmas make a formalisation of automata theory a hair-pulling experience.
In this paper, we will not attempt to formalise automata theory in
@@ -361,11 +361,11 @@
partial derivatives \cite{Antimirov95}. Again to our best knowledge, the
tagging-functions have not been used before to establish the Myhill-Nerode
theorem. Derivatives of regular expressions have been used recently quite
- widely in the literature; partial derivatives, in contrast, attracted much
+ widely in the literature; partial derivatives, in contrast, attract much
less attention. However, partial derivatives are more suitable in the
context of the Myhill-Nerode theorem, since it is easier to establish
- formally their finiteness result. We have not found any proof that uses
- either of them in order to prove the Myhill-Nerode theorem.
+ formally their finiteness result. We are not aware of any proof that uses
+ either of them for proving the Myhill-Nerode theorem.
*}
section {* Preliminaries *}
@@ -412,7 +412,10 @@
\noindent
In @{text "(ii)"} we use the notation @{term "length s"} for the length of a
string; this property states that if \mbox{@{term "[] \<notin> A"}} then the lengths of
- the strings in @{term "A \<up> (Suc n)"} must be longer than @{text n}. We omit
+ the strings in @{term "A \<up> (Suc n)"} must be greater than @{text n}.
+ Property @{text "(iv)"} states that a non-empty string in @{term "A\<star>"} can
+ always be split up into a non-empty prefix belonging to @{text "A"} and the
+ rest being in @{term "A\<star>"}. We omit
the proofs for these properties, but invite the reader to consult our
formalisation.\footnote{Available at \url{http://www4.in.tum.de/~urbanc/regexp.html}}
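+
+  To give a flavour of how Property @{text "(iv)"} can be put to work, the
+  following Haskell sketch (purely illustrative and not part of the
+  formalisation; it assumes the language @{text "A"} is given as a finite
+  list of strings) decides membership in @{term "A\<star>"} by repeatedly
+  chopping off a non-empty prefix that belongs to @{text "A"}:
+
+  \begin{verbatim}
+  -- a (finite) language, represented as a list of strings
+  type Lang = [String]
+
+  -- membership in the Kleene star: the empty string is always a member;
+  -- a non-empty string is a member iff some non-empty prefix is in the
+  -- language and the remaining suffix is again in the star
+  inStar :: Lang -> String -> Bool
+  inStar _ [] = True
+  inStar a s  = or [ p `elem` a && inStar a q
+                   | n <- [1 .. length s]
+                   , let (p, q) = splitAt n s ]
+  \end{verbatim}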
@@ -511,7 +514,15 @@
\noindent
holds, whereby @{text "\<calL> ` rs"} stands for the
- image of the set @{text rs} under function @{text "\<calL>"}.
+ image of the set @{text rs} under function @{text "\<calL>"} defined as
+
+ \begin{center}
+ @{term "lang ` rs \<equiv> {lang r | r. r \<in> rs}"}
+ \end{center}
+
+ \noindent
+ In what follows we shall use this convenient short-hand notation for images of sets
+ under other functions as well.
*}
@@ -1147,7 +1158,7 @@
\end{equation}
\noindent
- The relation @{term "\<approx>(lang r)"} partitions the set of all strings into some
+ The relation @{term "\<approx>(lang r)"} partitions the set of all strings, @{term UNIV}, into some
equivalence classes. To show that there are only finitely many of them, it
suffices to show in each induction step that another relation, say @{text
R}, has finitely many equivalence classes and refines @{term "\<approx>(lang r)"}.
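+ As a simple illustration (not taken from the formalisation): for a language
+ consisting of a single one-character string, the Myhill-Nerode relation
+ partitions @{term UNIV} into just three equivalence classes, namely the
+ class containing only the empty string, the class containing only that
+ one-character string, and the class of all remaining strings, since no
+ suffix can extend any of the latter to a string in the language.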
@@ -1272,7 +1283,7 @@
\begin{lmm}\label{refinement}
The relation @{text "\<^raw:$\threesim$>\<^bsub>tag\<^esub>"} refines @{term "\<approx>A"}, provided for
- all strings @{text x}, @{text y} and @{text z} we have \mbox{@{text "x \<^raw:$\threesim$>\<^bsub>tag\<^esub> y"}}
+ all strings @{text x}, @{text y} and @{text z} we have that \mbox{@{text "x \<^raw:$\threesim$>\<^bsub>tag\<^esub> y"}}
and @{term "x @ z \<in> A"} imply @{text "y @ z \<in> A"}.
\end{lmm}
@@ -1280,8 +1291,8 @@
\noindent
We therefore can analyse how the strings @{text "x @ z"} are in the language
@{text A} and then construct an appropriate tagging-function to infer that
- @{term "y @ z"} are also in @{text A}. For this we sill need the notion of
- the set of all possible \emph{partitions} of a string
+ @{term "y @ z"} are also in @{text A}. For this we will use the notion of
+ the set of all possible \emph{partitions} of a string:
\begin{equation}
@{thm Partitions_def}
@@ -1290,7 +1301,7 @@
\noindent
If we know that @{text "(x\<^isub>p, x\<^isub>s) \<in> Partitions x"}, we will
refer to @{text "x\<^isub>p"} as the \emph{prefix} of the string @{text x},
- respectively to @{text "x\<^isub>s"} as the \emph{suffix}.
+ and to @{text "x\<^isub>s"} as the \emph{suffix}.
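+
+ For example, assuming the trivial splits with an empty prefix or an empty
+ suffix are included, a two-character string @{text "[c\<^isub>1, c\<^isub>2]"}
+ has exactly three partitions:
+
+ \begin{center}
+ @{text "([], [c\<^isub>1, c\<^isub>2])"}\hspace{6mm}
+ @{text "([c\<^isub>1], [c\<^isub>2])"}\hspace{6mm}
+ @{text "([c\<^isub>1, c\<^isub>2], [])"}
+ \end{center}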
Now assuming @{term "x @ z \<in> A \<cdot> B"} there are only two possible ways of how to `split'
@@ -1531,12 +1542,12 @@
\end{center}
\noindent
- From this we know there exist partitions @{text "y\<^isub>p"} and @{text
+ From this we know there exists a partition @{text "y\<^isub>p"} and @{text
"y\<^isub>s"} with @{term "y\<^isub>p \<in> A\<star>"} and also @{term "x\<^isub>s \<approx>A
y\<^isub>s"}. Unfolding the Myhill-Nerode relation we know @{term
"y\<^isub>s @ z\<^isub>a \<in> A"}. We also know that @{term "z\<^isub>b \<in> A\<star>"}.
Therefore @{term "y\<^isub>p @ (y\<^isub>s @ z\<^isub>a) @ z\<^isub>b \<in>
- A\<star>"}, which means @{term "y @ z \<in> A\<star>"}. As the last step we have to set
+ A\<star>"}, which means @{term "y @ z \<in> A\<star>"}. The last step is to set
@{text "A"} to @{term "lang r"} and thus complete the proof.
\end{proof}
*}
@@ -1550,7 +1561,7 @@
a more refined relation than @{term "\<approx>(lang r)"} for which we can
show that there are only finitely many equivalence classes. So far we
showed this by induction on @{text "r"}. However, there is also
- an indirect method to come up with such a refined relation based on
+ an indirect method to come up with such a refined relation by using
derivatives of regular expressions \cite{Brzozowski64}.
Assume the following two definitions for a \emph{left-quotient} of a language,
@@ -1565,14 +1576,14 @@
\end{center}
\noindent
- In order to aid readability, we shall also make use of the following abbreviation:
+ In order to aid readability, we shall also make use of the following abbreviation
\begin{center}
- @{abbrev "Derss s A"}
+ @{abbrev "Derss s As"}
\end{center}
-
\noindent
+ where we apply the left-quotient to a set of languages and then combine the results.
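+
+ For illustration, the two left-quotient operations and the combined
+ operation on a set of languages can be sketched in Haskell as follows
+ (again only a sketch over finite languages, not the formalised
+ definitions):
+
+ \begin{verbatim}
+  import Data.List  (stripPrefix)
+  import Data.Maybe (mapMaybe)
+
+  type Lang = [String]
+
+  -- left-quotient w.r.t. a single character:  Der c A = {s | c:s in A}
+  der :: Char -> Lang -> Lang
+  der c a = [ s | (c' : s) <- a, c' == c ]
+
+  -- left-quotient w.r.t. a string:  Ders s A = {s' | s @ s' in A}
+  ders :: String -> Lang -> Lang
+  ders s a = mapMaybe (stripPrefix s) a
+
+  -- left-quotient of every language in a set, results unioned up
+  derss :: String -> [Lang] -> Lang
+  derss s as = concatMap (ders s) as
+ \end{verbatim}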
Clearly we have the following relation between the Myhill-Nerode relation
(Def.~\ref{myhillneroderel}) and left-quotients
@@ -1581,7 +1592,7 @@
\end{equation}
\noindent
- It is straightforward to establish the following properties for left-quotients:
+ It is also straightforward to establish the following properties for left-quotients:
\begin{equation}
\mbox{\begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{2mm}}l}
@@ -1600,8 +1611,8 @@
\noindent
where @{text "\<Delta>"} is a function that tests whether the empty string
- is in the language and returns @{term "{[]}"} or @{term "{}"}, respectively.
- The only interesting case above is the last one where we use Prop.~\ref{langprops}
+ is in the language and returns @{term "{[]}"} or @{term "{}"}, accordingly.
+ The only interesting case above is the last one where we use Prop.~\ref{langprops}@{text "(i)"}
in order to infer that @{term "Der c (A\<star>) = Der c (A \<cdot> A\<star>)"}. We can
then complete the proof by observing that @{term "Delta A \<cdot> Der c (A\<star>) \<subseteq> (Der c A) \<cdot> A\<star>"}.
@@ -1629,8 +1640,8 @@
\end{center}
\noindent
- The last two clauses extend derivatives for characters to strings (list of
- characters). The list-cons operator is written \mbox{@{text "_ :: _"}}. The
+ The last two clauses extend derivatives from characters to strings---i.e.~lists of
+ characters. The list-cons operator is written \mbox{@{text "_ :: _"}}. The
function @{term "nullable r"} needed in the @{const Times}-case tests
whether a regular expression can recognise the empty string:
@@ -1652,7 +1663,7 @@
\end{center}
\noindent
- By induction on the regular expression @{text r}, respectively the string @{text s},
+ By induction on the regular expression @{text r}, respectively on the string @{text s},
one can easily show that left-quotients and derivatives relate as follows
\cite{Sakarovitch09}:
@@ -1667,7 +1678,7 @@
The importance in the context of the Myhill-Nerode theorem is that
we can use \eqref{mhders} and \eqref{Dersders} in order to
establish that @{term "x \<approx>(lang r) y"} is equivalent to
- @{term "lang (ders x r) = lang (ders y r)"}. From this we obtain
+ @{term "lang (ders x r) = lang (ders y r)"}. Hence
\begin{equation}
@{term "x \<approx>(lang r) y"}\hspace{4mm}\mbox{provided}\hspace{4mm}@{term "ders x r = ders y r"}
@@ -1677,10 +1688,10 @@
\noindent
which means the right-hand side (seen as relation) refines the
Myhill-Nerode relation. Consequently, we can use
- @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"} as a potential tagging-relation
+ @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"} as a tagging-relation
for the regular expression @{text r}. However, in
- order to be useful in the Myhill-Nerode theorem, we also have to show that
- for the corresponding language there are only finitely many derivatives---ensuring
+ order to be useful for the second part of the Myhill-Nerode theorem, we also have to show that
+ for the corresponding language there are only finitely many derivatives---thus ensuring
that there are only finitely many equivalence classes. Unfortunately, this
is not true in general. Sakarovitch gives an example where a regular
expression has infinitely many derivatives w.r.t.~a language
@@ -1700,8 +1711,8 @@
\noindent
Carrying this idea through, we must not consider the set of all derivatives,
- but the ones modulo @{text "ACI"}. In principle, this can be formally
- defined, but it is very painful in a theorem prover (since there is no
+ but the ones modulo @{text "ACI"}. In principle, this can be done formally,
+ but it is very painful in a theorem prover (since there is no
direct characterisation of the set of dissimilar derivatives).
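+
+ For concreteness, the following Haskell sketch shows the syntactic
+ derivative together with the @{text "nullable"} test (the datatype and the
+ function names are only illustrative and simplified compared to the
+ formalisation). The set of terms obtained by iterating @{text "deriv"} is,
+ in general, only finite when taken modulo the @{text "ACI"}-identities:
+
+ \begin{verbatim}
+  data Rexp = Zero | One | Atom Char
+            | Plus Rexp Rexp | Times Rexp Rexp | Star Rexp
+            deriving (Eq, Ord, Show)
+
+  nullable :: Rexp -> Bool
+  nullable Zero        = False
+  nullable One         = True
+  nullable (Atom _)    = False
+  nullable (Plus r s)  = nullable r || nullable s
+  nullable (Times r s) = nullable r && nullable s
+  nullable (Star _)    = True
+
+  deriv :: Char -> Rexp -> Rexp
+  deriv _ Zero        = Zero
+  deriv _ One         = Zero
+  deriv c (Atom d)    = if c == d then One else Zero
+  deriv c (Plus r s)  = Plus (deriv c r) (deriv c s)
+  deriv c (Times r s)
+    | nullable r      = Plus (Times (deriv c r) s) (deriv c s)
+    | otherwise       = Times (deriv c r) s
+  deriv c (Star r)    = Times (deriv c r) (Star r)
+
+  -- extension to strings, consuming the characters from left to right
+  derivs :: String -> Rexp -> Rexp
+  derivs s r = foldl (flip deriv) r s
+ \end{verbatim}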
@@ -1742,7 +1753,7 @@
expressions. Note that in the last clause we first build the set of partial
derivatives w.r.t~the character @{text c}, then build the image of this set under the
function @{term "pders s"} and finally `union up' all resulting sets. It will be
- convenient to introduce the following abbreviation
+ convenient to introduce the following abbreviation for this
\begin{center}
@{abbrev "pderss s A"}
@@ -1763,7 +1774,7 @@
equal sets. Antimirov \cite{Antimirov95} showed a similar result to
\eqref{Dersders} for partial derivatives:
- \begin{equation}
+ \begin{equation}\label{Derspders}
\mbox{\begin{tabular}{lc}
@{text "(i)"} & @{thm Der_pder}\\
@{text "(ii)"} & @{thm Ders_pders}
@@ -1787,7 +1798,7 @@
\begin{tabular}{r@ {\hspace{1.5mm}}c@ {\hspace{1.5mm}}ll}
@{term "Ders (c # s) (lang r)"}
& @{text "="} & @{term "Ders s (Der c (lang r))"} & by def.\\
- & @{text "="} & @{term "Ders s (\<Union> lang ` (pder c r))"} & by @{text "(i)"}\\
+ & @{text "="} & @{term "Ders s (\<Union> lang ` (pder c r))"} & by @{text "("}\ref{Derspders}@{text ".i)"}\\
& @{text "="} & @{term "\<Union> (Ders s) ` (lang ` (pder c r))"} & by def.~of @{text "Ders"}\\
& @{text "="} & @{term "\<Union> lang ` (\<Union> pders s ` (pder c r))"} & by IH\\
& @{text "="} & @{term "\<Union> lang ` (pders (c # s) r)"} & by def.\\
@@ -1795,12 +1806,29 @@
\end{center}
\noindent
- In order to apply the induction hypothesis in the fourth step, we need the generalisation
- over all regular expressions @{text r}. The case for the empty string is routine and omitted.
+ Note that in order to apply the induction hypothesis in the fourth equation, we
+ need the generalisation over all regular expressions @{text r}. The case for
+ the empty string is routine and omitted.
\end{proof}
- Antimirov also proved that for every language and regular expression there are only finitely
- many partial derivatives.
+ \noindent
+ Taking \eqref{Dersders} and \eqref{Derspders} together gives the relationship
+ between languages of derivatives and partial derivatives
+
+ \begin{equation}
+ \mbox{\begin{tabular}{lc}
+ @{text "(i)"} & @{thm der_pder[symmetric]}\\
+ @{text "(ii)"} & @{thm ders_pders[symmetric]}
+ \end{tabular}}
+ \end{equation}
+
+ \noindent
+ These two properties confirm the observation made earlier
+ that by using sets, partial derivatives have the @{text "ACI"}-identities
+ of derivatives already built in.
+
+ Antimirov also proved that for every language and regular expression
+ there are only finitely many partial derivatives.
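+
+ For completeness, partial derivatives and their extension to strings can be
+ sketched in the same illustrative Haskell style, reusing the simplified
+ @{text "Rexp"} datatype and the @{text "nullable"} function from the
+ earlier sketch (again this is not the formalised definition). Since the
+ results are collected in sets, the @{text "ACI"}-identities are built in,
+ and Antimirov's result guarantees that only finitely many distinct terms
+ can ever be produced:
+
+ \begin{verbatim}
+  import Data.Set (Set)
+  import qualified Data.Set as Set
+
+  -- partial derivative w.r.t. a single character
+  pderiv :: Char -> Rexp -> Set Rexp
+  pderiv _ Zero        = Set.empty
+  pderiv _ One         = Set.empty
+  pderiv c (Atom d)    = if c == d then Set.singleton One else Set.empty
+  pderiv c (Plus r s)  = pderiv c r `Set.union` pderiv c s
+  pderiv c (Times r s)
+    | nullable r       = Set.map (\r' -> Times r' s) (pderiv c r)
+                           `Set.union` pderiv c s
+    | otherwise        = Set.map (\r' -> Times r' s) (pderiv c r)
+  pderiv c (Star r)    = Set.map (\r' -> Times r' (Star r)) (pderiv c r)
+
+  -- extension to strings: take the partial derivatives w.r.t. the first
+  -- character, apply the extension to each result and union everything up
+  pderivs :: String -> Rexp -> Set Rexp
+  pderivs []      r = Set.singleton r
+  pderivs (c : s) r = Set.unions [ pderivs s r' | r' <- Set.toList (pderiv c r) ]
+ \end{verbatim}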
*}
section {* Closure Properties *}