--- a/Journal/Paper.thy Mon Aug 15 22:36:26 2011 +0000
+++ b/Journal/Paper.thy Tue Aug 16 10:21:14 2011 +0000
@@ -74,9 +74,9 @@
Der ("Der _ _" [1000, 1000] 100) and
ONE ("ONE" 999) and
ZERO ("ZERO" 999) and
- pders_lang ("pderl") and
+ pders_lang ("pdersl") and
UNIV1 ("UNIV\<^isup>+") and
- Ders_lang ("Derl")
+ Ders_lang ("Dersl")
lemma meta_eq_app:
shows "f \<equiv> \<lambda>x. g x \<Longrightarrow> f x \<equiv> g x"
@@ -159,7 +159,8 @@
The typical approach to regular languages is to
introduce finite automata and then define everything in terms of them
- \cite{Kozen97}. For example, a regular language is normally defined as:
+ \cite{HopcroftUllman69,Kozen97}. For example, a regular language is
+ normally defined as:
\begin{dfntn}\label{baddef}
A language @{text A} is \emph{regular}, provided there is a
@@ -251,22 +252,21 @@
\noindent
changes the type---the disjoint union is not a set, but a set of
pairs. Using this definition for disjoint union means we do not have a
- single type for automata. As a result we will not be able to define a
+ single type for the states of automata. As a result we will not be able to define a
regular language as one for which there exists an automaton that recognises
all its strings (Definition~\ref{baddef}). This is because we cannot make a
- definition in HOL that is polymorphic in the state type and there is no type
+ definition in HOL that is only polymorphic in the state type and there is no type
quantification available in HOL (unlike in Coq, for example).\footnote{Slind
already pointed out this problem in an email to the HOL4 mailing list on
21st April 2005.}
-
- An alternative, which provides us with a single type for automata, is to give every
- state node an identity, for example a natural
- number, and then be careful to rename these identities apart whenever
- connecting two automata. This results in clunky proofs
- establishing that properties are invariant under renaming. Similarly,
- connecting two automata represented as matrices results in very adhoc
- constructions, which are not pleasant to reason about.
+ An alternative, which provides us with a single type for states of automata,
+ is to give every state node an identity, for example a natural number, and
+ then be careful to rename these identities apart whenever connecting two
+ automata. This results in clunky proofs establishing that properties are
+ invariant under renaming. Similarly, connecting two automata represented as
+ matrices results in very ad hoc constructions, which are not pleasant to
+ reason about.
Functions are much better supported in Isabelle/HOL, but they still lead to similar
problems as with graphs. Composing, for example, two non-deterministic automata in parallel
@@ -434,7 +434,7 @@
Central to our proof will be the solution of equational systems
involving equivalence classes of languages. For this we will use Arden's Lemma
- (see \cite[Page 100]{Sakarovitch09}),
+ (see for example \cite[Page 100]{Sakarovitch09}),
which solves equations of the form @{term "X = A \<cdot> X \<union> B"} provided
@{term "[] \<notin> A"}. However we will need the following `reverse'
version of Arden's Lemma (`reverse' in the sense of changing the order of @{term "A \<cdot> X"} to
@@ -448,7 +448,7 @@
\begin{proof}
For the right-to-left direction we assume @{thm (rhs) arden} and show
- that @{thm (lhs) arden} holds. From Prop.~\ref{langprops}@{text "(i)"}
+ that @{thm (lhs) arden} holds. From Property~\ref{langprops}@{text "(i)"}
we have @{term "A\<star> = A \<cdot> A\<star> \<union> {[]}"},
which is equal to @{term "A\<star> = A\<star> \<cdot> A \<union> {[]}"}. Adding @{text B} to both
sides gives @{term "B \<cdot> A\<star> = B \<cdot> (A\<star> \<cdot> A \<union> {[]})"}, whose right-hand side
@@ -467,12 +467,12 @@
of @{text "\<star>"}.
For the inclusion in the other direction we assume a string @{text s}
with length @{text k} is an element in @{text X}. Since @{thm (prem 1) arden}
- we know by Prop.~\ref{langprops}@{text "(ii)"} that
+ we know by Property~\ref{langprops}@{text "(ii)"} that
@{term "s \<notin> X \<cdot> (A \<up> Suc k)"} since its length is only @{text k}
(the strings in @{term "X \<cdot> (A \<up> Suc k)"} are all longer).
From @{text "(*)"} it follows then that
@{term s} must be an element in @{term "(\<Union>m\<in>{0..k}. B \<cdot> (A \<up> m))"}. This in turn
- implies that @{term s} is in @{term "(\<Union>n. B \<cdot> (A \<up> n))"}. Using Prop.~\ref{langprops}@{text "(iii)"}
+ implies that @{term s} is in @{term "(\<Union>n. B \<cdot> (A \<up> n))"}. Using Property~\ref{langprops}@{text "(iii)"}
this is equal to @{term "B \<cdot> A\<star>"}, as we needed to show.
\end{proof}
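+
+ As a small worked instance (for illustration only, with made-up languages):
+ taking $A = \{ab\}$ and $B = \{c\}$, the equation
+ $X = X \cdot \{ab\} \cup \{c\}$ is solved by
+ $X = \{c\} \cdot \{ab\}^\star = \{c,\, cab,\, cabab,\, \ldots\}$, and the
+ side-condition that the empty string is not in $A$ is what guarantees that
+ this is the only solution.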
@@ -601,7 +601,7 @@
that matches every string in @{text A}.
- Our proof of Thm.~\ref{myhillnerodeone} relies on a method that can calculate a
+ Our proof of Theorem~\ref{myhillnerodeone} relies on a method that can calculate a
regular expression for \emph{every} equivalence class, not just the ones
in @{term "finals A"}. We
first define the notion of \emph{one-character-transition} between
@@ -733,9 +733,9 @@
\end{lmm}
\noindent
- Our proof of Thm.~\ref{myhillnerodeone} will proceed by transforming the
+ Our proof of Theorem~\ref{myhillnerodeone} will proceed by transforming the
initial equational system into one in \emph{solved form} maintaining the invariant
- in Lem.~\ref{inv}. From the solved form we will be able to read
+ in Lemma~\ref{inv}. From the solved form we will be able to read
off the regular expressions.
In order to transform an equational system into solved form, we have two
@@ -891,16 +891,15 @@
\end{center}
\noindent
- We are not concerned here with the definition of this operator
- (see Berghofer and Nipkow \cite{BerghoferNipkow00}), but note that we eliminate
- in each @{const Iter}-step a single equation, and therefore
- have a well-founded termination order by taking the cardinality
- of the equational system @{text ES}. This enables us to prove
- properties about our definition of @{const Solve} when we `call' it with
- the equivalence class @{text X} and the initial equational system
- @{term "Init (UNIV // \<approx>A)"} from
+ We are not concerned here with the definition of this operator (see
+ Berghofer and Nipkow \cite{BerghoferNipkow00} for example), but note that we
+ eliminate in each @{const Iter}-step a single equation, and therefore have a
+ well-founded termination order by taking the cardinality of the equational
+ system @{text ES}. This enables us to prove properties about our definition
+ of @{const Solve} when we `call' it with the equivalence class @{text X} and
+ the initial equational system @{term "Init (UNIV // \<approx>A)"} from
\eqref{initcs} using the principle:
- %
+
\begin{equation}\label{whileprinciple}
\mbox{\begin{tabular}{l}
@{term "invariant (Init (UNIV // \<approx>A))"} \\
@@ -964,7 +963,7 @@
\begin{proof}
Finiteness is given by the assumption and the way we set up the
- initial equational system. Soundness is proved in Lem.~\ref{inv}. Distinctness
+ initial equational system. Soundness is proved in Lemma~\ref{inv}. Distinctness
follows from the fact that the equivalence classes are disjoint. The @{text ardenable}
property also follows from the setup of the initial equational system, as does
validity.
@@ -993,7 +992,7 @@
\noindent
Finiteness is straightforward, as the @{const Subst} and @{const Arden} operations
keep the equational system finite. These operations also preserve soundness
- and distinctness (we proved soundness for @{const Arden} in Lem.~\ref{ardenable}).
+ and distinctness (we proved soundness for @{const Arden} in Lemma~\ref{ardenable}).
The property @{text ardenable} is clearly preserved because the append-operation
cannot make a regular expression match the empty string. Validity is
given because @{const Arden} removes an equivalence class from @{text yrhs}
@@ -1032,16 +1031,16 @@
\begin{proof}
In order to prove this lemma using \eqref{whileprinciple}, we have to use a slightly
- stronger invariant since Lem.~\ref{iterone} and \ref{itertwo} have the precondition
+ stronger invariant since Lemmas~\ref{iterone} and \ref{itertwo} have the precondition
that @{term "(X, rhs) \<in> ES"} for some @{text rhs}. This precondition is needed
in order to choose in the @{const Iter}-step an equation that is not \mbox{@{term "X = rhs"}}.
Therefore our invariant cannot be just @{term "invariant ES"}, but must be
@{term "invariant ES \<and> (\<exists>rhs. (X, rhs) \<in> ES)"}. By assumption
- @{thm (prem 2) Solve} and Lem.~\ref{invzero}, the more general invariant holds for
+ @{thm (prem 2) Solve} and Lemma~\ref{invzero}, the more general invariant holds for
the initial equational system. This is premise 1 of~\eqref{whileprinciple}.
- Premise 2 is given by Lem.~\ref{iterone} and the fact that @{const Iter} might
+ Premise 2 is given by Lemma~\ref{iterone} and the fact that @{const Iter} might
modify the @{text rhs} in the equation @{term "X = rhs"}, but does not remove it.
- Premise 3 of~\eqref{whileprinciple} is by Lem.~\ref{itertwo}. Now in premise 4
+ Premise 3 of~\eqref{whileprinciple} is by Lemma~\ref{itertwo}. Now in premise 4
we would like to show that there exists a @{text rhs} such that @{term "ES = {(X, rhs)}"}
and that @{text "invariant {(X, rhs)}"} holds, provided the condition @{text "Cond"}
does not hold. By the stronger invariant we know there exists such a @{text "rhs"}
@@ -1066,7 +1065,7 @@
and that the invariant holds for this equation. That means we
know @{text "X = \<Union>\<calL> ` rhs"}. We further know that
this is equal to \mbox{@{text "\<Union>\<calL> ` (Arden X rhs)"}} using the properties of the
- invariant and Lem.~\ref{ardenable}. Using the validity property for the equation @{text "X = rhs"},
+ invariant and Lemma~\ref{ardenable}. Using the validity property for the equation @{text "X = rhs"},
we can infer that @{term "rhss rhs \<subseteq> {X}"} and because the @{text Arden} operation
removes that @{text X} from @{text rhs}, that @{term "rhss (Arden X rhs) = {}"}.
This means the right-hand side @{term "Arden X rhs"} can only consist of terms of the form @{term "Lam r"}.
@@ -1075,11 +1074,11 @@
\end{proof}
\noindent
- Lem.~\ref{every_eqcl_has_reg} allows us to finally give a proof for the first direction
+ Lemma~\ref{every_eqcl_has_reg} allows us to finally give a proof for the first direction
of the Myhill-Nerode theorem.
- \begin{proof}[of Thm.~\ref{myhillnerodeone}]
- By Lem.~\ref{every_eqcl_has_reg} we know that there exists a regular expression for
+ \begin{proof}[Proof of Theorem~\ref{myhillnerodeone}]
+ By Lemma~\ref{every_eqcl_has_reg} we know that there exists a regular expression for
every equivalence class in @{term "UNIV // \<approx>A"}. Since @{text "finals A"} is
a subset of @{term "UNIV // \<approx>A"}, we also know that for every equivalence class
in @{term "finals A"} there exists a regular expression. Moreover by assumption
@@ -1175,11 +1174,10 @@
\end{dfntn}
\noindent
- For constructing @{text R} will
- rely on some \emph{tagging-functions} defined over strings. Given the
- inductive hypothesis, it will be easy to prove that the \emph{range} of
- these tagging-functions is finite. The range of a function @{text f} is
- defined as
+ For constructing @{text R}, we will rely on some \emph{tagging-functions}
+ defined over strings. Given the inductive hypothesis, it will be easy to
+ prove that the \emph{range} of these tagging-functions is finite. The range
+ of a function @{text f} is defined as
\begin{center}
@{text "range f \<equiv> f ` UNIV"}
@@ -1225,10 +1223,11 @@
finite by assumption. Now @{term "f (UNIV // =tag=)"} is a subset of @{text "range f"},
and so also finite. Injectivity amounts to showing that @{text "X = Y"} under the
assumptions that @{text "X, Y \<in> "}~@{term "UNIV // =tag="} and @{text "f X = f Y"}.
- From the assumptions we can obtain @{text "x \<in> X"} and @{text "y \<in> Y"} with
+ From the assumptions we obtain \mbox{@{text "x \<in> X"}} and @{text "y \<in> Y"} with
@{text "tag x = tag y"}. Since @{text x} and @{text y} are tag-related, this in
turn means that the equivalence classes @{text X}
- and @{text Y} must be equal.
+ and @{text Y} must be equal. Therefore \eqref{finiteimageD} allows us to conclude
+ with @{thm (concl) finite_eq_tag_rel}.
\end{proof}
\begin{lmm}\label{fintwo}
@@ -1256,7 +1255,7 @@
\end{proof}
\noindent
- Chaining Lem.~\ref{finone} and \ref{fintwo} together, means in order to show
+ Chaining Lemmas~\ref{finone} and \ref{fintwo} together means that in order to show
that @{term "UNIV // \<approx>(lang r)"} is finite, we have to construct a tagging-function whose
range can be shown to be finite and whose tagging-relation refines @{term "\<approx>(lang r)"}.
Let us attempt the @{const PLUS}-case first. We take as tagging-function
@@ -1310,7 +1309,7 @@
and respectively to @{text "x\<^isub>s"} as the \emph{suffix}.
- Now assuming @{term "x @ z \<in> A \<cdot> B"} there are only two possible ways of how to `split'
+ Now assuming @{term "x @ z \<in> A \<cdot> B"}, there are only two possible ways to `split'
this string to be in @{term "A \<cdot> B"}:
%
\begin{center}
@@ -1391,7 +1390,7 @@
\end{center}
\noindent
- We have to make the assumption for all suffixes @{text "x\<^isub>s"}, since we do
+ Note that we have to make the assumption for all suffixes @{text "x\<^isub>s"}, since we do
not know anything about how the string @{term x} is partitioned.
With this definition in place, let us prove the @{const "Times"}-case.
@@ -1502,7 +1501,7 @@
There are potentially many such prefixes, but there can only be finitely many of them (the
string @{text x} is finite). Let us therefore choose the longest one and call it
@{text "x\<^bsub>pmax\<^esub>"}. Now for the rest of the string @{text "x\<^isub>s @ z"} we
- know it is in @{term "A\<star>"} and cannot be the empty string. By Prop.~\ref{langprops}@{text "(iv)"},
+ know it is in @{term "A\<star>"} and cannot be the empty string. By Property~\ref{langprops}@{text "(iv)"},
we can separate
this string into two parts, say @{text "a"} and @{text "b"}, such that @{text "a \<noteq> []"}, @{text "a \<in> A"}
and @{term "b \<in> A\<star>"}. Now @{text a} must be strictly longer than @{text "x\<^isub>s"},
@@ -1616,16 +1615,19 @@
\end{equation}
\noindent
- where @{text "\<Delta>"} in the fifth line is a function that tests whether the empty string
- is in the language and returns @{term "{[]}"} or @{term "{}"}, accordingly.
- The only interesting case is the last one where we use Prop.~\ref{langprops}@{text "(i)"}
- in order to infer that @{term "Der c (A\<star>) = Der c (A \<cdot> A\<star>)"}. We can
- then complete the proof by noting that @{term "Delta A \<cdot> Der c (A\<star>) \<subseteq> (Der c A) \<cdot> A\<star>"}.
-
- Brzozowski observed that the left-quotients for languages of regular
- expressions can be calculated directly using the notion of \emph{derivatives
- of a regular expression} \cite{Brzozowski64}. We define this notion in
- Isabelle/HOL as follows:
+ where @{text "\<Delta>"} in the fifth line is a function that tests whether the
+ empty string is in the language and returns @{term "{[]}"} or @{term "{}"},
+ accordingly. In the last equation we use the list-cons operator written
+ \mbox{@{text "_ :: _"}}. The only interesting case is the @{text "A\<star>"}-case
+ where we use Property~\ref{langprops}@{text "(i)"} in order to infer that
+ @{term "Der c (A\<star>) = Der c (A \<cdot> A\<star>)"}. We can then complete the proof by
+ using the fifth equation and noting that @{term "Delta A \<cdot> Der c (A\<star>) \<subseteq> (Der
+ c A) \<cdot> A\<star>"}.
+
+ Brzozowski observed that the left-quotients for languages of
+ regular expressions can be calculated directly using the notion of
+ \emph{derivatives of a regular expression} \cite{Brzozowski64}. We define
+ this notion in Isabelle/HOL as follows:
\begin{center}
\begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}}
@@ -1646,8 +1648,7 @@
\end{center}
\noindent
- The last two clauses extend derivatives from characters to strings---i.e.~list of
- characters. The list-cons operator is written \mbox{@{text "_ :: _"}}. The
+ The last two clauses extend derivatives from characters to strings. The
boolean function @{term "nullable r"} needed in the @{const Times}-case tests
whether a regular expression can recognise the empty string. It can be defined as
follows.
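+
+ As an informal illustration (not taken from the formalisation), the
+ derivative and the nullability test can be rendered in a functional language
+ such as Haskell; the datatype Rex and all names below are made up for this
+ sketch only:
+
+   data Rex = Zero | One | Atom Char
+            | Plus Rex Rex | Times Rex Rex | Star Rex
+     deriving (Eq, Show)
+
+   nullable :: Rex -> Bool
+   nullable Zero        = False
+   nullable One         = True
+   nullable (Atom _)    = False
+   nullable (Plus r s)  = nullable r || nullable s
+   nullable (Times r s) = nullable r && nullable s
+   nullable (Star _)    = True
+
+   der :: Char -> Rex -> Rex
+   der _ Zero        = Zero
+   der _ One         = Zero
+   der c (Atom a)    = if a == c then One else Zero
+   der c (Plus r s)  = Plus (der c r) (der c s)
+   der c (Times r s)
+     | nullable r    = Plus (Times (der c r) s) (der c s)
+     | otherwise     = Times (der c r) s
+   der c (Star r)    = Times (der c r) (Star r)
+
+   -- extension to strings and the induced matcher
+   ders :: String -> Rex -> Rex
+   ders []     r = r
+   ders (c:cs) r = ders cs (der c r)
+
+   matches :: Rex -> String -> Bool
+   matches r s = nullable (ders s r)
+
+ In this sketch, matches r s holds exactly when s is in the language of r,
+ reflecting the correspondence between derivatives and left-quotients
+ discussed next.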
@@ -1672,7 +1673,7 @@
\noindent
By induction on the regular expression @{text r}, respectively on the string @{text s},
one can easily show that left-quotients and derivatives of regular expressions
- relate as follows (for example \cite{Sakarovitch09}):
+ relate as follows (see for example~\cite{Sakarovitch09}):
\begin{equation}\label{Dersders}
\mbox{\begin{tabular}{c}
@@ -1703,7 +1704,7 @@
classes. Unfortunately, this is not true in general. Sakarovitch gives an
example where a regular expression has infinitely many derivatives
w.r.t.~the language \mbox{@{term "({a} \<cdot> {b})\<star> \<union> ({a} \<cdot> {b})\<star> \<cdot> {a}"}}
- \cite[Page~141]{Sakarovitch09}.
+ (see \cite[Page~141]{Sakarovitch09}).
What Brzozowski \cite{Brzozowski64} established is that for every language there
@@ -1783,7 +1784,7 @@
taking the partial derivatives of the
regular expressions in \eqref{ACI} gives us in each case
equal sets. Antimirov \cite{Antimirov95} showed a similar result to
- \eqref{Dersders} for partial derivatives:
+ \eqref{Dersders} for partial derivatives, namely
\begin{equation}\label{Derspders}
\mbox{\begin{tabular}{lc}
@@ -1855,7 +1856,7 @@
Antimirov's argument first shows this theorem for the language @{term UNIV1},
which is the set of all non-empty strings. For this he proves:
- \begin{equation}
+ \begin{equation}\label{pdersunivone}
\mbox{\begin{tabular}{l}
@{thm pders_lang_Zero}\\
@{thm pders_lang_One}\\
@@ -1883,10 +1884,11 @@
\noindent
and for all languages @{text "A"}, @{thm pders_lang_subset[where B="UNIV",
simplified]} holds. Since we follow Antimirov's proof quite closely in our
- formalisation, we omit the details.
+ formalisation (only the last two cases of \eqref{pdersunivone} involve some
+ non-routine induction argument), we omit the details.
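+
+ For a concrete picture, Antimirov's partial derivatives can be sketched
+ informally in the same made-up Haskell notation as before, reusing Rex and
+ nullable from the earlier sketch and modelling sets of regular expressions
+ as duplicate-free lists:
+
+   import Data.List (nub)
+
+   pder :: Char -> Rex -> [Rex]
+   pder _ Zero        = []
+   pder _ One         = []
+   pder c (Atom a)    = if a == c then [One] else []
+   pder c (Plus r s)  = nub (pder c r ++ pder c s)
+   pder c (Times r s) =
+     let left = [Times r' s | r' <- pder c r]
+     in if nullable r then nub (left ++ pder c s) else left
+   pder c (Star r)    = [Times r' (Star r) | r' <- pder c r]
+
+   -- extension to strings, analogous to the pders operation in the text
+   pders :: String -> Rex -> [Rex]
+   pders []     r = [r]
+   pders (c:cs) r = nub (concatMap (pders cs) (pder c r))
+
+ In contrast to the derivative der, each clause returns a finite set of
+ regular expressions rather than a single one, which is what absorbs the
+ ACI-equivalences mentioned above.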
- Let us return to our proof of the second direction in the Myhill-Nerode
+ Let us now return to our proof of the second direction of the Myhill-Nerode
theorem. The point of the above calculations is to use
@{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"} as tagging-relation.
@@ -1915,7 +1917,8 @@
Now the range of @{term "\<lambda>x. pders x r"} is a subset of @{term "Pow (pders_lang UNIV r)"},
which we know is finite by Theorem~\ref{antimirov}. This means there
are only finitely many equivalence classes of @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"},
- and we can again conclude the second part of the Myhill-Nerode theorem.
+ which refines @{term "\<approx>(lang r)"}, and consequently we can again conclude the
+ second part of the Myhill-Nerode theorem.
\end{proof}
*}
@@ -1929,7 +1932,7 @@
More interesting is the closure under complement, because it seems difficult
to construct a regular expression for the complement language by direct
means. However the existence of such a regular expression can now be easily
- proved using the Myhill-Nerode theorem since
+ proved using both parts of the Myhill-Nerode theorem, since
\begin{center}
@{term "s\<^isub>1 \<approx>A s\<^isub>2"} if and only if @{term "s\<^isub>1 \<approx>(-A) s\<^isub>2"}
@@ -1939,7 +1942,7 @@
holds for any strings @{text "s\<^isub>1"} and @{text
"s\<^isub>2"}. Therefore @{text A} and the complement language @{term "-A"}
give rise to the same partitions. So if one is finite, the other is too, and
- the other way around. Proving the existence of such a regular expression via
+ vice versa. Proving the existence of such a regular expression via
automata using the standard method would be quite involved. It includes the
steps: regular expression @{text "\<Rightarrow>"} non-deterministic automaton @{text
"\<Rightarrow>"} deterministic automaton @{text "\<Rightarrow>"} complement automaton @{text "\<Rightarrow>"}
@@ -2004,7 +2007,7 @@
a regular expression @{text r} such that @{term "A = lang r"}. We also know
that @{term "pders_lang B r"} is finite for every language @{text B} and
regular expression @{text r} (recall Theorem~\ref{antimirov}). By definition
- and Lemma~\ref{Derspders} therefore
+ and \eqref{Derspders} therefore
\begin{equation}\label{eqq}
@@ -2013,8 +2016,8 @@
\noindent
Since there are only finitely many regular expressions in @{term "pders_lang
- B r"}, we know by \eqref{uplus} that there exists a regular expression that
- the right-hand side of \eqref{eqq} is equal to \mbox{@{term "lang (\<Uplus>(pders_lang B
+ B r"}, we know by \eqref{uplus} that there exists a regular expression so that
+ the right-hand side of \eqref{eqq} is equal to the language \mbox{@{term "lang (\<Uplus>(pders_lang B
r))"}}. Thus the regular expression @{term "\<Uplus>(pders_lang B r)"} verifies that
@{term "Ders_lang B A"} is regular.
*}
@@ -2040,7 +2043,7 @@
far. Using this theorem we can obviously prove when a language is \emph{not}
regular---by establishing that it has infinitely many equivalence classes
generated by the Myhill-Nerode relation (this is usually the purpose of the
- pumping lemma \cite{Kozen97}). We can also use it to establish the standard
+ Pumping Lemma \cite{Kozen97}). We can also use it to establish the standard
textbook results about closure properties of regular languages. Interesting
is the case of closure under complement, because it seems difficult to
construct a regular expression for the complement language by direct
@@ -2048,13 +2051,13 @@
proved using the Myhill-Nerode theorem.
Our insistence on regular expressions for proving the Myhill-Nerode theorem
- arose from the limitations of HOL on which the popular theorem provers HOL4,
+ arose from the limitations of HOL, on which the popular theorem provers HOL4,
HOLlight and Isabelle/HOL are based. In order to guarantee consistency,
- formalisations can only extend HOL by definitions that introduce a notion in
- terms of already existing concepts. A convenient definition for automata
+ formalisations can only extend HOL by definitions that introduce a new concept in
+ terms of already existing notions. A convenient definition for automata
(based on graphs) uses a polymorphic type for the state nodes. This allows
us to use the standard operation of disjoint union in order to compose two
- automata. Unfortunately, we cannot use such a polymorphic definition of
+ automata. Unfortunately, we cannot use such a polymorphic definition
in HOL as part of the definition for regularity of a language (a
set of strings). Consider the following attempt
@@ -2065,17 +2068,18 @@
\noindent
which means the definiens is polymorphic in the type of the automata @{text
"M"}, but the definiendum @{text "is_regular"} is not. Such definitions are
- excluded in HOL, because they lead easily to inconsistencies (see
+ excluded in HOL, because they can lead easily to inconsistencies (see
\cite{PittsHOL4} for a simple example). Also HOL does not contain
type-quantifiers which would allow us to get rid of the polymorphism by
quantifying over the type-variable @{text "\<alpha>"}. Therefore when defining
- regularity in terms of automata, the only natural way out in HOL is to use state
- nodes with an identity, for example a natural number. Unfortunatly, the
- consequence is that we have to be careful when combining two automata so
+ regularity in terms of automata, the only natural way out in HOL is to use
+ state nodes with an identity, for example a natural number. Unfortunately,
+ the consequence is that we have to be careful when combining two automata so
that there is no clash between two such states. This makes formalisations
- quite fiddly and unpleasant. Regular expressions proved much more convenient
- for reasoning in HOL and we showed they can be used for establishing the
- Myhill-Nerode theorem.
+ quite fiddly and rather unpleasant. Regular expressions proved much more
+ convenient for reasoning in HOL and we showed they can be used for
+ establishing the central result in regular language theory: the Myhill-Nerode
+ theorem.
While regular expressions are convenient, they have some limitations. One is
that there seems to be no method of calculating a minimal regular expression
@@ -2086,32 +2090,6 @@
been formally established, we refer the reader to Owens and Slind
\cite{OwensSlind08}.
- Our formalisation consists of 780 lines of Isabelle/Isar code for the first
- direction and 460 for the second, plus around 300 lines of standard material
- about regular languages. The formalisation about derivatives and partial
- derivatives shown in Section~\ref{derivatives} consists of 390 lines of
- code. The algorithm for solving equational systems, which we used in the
- first direction, is conceptually not that complicated. Still the use of sets
- over which the algorithm operates, means it is not as easy to formalise as
- one might wish. It seems sets cannot be avoided since the `input' of the
- algorithm consists of equivalence classes and we cannot see how to
- reformulate the theory so that we can use lists, which are usually easier to
- reason about in a theorem prover.
-
- While our formalisation might be seen large, it should be seen
- in the context of the work done by Constable at al \cite{Constable00} who
- formalised the Myhill-Nerode theorem in Nuprl using automata. They write
- that their four-member team needed something on the magnitude of 18 months
- for their formalisation. The estimate for our formalisation is that we
- needed approximately 3 months and this included the time to find our proof
- arguments. Unlike Constable et al, who were able to follow the proofs from
- \cite{HopcroftUllman69}, we had to find our own arguments. So for us the
- formalisation was not the bottleneck. It is hard to gauge the size of a
- formalisation in Nurpl, but from what is shown in the Nuprl Math Library
- about their development it seems substantially larger than ours. The code of
- ours can be found in the Mercurial Repository at
- \mbox{\url{http://www4.in.tum.de/~urbanc/regexp.html}}.
-
Our proof of the first direction is very much inspired by \emph{Brzozowski's
algebraic method} used to convert a finite automaton to a regular expression
\cite{Brzozowski64}. The close connection can be seen by considering the
@@ -2130,31 +2108,63 @@
We presented two proofs for the second direction of the Myhill-Nerode
theorem. One direct proof using tagging-functions and another using partial
- derivatives. These proofs is where our method shines, because we can
- completely side-step the standard argument \cite{Kozen97} where automata
- need to be composed. However, it is also the direction where we had to spend
- most of the `conceptual' time, as our first proof based on
- tagging-functions is new for establishing the Myhill-Nerode theorem. All
- standard proofs of this direction proceed by arguments over automata.
+ derivatives. This part of our work is where our method using regular
+ expressions shines, because we can completely side-step the standard
+ argument \cite{Kozen97} where automata need to be composed. However, it is
+ also the direction where we had to spend most of the `conceptual' time, as
+ our first proof based on tagging-functions is new for establishing the
+ Myhill-Nerode theorem. All standard proofs of this direction proceed by
+ arguments over automata.
- Our indirect proof for the second direction arose from the interest in
+ The indirect proof for the second direction arose from our interest in
Brzozowski's derivatives for regular expression matching. A corresponding
- regular expression matcher has been formalised in HOL4 in
- \cite{OwensSlind08}. In our opinion, this formalisation is considerably
+ regular expression matcher has been formalised by Owens and Slind in HOL4
+ \cite{OwensSlind08}. In our opinion, their formalisation is considerably
slicker than for example the approach to regular expression matching taken
- in \cite{Harper99} and \cite{Yi06}. While Brzozowski's derivatives lead to
- simple regular expression matchers and he proved that there are only
+ in \cite{Harper99} and \cite{Yi06}. While Brzozowski's derivatives lead to a
+ simple regular expression matcher and he established that there are only
finitely many dissimilar derivatives for every regular expression, this
result is not as straightforward to formalise in a theorem prover. The
reason is that the set of dissimilar derivatives is not defined inductively,
- but in terms of an ACI-equivalence relation.
-
-
+ but in terms of an ACI-equivalence relation. This difficulty prevented, for
+ example, Krauss and Nipkow from proving termination of their equivalence checker
+ for regular expressions \cite{KraussNipkow11}. Their checker is based on
+ derivatives and for their argument the lack of a formal proof of termination
+ is not crucial (it merely lets them ``sleep better'' \cite{KraussNipkow11}).
+ We expect that their development simplifies by using partial derivatives,
+ instead of derivatives, and that termination of the algorithm can be
+ formally established. However, since partial derivatives use sets of regular
+ expressions, one needs to carefully analyse whether the resulting algorithm
+ is still executable. Given the existing infrastructure for executable sets
+ in Isabelle/HOL, it should be.
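+
+ To make the termination issue concrete, here is an informal sketch (again in
+ the made-up Haskell notation from above, reusing Rex, der and nullable) of a
+ naive equivalence check that explores pairs of iterated derivatives; it is
+ sound, but it terminates only if the set of reachable pairs is finite, which
+ is exactly where a finiteness result for (partial) derivatives is needed:
+
+   equiv :: [Char] -> Rex -> Rex -> Bool
+   equiv alphabet r s = go [] [(r, s)]
+     where
+       go _    []              = True
+       go seen ((p, q) : rest)
+         | (p, q) `elem` seen       = go seen rest
+         | nullable p /= nullable q = False
+         | otherwise =
+             go ((p, q) : seen)
+                ([(der c p, der c q) | c <- alphabet] ++ rest)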
- \medskip
+ Our formalisation of the Myhill-Nerode theorem consists of 780 lines of
+ Isabelle/Isar code for the first direction and 460 for the second (the one
+ based on tagging functions), plus around 300 lines of standard material
+ about regular languages. The formalisation about derivatives and partial
+ derivatives shown in Section~\ref{derivatives} consists of 390 lines of
+ code. The algorithm for solving equational systems, which we used in the
+ first direction, is conceptually relatively simple. Still the use of sets
+ over which the algorithm operates means it is not as easy to formalise as
+ one might hope. However, it seems sets cannot be avoided since the `input'
+ of the algorithm consists of equivalence classes and we cannot see how to
+ reformulate the theory so that we can use lists. Lists would be much easier
+ to reason about, since we can define functions over them by recursion. For
+ sets we have to use set-comprehensions.
- We expect that the development of Krauss \& Nipkow gets easier by
- using partial derivatives.\medskip
+ While our formalisation might seem large, it should be seen
+ in the context of the work done by Constable et al \cite{Constable00}, who
+ formalised the Myhill-Nerode theorem in Nuprl using automata. They write
+ that their four-member team needed something on the magnitude of 18 months
+ for their formalisation. The estimate for our formalisation is that we
+ needed approximately 3 months and this included the time to find our proof
+ arguments. Unlike Constable et al, who were able to follow the proofs from
+ \cite{HopcroftUllman69}, we had to find our own arguments. So for us the
+ formalisation was not the bottleneck. It is hard to gauge the size of a
+ formalisation in Nuprl, but from what is shown in the Nuprl Math Library
+ about their development it seems substantially larger than ours. Our code
+ can be found in the Mercurial Repository at
+ \mbox{\url{http://www4.in.tum.de/~urbanc/regexp.html}}.\medskip
\noindent
{\bf Acknowledgements:}