Journal/Paper.thy
changeset 198 b300f2c5d51d
parent 197 cf1c17431dab
child 199 11c3c302fa2e
  Cons ("_ :: _" [100, 100] 100) and
  Rev ("Rev _" [1000] 100) and
  Der ("Der _ _" [1000, 1000] 100) and
  ONE ("ONE" 999) and
  ZERO ("ZERO" 999) and
  pders_lang ("pdersl") and
  UNIV1 ("UNIV\<^isup>+") and
  Ders_lang ("Dersl")

lemma meta_eq_app:
  shows "f \<equiv> \<lambda>x. g x \<Longrightarrow> f x \<equiv> g x"
  by auto

  limitations that hurt badly if one attempts a simple-minded formalisation
  of regular languages in it.

  The typical approach to regular languages is to
  introduce finite automata and then define everything in terms of them
  \cite{HopcroftUllman69,Kozen97}. For example, a regular language is
  normally defined as:

  \begin{dfntn}\label{baddef}
  A language @{text A} is \emph{regular}, provided there is a
  finite deterministic automaton that recognises all strings of @{text "A"}.
  \end{dfntn}
  \end{equation}

  \noindent
  changes the type---the disjoint union is not a set, but a set of
  pairs. Using this definition for disjoint union means we do not have a
  single type for the states of automata. As a result we will not be able to define a
  regular language as one for which there exists an automaton that recognises
  all its strings (Definition~\ref{baddef}). This is because we cannot make a
  definition in HOL that is only polymorphic in the state type and there is no type
  quantification available in HOL (unlike in Coq, for example).\footnote{Slind
  already pointed out this problem in an email to the HOL4 mailing list on
  21st April 2005.}

  An alternative, which provides us with a single type for the states of automata,
  is to give every state node an identity, for example a natural number, and
  then be careful to rename these identities apart whenever connecting two
  automata. This results in clunky proofs establishing that properties are
  invariant under renaming. Similarly, connecting two automata represented as
  matrices results in very ad hoc constructions, which are not pleasant to
  reason about.

  Functions are much better supported in Isabelle/HOL, but they still lead to similar
  problems as with graphs.  Composing, for example, two non-deterministic automata in parallel
  also requires the formalisation of disjoint unions. Nipkow \cite{Nipkow98} 
  dismisses for this the option of using identities, because it leads according to 
  "\<lbrakk>x\<rbrakk>\<^isub>\<approx> = \<lbrakk>y\<rbrakk>\<^isub>\<approx>"}.


  Central to our proof will be the solution of equational systems
  involving equivalence classes of languages. For this we will use Arden's Lemma
  (see for example \cite[Page 100]{Sakarovitch09}),
  which solves equations of the form @{term "X = A \<cdot> X \<union> B"} provided
  @{term "[] \<notin> A"}. However, we will need the following `reverse'
  version of Arden's Lemma (`reverse' in the sense of changing the order of @{term "A \<cdot> X"} to
  \mbox{@{term "X \<cdot> A"}}).

  @{thm (rhs) arden}.
  \end{lmm}

  \begin{proof}
  For the right-to-left direction we assume @{thm (rhs) arden} and show
  that @{thm (lhs) arden} holds. From Property~\ref{langprops}@{text "(i)"}
  we have @{term "A\<star> = A \<cdot> A\<star> \<union> {[]}"},
  which is equal to @{term "A\<star> = A\<star> \<cdot> A \<union> {[]}"}. Adding @{text B} to both
  sides gives @{term "B \<cdot> A\<star> = B \<cdot> (A\<star> \<cdot> A \<union> {[]})"}, whose right-hand side
  is equal to @{term "(B \<cdot> A\<star>) \<cdot> A \<union> B"}. This completes this direction.

  Using this property we can show that @{term "B \<cdot> (A \<up> n) \<subseteq> X"} holds for
  all @{text n}. From this we can infer @{term "B \<cdot> A\<star> \<subseteq> X"} using the definition
  of @{text "\<star>"}.
  For the inclusion in the other direction we assume that a string @{text s}
  of length @{text k} is an element of @{text X}. Since @{thm (prem 1) arden}
  we know by Property~\ref{langprops}@{text "(ii)"} that
  @{term "s \<notin> X \<cdot> (A \<up> Suc k)"} since its length is only @{text k}
  (the strings in @{term "X \<cdot> (A \<up> Suc k)"} are all longer).
  From @{text "(*)"} it then follows that
  @{term s} must be an element in @{term "(\<Union>m\<in>{0..k}. B \<cdot> (A \<up> m))"}. This in turn
  implies that @{term s} is in @{term "(\<Union>n. B \<cdot> (A \<up> n))"}. Using Property~\ref{langprops}@{text "(iii)"}
  this is equal to @{term "B \<cdot> A\<star>"}, as we needed to show.
  \end{proof}
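
  \noindent
  As a small illustration (this example is ours and not part of the
  formalisation): for @{text "A = {[a]}"} and @{text "B = {[b]}"} the only
  language satisfying @{text "X = X \<cdot> A \<union> B"} is @{text "B \<cdot> A\<star>"}, i.e.~the set
  of all strings consisting of one @{text b} followed by arbitrarily many
  @{text a}s. The side-condition @{text "[] \<notin> A"} is essential: for
  @{text "A = {[]}"} the equation collapses to @{text "X = X \<union> B"}, which is
  satisfied by every language containing @{text B}, so uniqueness is lost.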

  \noindent
  Regular expressions are defined as the inductive datatype

  equivalence class in \mbox{@{term "finals A"}} (which by assumption must be
  a finite set), then we can use @{text "\<bigplus>"} to obtain a regular expression
  that matches every string in @{text A}.


  Our proof of Theorem~\ref{myhillnerodeone} relies on a method that can calculate a
  regular expression for \emph{every} equivalence class, not just the ones
  in @{term "finals A"}. We
  first define the notion of \emph{one-character-transition} between
  two equivalence classes
  %
  \begin{lmm}\label{inv}
  If @{thm (prem 1) test} then @{text "X = \<Union> \<calL> ` rhs"}.
  \end{lmm}

  \noindent
  Our proof of Theorem~\ref{myhillnerodeone} will proceed by transforming the
  initial equational system into one in \emph{solved form} maintaining the invariant
  in Lemma~\ref{inv}. From the solved form we will be able to read
  off the regular expressions.
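
  As a small informal illustration (ours, and glossing over the precise
  syntactic form of the equations in the formalisation): for the language
  @{text "A = {a}\<star>"} over the alphabet @{text "{a, b}"} there are two
  equivalence classes, @{text "X\<^isub>1 = {a}\<star>"}, which contains @{text "[]"}, and
  @{text "X\<^isub>2"}, which contains all strings with at least one @{text "b"}. The
  initial equational system then consists, roughly, of the two equations

  \begin{center}
  \begin{tabular}{rcl}
  @{text "X\<^isub>1"} & @{text "="} & @{text "(X\<^isub>1, ATOM a) + \<lambda>(ONE)"}\\
  @{text "X\<^isub>2"} & @{text "="} & @{text "(X\<^isub>1, ATOM b) + (X\<^isub>2, ATOM a) + (X\<^isub>2, ATOM b)"}\\
  \end{tabular}
  \end{center}

  \noindent
  and applying the Arden operation to the first equation turns it into the
  solved form @{text "X\<^isub>1 = \<lambda>(TIMES ONE (STAR (ATOM a)))"}, from which a
  regular expression for @{text "X\<^isub>1"}, matching exactly @{text "{a}\<star>"}, can be
  read off.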

  In order to transform an equational system into solved form, we have two
  operations: one that takes an equation of the form @{text "X = rhs"} and removes
  any recursive occurrences of @{text X} in the @{text rhs} using our variant of Arden's

  \begin{center}
  @{thm Solve_def}
  \end{center}

  \noindent
  We are not concerned here with the definition of this operator (see
  Berghofer and Nipkow \cite{BerghoferNipkow00} for example), but note that we
  eliminate in each @{const Iter}-step a single equation, and therefore have a
  well-founded termination order by taking the cardinality of the equational
  system @{text ES}. This enables us to prove properties about our definition
  of @{const Solve} when we `call' it with the equivalence class @{text X} and
  the initial equational system @{term "Init (UNIV // \<approx>A)"} from
  \eqref{initcs} using the principle:

  \begin{equation}\label{whileprinciple}
  \mbox{\begin{tabular}{l}
  @{term "invariant (Init (UNIV // \<approx>A))"} \\
  @{term "\<forall>ES. invariant ES \<and> Cond ES \<longrightarrow> invariant (Iter X ES)"}\\
  @{term "\<forall>ES. invariant ES \<and> Cond ES \<longrightarrow> card (Iter X ES) < card ES"}\\

  @{thm[mode=IfThen] Init_ES_satisfies_invariant}
  \end{lmm}

  \begin{proof}
  Finiteness is given by the assumption and the way we set up the
  initial equational system. Soundness is proved in Lemma~\ref{inv}. Distinctness
  follows from the fact that the equivalence classes are disjoint. The @{text ardenable}
  property also follows from the setup of the initial equational system, as does
  validity.
  \end{proof}

  \end{center}

  \noindent
  Finiteness is straightforward, as the @{const Subst} and @{const Arden} operations
  keep the equational system finite. These operations also preserve soundness
  and distinctness (we proved soundness for @{const Arden} in Lemma~\ref{ardenable}).
  The property @{text ardenable} is clearly preserved because the append-operation
  cannot make a regular expression match the empty string. Validity is
  given because @{const Arden} removes an equivalence class from @{text yrhs}
  and then @{const Subst_all} removes @{text Y} from the equational system.
  Having proved the implication above, we can instantiate @{text "ES"} with @{text "ES - {(Y, yrhs)}"}

  and @{term "invariant {(X, rhs)}"}.
  \end{lmm}

  \begin{proof}
  In order to prove this lemma using \eqref{whileprinciple}, we have to use a slightly
  stronger invariant since Lemmas~\ref{iterone} and \ref{itertwo} have the precondition
  that @{term "(X, rhs) \<in> ES"} for some @{text rhs}. This precondition is needed
  in order to choose in the @{const Iter}-step an equation that is not \mbox{@{term "X = rhs"}}.
  Therefore our invariant cannot be just @{term "invariant ES"}, but must be
  @{term "invariant ES \<and> (\<exists>rhs. (X, rhs) \<in> ES)"}. By assumption
  @{thm (prem 2) Solve} and Lemma~\ref{invzero}, the more general invariant holds for
  the initial equational system. This is premise 1 of~\eqref{whileprinciple}.
  Premise 2 is given by Lemma~\ref{iterone} and the fact that @{const Iter} might
  modify the @{text rhs} in the equation @{term "X = rhs"}, but does not remove it.
  Premise 3 of~\eqref{whileprinciple} is by Lemma~\ref{itertwo}. Now in premise 4
  we would like to show that there exists a @{text rhs} such that @{term "ES = {(X, rhs)}"}
  and that @{text "invariant {(X, rhs)}"} holds, provided the condition @{text "Cond"}
  does not hold. By the stronger invariant we know there exists such a @{text "rhs"}
  with @{term "(X, rhs) \<in> ES"}. Because @{text Cond} is not true, we know the cardinality
  of @{text ES} is @{text 1}. This means @{text "ES"} must actually be the set @{text "{(X, rhs)}"},

  By the preceding lemma, we know that there exists a @{text "rhs"} such
  that @{term "Solve X (Init (UNIV // \<approx>A))"} returns the equation @{text "X = rhs"},
  and that the invariant holds for this equation. That means we
  know @{text "X = \<Union>\<calL> ` rhs"}. We further know that
  this is equal to \mbox{@{text "\<Union>\<calL> ` (Arden X rhs)"}} using the properties of the
  invariant and Lemma~\ref{ardenable}. Using the validity property for the equation @{text "X = rhs"},
  we can infer that @{term "rhss rhs \<subseteq> {X}"} and because the @{text Arden} operation
  removes @{text X} from @{text rhs}, that @{term "rhss (Arden X rhs) = {}"}.
  This means the right-hand side @{term "Arden X rhs"} can only consist of terms of the form @{term "Lam r"}.
  So we can collect those (finitely many) regular expressions @{text rs} and have @{term "X = lang (\<Uplus>rs)"}.
  With this we can conclude the proof.
  \end{proof}

  \noindent
  Lemma~\ref{every_eqcl_has_reg} allows us to finally give a proof for the first direction
  of the Myhill-Nerode theorem.

  \begin{proof}[Proof of Theorem~\ref{myhillnerodeone}]
  By Lemma~\ref{every_eqcl_has_reg} we know that there exists a regular expression for
  every equivalence class in @{term "UNIV // \<approx>A"}. Since @{text "finals A"} is
  a subset of @{term "UNIV // \<approx>A"}, we also know that for every equivalence class
  in @{term "finals A"} there exists a regular expression. Moreover by assumption
  we know that @{term "finals A"} must be finite, and therefore there must be a finite
  set of regular expressions @{text "rs"} such that

  A relation @{text "R\<^isub>1"} is said to \emph{refine} @{text "R\<^isub>2"}
  provided @{text "R\<^isub>1 \<subseteq> R\<^isub>2"}.
  \end{dfntn}

  \noindent
  For constructing @{text R}, we will rely on some \emph{tagging-functions}
  defined over strings. Given the inductive hypothesis, it will be easy to
  prove that the \emph{range} of these tagging-functions is finite. The range
  of a function @{text f} is defined as

  \begin{center}
  @{text "range f \<equiv> f ` UNIV"}
  \end{center}

  In \eqref{finiteimageD} we set @{text f} to be @{text "X \<mapsto> tag ` X"}. Then
  @{text "range f"} is a subset of @{term "Pow (range tag)"}, which we know must be
  finite by assumption. Now @{term "f (UNIV // =tag=)"} is a subset of @{text "range f"},
  and so also finite. Injectivity amounts to showing that @{text "X = Y"} under the
  assumptions that @{text "X, Y \<in> "}~@{term "UNIV // =tag="} and @{text "f X = f Y"}.
  From the assumptions we obtain \mbox{@{text "x \<in> X"}} and @{text "y \<in> Y"} with
  @{text "tag x = tag y"}. Since @{text x} and @{text y} are tag-related, this in
  turn means that the equivalence classes @{text X}
  and @{text Y} must be equal. Therefore \eqref{finiteimageD} allows us to conclude
  with @{thm (concl) finite_eq_tag_rel}.
  \end{proof}
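
  \noindent
  To give a concrete impression of how such tagging-relations can be set up,
  here is a minimal sketch in Isabelle/HOL (the name @{text "tag_eq_rel"} and
  the exact formulation are ours and need not coincide with the formalisation):

    (* two strings are related iff the tagging-function gives them the same tag *)
    definition
      tag_eq_rel :: "('a list \<Rightarrow> 'b) \<Rightarrow> ('a list \<times> 'a list) set"
    where
      "tag_eq_rel tag \<equiv> {(x, y). tag x = tag y}"

  \noindent
  Lemma~\ref{finone} then states that whenever @{text "range tag"} is finite,
  the quotient @{text "UNIV // tag_eq_rel tag"} is also finite.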

  \begin{lmm}\label{fintwo}
  Given two equivalence relations @{text "R\<^isub>1"} and @{text "R\<^isub>2"}, where
  @{text "R\<^isub>1"} refines @{text "R\<^isub>2"}.

  are @{text "R\<^isub>1"}-related. Since by assumption @{text "R\<^isub>1"} refines @{text "R\<^isub>2"},
  they must also be @{text "R\<^isub>2"}-related, as we need to show.
  \end{proof}

  \noindent
  Chaining Lemmas~\ref{finone} and \ref{fintwo} together means that in order to show
  that @{term "UNIV // \<approx>(lang r)"} is finite, we have to construct a tagging-function whose
  range can be shown to be finite and whose tagging-relation refines @{term "\<approx>(lang r)"}.
  Let us attempt the @{const PLUS}-case first. We take as tagging-function

  \begin{center}

  If we know that @{text "(x\<^isub>p, x\<^isub>s) \<in> Partitions x"}, we will
  refer to @{text "x\<^isub>p"} as the \emph{prefix} of the string @{text x},
  and respectively to @{text "x\<^isub>s"} as the \emph{suffix}.
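
  \noindent
  A minimal way to define this auxiliary notion in Isabelle/HOL is the
  following sketch (the formalisation may phrase it slightly differently):

    (* all ways of splitting a string x into a prefix p and a suffix s *)
    definition
      Partitions :: "'a list \<Rightarrow> ('a list \<times> 'a list) set"
    where
      "Partitions x \<equiv> {(p, s). p @ s = x}"

  \noindent
  For example, @{text "Partitions [a, b]"} consists of the three pairs
  @{text "([], [a, b])"}, @{text "([a], [b])"} and @{text "([a, b], [])"}.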

  Now assuming @{term "x @ z \<in> A \<cdot> B"}, there are only two possible ways to `split'
  this string to be in @{term "A \<cdot> B"}:
  %
  \begin{center}
  \begin{tabular}{c}
  \scalebox{1}{

  @{thm (lhs) tag_Times_def[where ?A="A" and ?B="B"]}~@{text "\<equiv>"}~
  @{text "(\<lbrakk>x\<rbrakk>\<^bsub>\<approx>A\<^esub>, {\<lbrakk>x\<^isub>s\<rbrakk>\<^bsub>\<approx>B\<^esub> | x\<^isub>p \<in> A \<and> (x\<^isub>p, x\<^isub>s) \<in> Partitions x})"}
  \end{center}

  \noindent
  Note that we have to make the assumption for all suffixes @{text "x\<^isub>s"}, since we do
  not know anything about how the string @{term x} is partitioned.
  With this definition in place, let us prove the @{const "Times"}-case.


  \begin{proof}[@{const TIMES}-Case]

  @{text "x\<^isub>p < x"} and the rest @{term "x\<^isub>s @ z \<in> A\<star>"}. For example the empty string
  @{text "[]"} would do (recall @{term "x \<noteq> []"}).
  There are potentially many such prefixes, but there can only be finitely many of them (the
  string @{text x} is finite). Let us therefore choose the longest one and call it
  @{text "x\<^bsub>pmax\<^esub>"}. Now for the rest of the string @{text "x\<^isub>s @ z"} we
  know it is in @{term "A\<star>"} and cannot be the empty string. By Property~\ref{langprops}@{text "(iv)"},
  we can separate
  this string into two parts, say @{text "a"} and @{text "b"}, such that @{text "a \<noteq> []"}, @{text "a \<in> A"}
  and @{term "b \<in> A\<star>"}. Now @{text a} must be strictly longer than @{text "x\<^isub>s"},
  otherwise @{text "x\<^bsub>pmax\<^esub>"} is not the longest prefix. That means @{text a}
  `overlaps' with @{text z}, splitting it into two components @{text "z\<^isub>a"} and

  %   & @{thm (rhs) Ders_simps(3)[where ?s1.0="s\<^isub>1" and ?s2.0="s\<^isub>2"]}\\
  \end{tabular}}
  \end{equation}

  \noindent
  where @{text "\<Delta>"} in the fifth line is a function that tests whether the
  empty string is in the language and returns @{term "{[]}"} or @{term "{}"},
  accordingly.  In the last equation we use the list-cons operator written
  \mbox{@{text "_ :: _"}}.  The only interesting case is the @{text "A\<star>"}-case
  where we use Property~\ref{langprops}@{text "(i)"} in order to infer that
  @{term "Der c (A\<star>) = Der c (A \<cdot> A\<star>)"}. We can then complete the proof by
  using the fifth equation and noting that @{term "Delta A \<cdot> Der c (A\<star>) \<subseteq> (Der c A) \<cdot> A\<star>"}.
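
  \noindent
  For instance, for the single-string language @{text "A = {ab}"} the clause
  for @{text "A\<star>"} gives @{text "Der a (A\<star>) = (Der a A) \<cdot> A\<star> = {b} \<cdot> {ab}\<star>"} and
  @{text "Der b (A\<star>) = {} \<cdot> {ab}\<star> = {}"}, since @{text "\<Delta> A = {}"} (a small
  sanity check added here; it is not part of the formalisation).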

  Brzozowski observed that the left-quotients for languages of
  regular expressions can be calculated directly using the notion of
  \emph{derivatives of a regular expression} \cite{Brzozowski64}. We define
  this notion in Isabelle/HOL as follows:

  \begin{center}
  \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}}
  @{thm (lhs) der.simps(1)}  & @{text "\<equiv>"} & @{thm (rhs) der.simps(1)}\\
  @{thm (lhs) der.simps(2)}  & @{text "\<equiv>"} & @{thm (rhs) der.simps(2)}\\
  @{thm (lhs) ders.simps(2)}  & @{text "\<equiv>"} & @{thm (rhs) ders.simps(2)}\\
  \end{tabular}
  \end{center}

  \noindent
  The last two clauses extend derivatives from characters to strings. The
  boolean function @{term "nullable r"} needed in the @{const Times}-case tests
  whether a regular expression can recognise the empty string. It can be defined as
  follows.

  \begin{center}

  \end{center}
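
  \noindent
  For readers who want to experiment with these notions, the definitions above
  can be rendered as the following self-contained, executable sketch (a
  stand-alone version: in the actual theory the corresponding constants already
  exist and the constructor names might differ):

    (* a stand-alone sketch of regular expressions and Brzozowski derivatives *)
    datatype rexp = Zero | One | Atom char
                  | Plus rexp rexp | Times rexp rexp | Star rexp

    (* does a regular expression match the empty string? *)
    fun nullable :: "rexp \<Rightarrow> bool" where
      "nullable Zero = False"
    | "nullable One = True"
    | "nullable (Atom c) = False"
    | "nullable (Plus r1 r2) = (nullable r1 \<or> nullable r2)"
    | "nullable (Times r1 r2) = (nullable r1 \<and> nullable r2)"
    | "nullable (Star r) = True"

    (* derivative of a regular expression w.r.t. a single character *)
    fun der :: "char \<Rightarrow> rexp \<Rightarrow> rexp" where
      "der c Zero = Zero"
    | "der c One = Zero"
    | "der c (Atom d) = (if c = d then One else Zero)"
    | "der c (Plus r1 r2) = Plus (der c r1) (der c r2)"
    | "der c (Times r1 r2) =
         (if nullable r1 then Plus (Times (der c r1) r2) (der c r2)
          else Times (der c r1) r2)"
    | "der c (Star r) = Times (der c r) (Star r)"

    (* derivative w.r.t. a string, i.e. a list of characters *)
    fun ders :: "string \<Rightarrow> rexp \<Rightarrow> rexp" where
      "ders [] r = r"
    | "ders (c # s) r = ders s (der c r)"

  \noindent
  A string @{text s} is then matched by @{text r} if and only if
  @{text "nullable (ders s r)"} holds; for example
  @{text "nullable (ders ''ab'' (Times (Atom (CHR ''a'')) (Star (Atom (CHR ''b'')))))"}
  evaluates to @{text True}.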

  \noindent
  By induction on the regular expression @{text r}, respectively on the string @{text s},
  one can easily show that left-quotients and derivatives of regular expressions
  relate as follows (see for example~\cite{Sakarovitch09}):

  \begin{equation}\label{Dersders}
  \mbox{\begin{tabular}{c}
  @{thm Der_der}\\
  @{thm Ders_ders}

  corresponding language there are only finitely many derivatives---thus
  ensuring that there are only finitely many equivalence
  classes. Unfortunately, this is not true in general. Sakarovitch gives an
  example where a regular expression has infinitely many derivatives
  w.r.t.~the language \mbox{@{term "({a} \<cdot> {b})\<star> \<union> ({a} \<cdot> {b})\<star> \<cdot> {a}"}}
  (see \cite[Page~141]{Sakarovitch09}).

  What Brzozowski \cite{Brzozowski64} established is that for every language there
  \emph{are} only finitely many `dissimilar' derivatives of a regular
  expression. Two regular expressions are said to be \emph{similar} provided

  Partial derivatives can be seen as having the @{text "ACI"}-identities already built in:
  taking the partial derivatives of the
  regular expressions in \eqref{ACI} gives us in each case
  equal sets.  Antimirov \cite{Antimirov95} showed a similar result to
  \eqref{Dersders} for partial derivatives, namely

  \begin{equation}\label{Derspders}
  \mbox{\begin{tabular}{lc}
  @{text "(i)"}  & @{thm Der_pder}\\
  @{text "(ii)"} & @{thm Ders_pders}

  \noindent
  Antimirov's argument first shows this theorem for the language @{term UNIV1},
  which is the set of all non-empty strings. For this he proves:

  \begin{equation}\label{pdersunivone}
  \mbox{\begin{tabular}{l}
  @{thm pders_lang_Zero}\\
  @{thm pders_lang_One}\\
  @{thm pders_lang_Atom}\\
  @{thm pders_lang_Plus[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\

  \end{center}

  \noindent
  and for all languages @{text "A"}, @{thm pders_lang_subset[where B="UNIV",
  simplified]} holds.  Since we follow Antimirov's proof quite closely in our
  formalisation (only the last two cases of \eqref{pdersunivone} involve some
  non-routine induction argument), we omit the details.

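  \noindent
  Continuing the executable sketch from above (again the exact names are not
  necessarily the ones of the formalisation), Antimirov's partial derivatives
  and their extension to strings can be written as follows:

    (* partial derivative w.r.t. a character: a finite set of regular
       expressions instead of a single regular expression *)
    fun pder :: "char \<Rightarrow> rexp \<Rightarrow> rexp set" where
      "pder c Zero = {}"
    | "pder c One = {}"
    | "pder c (Atom d) = (if c = d then {One} else {})"
    | "pder c (Plus r1 r2) = pder c r1 \<union> pder c r2"
    | "pder c (Times r1 r2) =
         (\<lambda>r. Times r r2) ` pder c r1 \<union> (if nullable r1 then pder c r2 else {})"
    | "pder c (Star r) = (\<lambda>r'. Times r' (Star r)) ` pder c r"

    (* partial derivatives w.r.t. a string *)
    fun pders :: "string \<Rightarrow> rexp \<Rightarrow> rexp set" where
      "pders [] r = {r}"
    | "pders (c # s) r = (\<Union>r' \<in> pder c r. pders s r')"

  \noindent
  In contrast to @{text ders}, which can produce ever larger regular
  expressions, the partial derivatives of a regular expression w.r.t.~all
  strings of a language form, by Theorem~\ref{antimirov}, only a finite set.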

  Let us now return to our proof of the second direction of the Myhill-Nerode
  theorem. The point of the above calculations is to use
  @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"} as tagging-relation.

  \begin{proof}[Proof of Theorem~\ref{myhillnerodetwo} (second version)]

  \noindent
  Now the range of @{term "\<lambda>x. pders x r"} is a subset of @{term "Pow (pders_lang UNIV r)"},
  which we know is finite by Theorem~\ref{antimirov}. This means there
  are only finitely many equivalence classes of @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"},
  which refines @{term "\<approx>(lang r)"}, and consequently we can again conclude the
  second part of the Myhill-Nerode theorem.
  \end{proof}
*}

section {* Closure Properties of Regular Languages *}

  operations. Closure under union, concatenation and Kleene-star is trivial
  to establish given our definition of regularity (recall Definition~\ref{regular}).
  More interesting is the closure under complement, because it seems difficult
  to construct a regular expression for the complement language by direct
  means. However, the existence of such a regular expression can now be easily
  proved using both parts of the Myhill-Nerode theorem, since

  \begin{center}
  @{term "s\<^isub>1 \<approx>A s\<^isub>2"} if and only if @{term "s\<^isub>1 \<approx>(-A) s\<^isub>2"}
  \end{center}

  \noindent
  holds for any strings @{text "s\<^isub>1"} and @{text
  "s\<^isub>2"}. Therefore @{text A} and the complement language @{term "-A"}
  give rise to the same partitions. So if one is finite, the other is too, and
  vice versa. Proving the existence of such a regular expression via
  automata using the standard method would be quite involved. It includes the
  steps: regular expression @{text "\<Rightarrow>"} non-deterministic automaton @{text
  "\<Rightarrow>"} deterministic automaton @{text "\<Rightarrow>"} complement automaton @{text "\<Rightarrow>"}
  regular expression. Clearly not something you want to formalise in a theorem
  prover in which it is cumbersome to reason about automata.
  "Ders_lang B A"} is regular. To see this, consider the following argument
  using partial derivatives: From @{text A} being regular we know there exists
  a regular expression @{text r} such that @{term "A = lang r"}. We also know
  that @{term "pders_lang B r"} is finite for every language @{text B} and
  regular expression @{text r} (recall Theorem~\ref{antimirov}). By definition
  and \eqref{Derspders} therefore

  \begin{equation}\label{eqq}
  @{term "Ders_lang B (lang r) = (\<Union> lang ` (pders_lang B r))"}
  \end{equation}

  \noindent
  Since there are only finitely many regular expressions in @{term "pders_lang
  B r"}, we know by \eqref{uplus} that there exists a regular expression such that
  the right-hand side of \eqref{eqq} is equal to the language \mbox{@{term "lang (\<Uplus>(pders_lang B
  r))"}}. Thus the regular expression @{term "\<Uplus>(pders_lang B r)"} verifies that
  @{term "Ders_lang B A"} is regular.
*}

  2038   \noindent
  2041   \noindent
  2039   Having formalised this theorem means we pushed our point of view quite
  2042   Having formalised this theorem means we pushed our point of view quite
  2040   far. Using this theorem we can obviously prove when a language is \emph{not}
  2043   far. Using this theorem we can obviously prove when a language is \emph{not}
  2041   regular---by establishing that it has infinitely many equivalence classes
  2044   regular---by establishing that it has infinitely many equivalence classes
  2042   generated by the Myhill-Nerode relation (this is usually the purpose of the
  2045   generated by the Myhill-Nerode relation (this is usually the purpose of the
  2043   pumping lemma \cite{Kozen97}).  We can also use it to establish the standard
  2046   Pumping Lemma \cite{Kozen97}).  We can also use it to establish the standard
  2044   textbook results about closure properties of regular languages. Interesting
  2047   textbook results about closure properties of regular languages. Interesting
  2045   is the case of closure under complement, because it seems difficult to
  2048   is the case of closure under complement, because it seems difficult to
  2046   construct a regular expression for the complement language by direct
  2049   construct a regular expression for the complement language by direct
  2047   means. However the existence of such a regular expression can be easily
  2050   means. However the existence of such a regular expression can be easily
  2048   proved using the Myhill-Nerode theorem.  
  2051   proved using the Myhill-Nerode theorem.  
  2049 
  2052 
  2050   Our insistence on regular expressions for proving the Myhill-Nerode theorem
  2053   Our insistence on regular expressions for proving the Myhill-Nerode theorem
  2051   arose from the limitations of HOL on which the popular theorem provers HOL4,
  2054   arose from the limitations of HOL, on which the popular theorem provers HOL4,
  2052   HOLlight and Isabelle/HOL are based. In order to guarantee consistency,
  2055   HOLlight and Isabelle/HOL are based. In order to guarantee consistency,
  2053   formalisations can only extend HOL by definitions that introduce a notion in
  2056   formalisations can only extend HOL by definitions that introduce a new concept in
  2054   terms of already existing concepts. A convenient definition for automata
  2057   terms of already existing notions. A convenient definition for automata
  2055   (based on graphs) uses a polymorphic type for the state nodes. This allows
  2058   (based on graphs) uses a polymorphic type for the state nodes. This allows
  2056   us to use the standard operation of disjoint union in order to compose two
  2059   us to use the standard operation of disjoint union in order to compose two
  2057   automata. Unfortunately, we cannot use such a polymorphic definition of
  2060   automata. Unfortunately, we cannot use such a polymorphic definition
  2058   in HOL as part of the definition for regularity of a language (a
  2061   in HOL as part of the definition for regularity of a language (a
  2059   set of strings).  Consider the following attempt
  2062   set of strings).  Consider the following attempt
  2060 
  2063 
  2061   \begin{center}
  2064   \begin{center}
  2062   @{text "is_regular A \<equiv> \<exists>M(\<alpha>). is_finite_automata (M) \<and> \<calL>(M) = A"}
  2065   @{text "is_regular A \<equiv> \<exists>M(\<alpha>). is_finite_automata (M) \<and> \<calL>(M) = A"}
  2063   \end{center}
  2066   \end{center}
  2064 
  2067 
  2065   \noindent
  2068   \noindent
  2066   which means the definiens is polymorphic in the type of the automata @{text
  2069   which means the definiens is polymorphic in the type of the automata @{text
  2067   "M"}, but the definiendum @{text "is_regular"} is not. Such definitions are
  2070   "M"}, but the definiendum @{text "is_regular"} is not. Such definitions are
  2068   excluded in HOL, because they lead easily to inconsistencies (see
  2071   excluded in HOL, because they can lead easily to inconsistencies (see
  2069   \cite{PittsHOL4} for a simple example). Also HOL does not contain
  2072   \cite{PittsHOL4} for a simple example). Also HOL does not contain
  2070   type-quantifiers which would allow us to get rid of the polymorphism by
  2073   type-quantifiers which would allow us to get rid of the polymorphism by
  2071   quantifying over the type-variable @{text "\<alpha>"}. Therefore when defining
  2074   quantifying over the type-variable @{text "\<alpha>"}. Therefore when defining
  2072   regularity in terms of automata, the only natural way out in HOL is to use state
  2075   regularity in terms of automata, the only natural way out in HOL is to use
  2073   nodes with an identity, for example a natural number. Unfortunatly, the
  2076   state nodes with an identity, for example a natural number. Unfortunatly,
  2074   consequence is that we have to be careful when combining two automata so
  2077   the consequence is that we have to be careful when combining two automata so
  2075   that there is no clash between two such states. This makes formalisations
  2078   that there is no clash between two such states. This makes formalisations
  2076   quite fiddly and unpleasant. Regular expressions proved much more convenient
  2079   quite fiddly and rather unpleasant. Regular expressions proved much more
  2077   for reasoning in HOL and we showed they can be used for establishing the
  2080   convenient for reasoning in HOL and we showed they can be used for
  2078   Myhill-Nerode theorem.
  2081   establishing the central result in regular language theory: the Myhill-Nerode 
       
  2082   theorem.
  2079 
  2083 
  2080   While regular expressions are convenient, they have some limitations. One is
  2084   While regular expressions are convenient, they have some limitations. One is
  2081   that there seems to be no method of calculating a minimal regular expression
  2085   that there seems to be no method of calculating a minimal regular expression
  2082   (for example in terms of length) for a regular language, like there is for
  2086   (for example in terms of length) for a regular language, like there is for
  2083   automata. On the other hand, efficient regular expression matching, without
  2087   automata. On the other hand, efficient regular expression matching, without
  2084   using automata, poses no problem \cite{OwensReppyTuron09}.  For an
  2088   using automata, poses no problem \cite{OwensReppyTuron09}.  For an
  2085   implementation of a simple regular expression matcher, whose correctness has
  2089   implementation of a simple regular expression matcher, whose correctness has
  2086   been formally established, we refer the reader to Owens and Slind
  2090   been formally established, we refer the reader to Owens and Slind
  2087   \cite{OwensSlind08}.
  2091   \cite{OwensSlind08}.
  2088 
       
  2089   Our formalisation consists of 780 lines of Isabelle/Isar code for the first
       
  2090   direction and 460 for the second, plus around 300 lines of standard material
       
  2091   about regular languages. The formalisation about derivatives and partial
       
  2092   derivatives shown in Section~\ref{derivatives} consists of 390 lines of
       
  2093   code.  The algorithm for solving equational systems, which we used in the
       
  2094   first direction, is conceptually not that complicated. Still the use of sets
       
  2095   over which the algorithm operates, means it is not as easy to formalise as
       
  2096   one might wish. It seems sets cannot be avoided since the `input' of the
       
  2097   algorithm consists of equivalence classes and we cannot see how to
       
  2098   reformulate the theory so that we can use lists, which are usually easier to
       
  2099   reason about in a theorem prover.
       
  2100 
       
  2101   While our formalisation might seem large, it should be seen

  2102   in the context of the work done by Constable et al \cite{Constable00} who
       
  2103   formalised the Myhill-Nerode theorem in Nuprl using automata. They write
       
  2104   that their four-member team needed something on the order of 18 months
       
  2105   for their formalisation. The estimate for our formalisation is that we
       
  2106   needed approximately 3 months and this included the time to find our proof
       
  2107   arguments. Unlike Constable et al, who were able to follow the proofs from
       
  2108   \cite{HopcroftUllman69}, we had to find our own arguments.  So for us the
       
  2109   formalisation was not the bottleneck. It is hard to gauge the size of a
       
  2110   formalisation in Nuprl, but from what is shown in the Nuprl Math Library
       
  2111   about their development it seems substantially larger than ours. Our code

  2112   can be found in the Mercurial Repository at
       
  2113   \mbox{\url{http://www4.in.tum.de/~urbanc/regexp.html}}.
       
  2114 
  2092 
  2115   Our proof of the first direction is very much inspired by \emph{Brzozowski's
  2093   Our proof of the first direction is very much inspired by \emph{Brzozowski's
  2116   algebraic method} used to convert a finite automaton to a regular expression
  2094   algebraic method} used to convert a finite automaton to a regular expression
  2117   \cite{Brzozowski64}. The close connection can be seen by considering the
  2095   \cite{Brzozowski64}. The close connection can be seen by considering the
  2118   equivalence classes as the states of the minimal automaton for the regular
  2096   equivalence classes as the states of the minimal automaton for the regular
  2128   in the literature about this way of proving the first direction of the
  2106   in the literature about this way of proving the first direction of the
  2129   Myhill-Nerode theorem, but it appears to be folklore.
  2107   Myhill-Nerode theorem, but it appears to be folklore.
  2130 
  2108 
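  As a purely illustrative, textbook-style instance of the equational systems
  used in the first direction (this example is not taken from our
  formalisation), consider the language of strings over $a$ and $b$ containing
  an even number of $a$s, whose two Myhill-Nerode equivalence classes we write
  as $X_1$ (even) and $X_2$ (odd):

  \[
  \begin{array}{rcl}
  X_1 & = & X_1 \cdot \{b\} \,\cup\, X_2 \cdot \{a\} \,\cup\, \{[]\}\\
  X_2 & = & X_1 \cdot \{a\} \,\cup\, X_2 \cdot \{b\}
  \end{array}
  \]

  \noindent
  With Arden's lemma in the form that $X = X \cdot A \,\cup\, B$ and
  $[] \notin A$ imply $X = B \cdot A^\star$, the second equation gives
  $X_2 = X_1 \cdot \{a\} \cdot \{b\}^\star$; substituting this into the first
  equation and applying Arden's lemma once more yields the regular expression
  $(b + a\,b^\star a)^\star$ for $X_1$.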
  2131   We presented two proofs for the second direction of the Myhill-Nerode
  2109   We presented two proofs for the second direction of the Myhill-Nerode
  2132   theorem. One direct proof using tagging-functions and another using partial
  2110   theorem. One direct proof using tagging-functions and another using partial
  2133   derivatives. These proofs are where our method shines, because we can
  2111   derivatives. This part of our work is where our method using regular
  2134   completely side-step the standard argument \cite{Kozen97} where automata
  2112   expressions shines, because we can completely side-step the standard
  2135   need to be composed. However, it is also the direction where we had to spend
  2113   argument \cite{Kozen97} where automata need to be composed. However, it is
  2136   most of the `conceptual' time, as our first proof based on
  2114   also the direction where we had to spend most of the `conceptual' time, as
  2137   tagging-functions is new for establishing the Myhill-Nerode theorem. All
  2115   our first proof based on tagging-functions is new for establishing the
  2138   standard proofs of this direction proceed by arguments over automata.
  2116   Myhill-Nerode theorem. All standard proofs of this direction proceed by
  2139   
  2117   arguments over automata.
  2140   Our indirect proof for the second direction arose from the interest in
  2118   
       
  2119   The indirect proof for the second direction arose from our interest in
  2141   Brzozowski's derivatives for regular expression matching. A corresponding
  2120   Brzozowski's derivatives for regular expression matching. A corresponding
  2142   regular expression matcher has been formalised in HOL4 in
  2121   regular expression matcher has been formalised by Owens and Slind in HOL4
  2143   \cite{OwensSlind08}. In our opinion, this formalisation is considerably
  2122   \cite{OwensSlind08}. In our opinion, their formalisation is considerably
  2144   slicker than for example the approach to regular expression matching taken
  2123   slicker than for example the approach to regular expression matching taken
  2145   in \cite{Harper99} and \cite{Yi06}. While Brzozowski's derivatives lead to
  2124   in \cite{Harper99} and \cite{Yi06}. While Brzozowski's derivatives lead to a
  2146   simple regular expression matchers and he proved that there are only
  2125   simple regular expression matcher and he established that there are only
  2147   finitely many dissimilar derivatives for every regular expression, this
  2126   finitely many dissimilar derivatives for every regular expression, this
  2148   result is not as straightforward to formalise in a theorem prover. The
  2127   result is not as straightforward to formalise in a theorem prover. The
  2149   reason is that the set of dissimilar derivatives is not defined inductively,
  2128   reason is that the set of dissimilar derivatives is not defined inductively,
  2150   but in terms of an ACI-equivalence relation.
  2129   but in terms of an ACI-equivalence relation. This difficulty prevented, for
  2151 
  2130   example, Krauss and Nipkow from proving termination of their equivalence checker
  2152   
  2131   for regular expressions \cite{KraussNipkow11}. Their checker is based on
  2153 
  2132   derivatives and for their argument the lack of a formal proof of termination
  2154   \medskip
  2133   is not crucial (it merely lets them ``sleep better'' \cite{KraussNipkow11}).
  2155 
  2134   We expect that their development simplifies by using partial derivatives,
  2156   We expect that the development of Krauss \& Nipkow gets easier by
  2135   instead of derivatives, and that termination of the algorithm can be
  2157   using partial derivatives.\medskip
  2136   formally established. However, since partial derivatives use sets of regular
       
  2137   expressions, one needs to carefully analyse whether the resulting algorithm
       
  2138   is still executable. Given the existing infrastructure for executable sets
       
  2139   in Isabelle/HOL, it should.
       
  2140 
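  To make the discussion concrete, the following sketch gives the usual
  definitions of derivatives, the corresponding matcher and partial derivatives
  (the names are hypothetical and the code is only a sketch; it is neither the
  matcher of Owens and Slind nor a verbatim part of our theories):

    datatype rexp = ZERO | ONE | ATOM char
                  | PLUS rexp rexp | TIMES rexp rexp | STAR rexp

    (* does a regular expression match the empty string? *)
    fun nullable :: "rexp \<Rightarrow> bool" where
      "nullable ZERO = False"
    | "nullable ONE = True"
    | "nullable (ATOM c) = False"
    | "nullable (PLUS r1 r2) = (nullable r1 \<or> nullable r2)"
    | "nullable (TIMES r1 r2) = (nullable r1 \<and> nullable r2)"
    | "nullable (STAR r) = True"

    (* Brzozowski derivative of a regular expression w.r.t. a character *)
    fun der :: "char \<Rightarrow> rexp \<Rightarrow> rexp" where
      "der c ZERO = ZERO"
    | "der c ONE = ZERO"
    | "der c (ATOM d) = (if c = d then ONE else ZERO)"
    | "der c (PLUS r1 r2) = PLUS (der c r1) (der c r2)"
    | "der c (TIMES r1 r2) = (if nullable r1
         then PLUS (TIMES (der c r1) r2) (der c r2)
         else TIMES (der c r1) r2)"
    | "der c (STAR r) = TIMES (der c r) (STAR r)"

    (* the matcher repeatedly takes derivatives and finally tests nullability *)
    fun matcher :: "rexp \<Rightarrow> char list \<Rightarrow> bool" where
      "matcher r [] = nullable r"
    | "matcher r (c # s) = matcher (der c r) s"

    (* partial derivatives return a set of regular expressions *)
    fun pder :: "char \<Rightarrow> rexp \<Rightarrow> rexp set" where
      "pder c ZERO = {}"
    | "pder c ONE = {}"
    | "pder c (ATOM d) = (if c = d then {ONE} else {})"
    | "pder c (PLUS r1 r2) = pder c r1 \<union> pder c r2"
    | "pder c (TIMES r1 r2) = (\<lambda>r'. TIMES r' r2) ` pder c r1
         \<union> (if nullable r1 then pder c r2 else {})"
    | "pder c (STAR r) = (\<lambda>r'. TIMES r' (STAR r)) ` pder c r"

  The fact that pder returns a \emph{set} of regular expressions, rather than a
  single one, is precisely why the executability of a partial-derivative-based
  checker needs the set-infrastructure mentioned above.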
       
  2141   Our formalisation of the Myhill-Nerode theorem consists of 780 lines of
       
  2142   Isabelle/Isar code for the first direction and 460 for the second (the one
       
  2143   based on tagging functions), plus around 300 lines of standard material
       
  2144   about regular languages. The formalisation of derivatives and partial
       
  2145   derivatives shown in Section~\ref{derivatives} consists of 390 lines of
       
  2146   code.  The algorithm for solving equational systems, which we used in the
       
  2147   first direction, is conceptually relatively simple. Still, the use of sets
       
  2148   over which the algorithm operates means it is not as easy to formalise as
       
  2149   one might hope. However, it seems sets cannot be avoided since the `input'
       
  2150   of the algorithm consists of equivalence classes and we cannot see how to
       
  2151   reformulate the theory so that we can use lists. Lists would be much easier
       
  2152   to reason about, since we can define functions over them by recursion. For
       
  2153   sets we have to use set-comprehensions.
       
  2154 
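  The difference can be seen in miniature below (again only an illustration with
  hypothetical names): the list version is defined by structural recursion and
  immediately comes with equations for rewriting and an induction principle,
  whereas the set version has to be written as a set-comprehension:

    (* illustration only: doubling every element, once over lists by
       recursion, once over sets by a set-comprehension *)
    fun doubles :: "nat list \<Rightarrow> nat list" where
      "doubles [] = []"
    | "doubles (x # xs) = 2 * x # doubles xs"

    definition Doubles :: "nat set \<Rightarrow> nat set" where
      "Doubles A = {2 * x | x. x \<in> A}"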
       
  2155   While our formalisation might seem large, it should be seen

  2156   in the context of the work done by Constable et al \cite{Constable00} who
       
  2157   formalised the Myhill-Nerode theorem in Nuprl using automata. They write
       
  2158   that their four-member team needed something on the order of 18 months
       
  2159   for their formalisation. The estimate for our formalisation is that we
       
  2160   needed approximately 3 months and this included the time to find our proof
       
  2161   arguments. Unlike Constable et al, who were able to follow the proofs from
       
  2162   \cite{HopcroftUllman69}, we had to find our own arguments.  So for us the
       
  2163   formalisation was not the bottleneck. It is hard to gauge the size of a
       
  2164   formalisation in Nuprl, but from what is shown in the Nuprl Math Library
       
  2165   about their development it seems substantially larger than ours. Our code

  2166   can be found in the Mercurial Repository at
       
  2167   \mbox{\url{http://www4.in.tum.de/~urbanc/regexp.html}}.\medskip
  2158   
  2168   
  2159   \noindent
  2169   \noindent
  2160   {\bf Acknowledgements:}
  2170   {\bf Acknowledgements:}
  2161   We are grateful for the comments we received from Larry
  2171   We are grateful for the comments we received from Larry
  2162   Paulson.
  2172   Paulson.