--- a/Paper/Paper.thy	Sat Feb 19 22:05:22 2011 +0000
+++ b/Paper/Paper.thy	Sun Feb 20 06:02:58 2011 +0000
@@ -268,7 +268,7 @@
   string; this property states that if @{term "[] \<notin> A"} then the lengths of
   the strings in @{term "A \<up> (Suc n)"} must be longer than @{text n}.  We omit
   the proofs for these properties, but invite the reader to consult our
-  formalisation.\footnote{Available at \url{http://www4.in.tum.de/~urbanc/cgi-bin/repos.cgi/regexp/}}
+  formalisation.\footnote{Available at \url{http://www4.in.tum.de/~urbanc/regexp.html}}
 
   The notation in Isabelle/HOL for the quotient of a language @{text A} according to an 
   equivalence relation @{term REL} is @{term "A // REL"}. We will write 
@@ -280,7 +280,8 @@
   involving equivalence classes of languages. For this we will use Arden's lemma \cite{Brzozowski64},
   which solves equations of the form @{term "X = A ;; X \<union> B"} provided
   @{term "[] \<notin> A"}. However we will need the following `reverse' 
-  version of Arden's lemma (`reverse' in the sense of changing the order of @{text "\<cdot>"}).
+  version of Arden's lemma (`reverse' in the sense of changing the order of @{term "A ;; X"} to
+  \mbox{@{term "X ;; A"}}).
 
   \begin{lemma}[Reverse Arden's Lemma]\label{arden}\mbox{}\\
   If @{thm (prem 1) arden} then
@@ -377,7 +378,7 @@
   language. This can be defined as tertiary relation.
 
   \begin{definition}[Myhill-Nerode Relation] Given a language @{text A}, two strings @{text x} and
-  @{text y} are related provided
+  @{text y} are Myhill-Nerode related provided
   \begin{center}
   @{thm str_eq_def[simplified str_eq_rel_def Pair_Collect]}
   \end{center}
@@ -469,8 +470,8 @@
   the method by Brzozowski \cite{Brzozowski64}, where he marks the
   `terminal' states. We are forced to set up the equational system in our
   way, because the Myhill-Nerode relation determines the `direction' of the
-  transitions. The successor `state' of an equivalence class @{text Y} can
-  be reached by adding characters to the end of @{text Y}. This is also the
+  transitions---the successor `state' of an equivalence class @{text Y} can
+  be reached by adding a character to the end of @{text Y}. This is also the
   reason why we have to use our reverse version of Arden's lemma.}
   Overloading the function @{text \<calL>} for the two kinds of terms in the
   equational system, we have
@@ -711,7 +712,7 @@
   for which there is an equation. Therefore @{text lhss} is just the set containing 
   the first components of an equational system,
   while @{text "rhss"} collects all equivalence classes @{text X} in the terms of the 
-  form @{term "Trn X r"}. That means @{thm (lhs) lhss_def}~@{text "\<equiv> {X | (X, rhs) \<in> ES}"} 
+  form @{term "Trn X r"}. That means formally @{thm (lhs) lhss_def}~@{text "\<equiv> {X | (X, rhs) \<in> ES}"} 
   and @{thm (lhs) rhss_def}~@{text "\<equiv> {X | (X, r) \<in> rhs}"}.
   
 
@@ -804,7 +805,7 @@
   and that @{text "invariant {(X, rhs)}"} holds, provided the condition @{text "Cond"}
   does not holds. By the stronger invariant we know there exists such a @{text "rhs"}
   with @{term "(X, rhs) \<in> ES"}. Because @{text Cond} is not true, we know the cardinality
-  of @{text ES} is @{text 1}. This means @{text "ES"} must actually be the equation @{text "X = rhs"},
+  of @{text ES} is @{text 1}. This means @{text "ES"} must actually be the set @{text "{(X, rhs)}"},
   for which the invariant holds. This allows us to conclude that 
   @{term "Solve X (Init (UNIV // \<approx>A)) = {(X, rhs)}"} and @{term "invariant {(X, rhs)}"} hold,
   as needed.\qed
@@ -824,7 +825,7 @@
   and that the invariant holds for this equation. That means we 
   know @{text "X = \<Union>\<calL> ` rhs"}. We further know that
   this is equal to \mbox{@{text "\<Union>\<calL> ` (Arden X rhs)"}} using the properties of the 
-  invariant and Lem.\ref{ardenable}. Using the validity property for the equation @{text "X = rhs"},
+  invariant and Lem.~\ref{ardenable}. Using the validity property for the equation @{text "X = rhs"},
   we can infer that @{term "rhss rhs \<subseteq> {X}"} and because the arden operation
   removes that @{text X} from @{text rhs}, that @{term "rhss (Arden X rhs) = {}"}.
   This means the right-hand side @{term "Arden X rhs"} can only consist of terms of the form @{term "Lam r"}.
@@ -837,10 +838,10 @@
   of the Myhill-Nerode theorem.
 
   \begin{proof}[of Thm.~\ref{myhillnerodeone}]
-  By Lem.~\ref{every_eqcl_has_reg} we know that there exists a regular language for
+  By Lem.~\ref{every_eqcl_has_reg} we know that there exists a regular expression for
   every equivalence class in @{term "UNIV // \<approx>A"}. Since @{text "finals A"} is
   a subset of  @{term "UNIV // \<approx>A"}, we also know that for every equivalence class
-  in @{term "finals A"} there exists a regular language. Moreover by assumption 
+  in @{term "finals A"} there exists a regular expression. Moreover by assumption 
   we know that @{term "finals A"} must be finite, and therefore there must be a finite
   set of regular expressions @{text "rs"} such that
 
@@ -896,14 +897,14 @@
   \emph{tagging functions} defined over strings. Given the inductive hypothesis, it will 
   be easy to prove that the range of these tagging functions is finite
   (the range of a function @{text f} is defined as @{text "range f \<equiv> f ` UNIV"}).
-  With this we will be able to infer that the tagging functions, seen as a relation,
+  With this we will be able to infer that the tagging functions, seen as relations,
   give rise to finitely many equivalence classes of @{const UNIV}. Finally we 
   will show that the tagging relation is more refined than @{term "\<approx>(L r)"}, which
-  implies that @{term "UNIV // \<approx>(L r)"} must also be finite. 
-  A relation @{text "R\<^isub>1"} is said to \emph{refine} @{text "R\<^isub>2"} provided @{text "R\<^isub>1 \<subseteq> R\<^isub>2"}.
-  We formally define the notion of a \emph{tag-relation} as follows.
+  implies that @{term "UNIV // \<approx>(L r)"} must also be finite (a relation @{text "R\<^isub>1"} 
+  is said to \emph{refine} @{text "R\<^isub>2"} provided @{text "R\<^isub>1 \<subseteq> R\<^isub>2"}).
+  We formally define the notion of a \emph{tagging-relation} as follows.
 
-  \begin{definition}[Tagging-Relation] Given a tag-function @{text tag}, then two strings @{text x}
+  \begin{definition}[Tagging-Relation] Given a tagging-function @{text tag}, then two strings @{text x}
   and @{text y} are \emph{tag-related} provided
   \begin{center}
   @{text "x =tag= y \<equiv> tag x = tag y"}
@@ -911,7 +912,7 @@
   \end{definition}
 
   \noindent
-  In order to establish finiteness of a set @{text A} we shall use the following powerful
+  In order to establish finiteness of a set @{text A}, we shall use the following powerful
   principle from Isabelle/HOL's library.
   %
   \begin{equation}\label{finiteimageD}
@@ -919,7 +920,7 @@
   \end{equation}
 
   \noindent
-  It states that if an image of a set under an injective function @{text f} (over this set) 
+  It states that if an image of a set under an injective function @{text f} (injective over this set) 
   is finite, then @{text A} itself must be finite. We can use it to establish the following 
   two lemmas.
 
@@ -929,24 +930,25 @@
 
   \begin{proof}
   We set in \eqref{finiteimageD}, @{text f} to be @{text "X \<mapsto> tag ` X"}. We have
-  @{text "range f"} to be subset of @{term "Pow (range tag)"}, which we know must be
+  @{text "range f"} to be a subset of @{term "Pow (range tag)"}, which we know must be
   finite by assumption. Now @{term "f (UNIV // =tag=)"} is a subset of @{text "range f"},
   and so also finite. Injectivity amounts to showing that @{text "X = Y"} under the
   assumptions that @{text "X, Y \<in> "}~@{term "UNIV // =tag="} and @{text "f X = f Y"}.
-  From the assumption we can obtain a @{text "x \<in> X"} and @{text "y \<in> Y"} with 
-  @{text "tag x = tag y"}. This in turn means that the equivalence classes @{text X}
+  From the assumption we can obtain @{text "x \<in> X"} and @{text "y \<in> Y"} with 
+  @{text "tag x = tag y"}. Since @{text x} and @{text y} are tag-related, this in 
+  turn means that the equivalence classes @{text X}
   and @{text Y} must be equal.\qed
   \end{proof}
 
   \begin{lemma}\label{fintwo} 
-  Given two equivalence relations @{text "R\<^isub>1"} and @{text "R\<^isub>2"}, where
+  Given two equivalence relations @{text "R\<^isub>1"} and @{text "R\<^isub>2"}, whereby
   @{text "R\<^isub>1"} refines @{text "R\<^isub>2"}.
   If @{thm (prem 1) refined_partition_finite[where ?R1.0="R\<^isub>1" and ?R2.0="R\<^isub>2"]}
   then @{thm (concl) refined_partition_finite[where ?R1.0="R\<^isub>1" and ?R2.0="R\<^isub>2"]}.
   \end{lemma}
 
   \begin{proof}
-  We prove this lemma again using \eqref{finiteimageD}. For this we set @{text f} to
+  We prove this lemma again using \eqref{finiteimageD}. This time we set @{text f} to
   be @{text "X \<mapsto>"}~@{term "{R\<^isub>1 `` {x} | x. x \<in> X}"}. It is easy to see that 
   @{text "finite (f ` (UNIV // R\<^isub>2))"} because it is a subset of @{text "Pow (UNIV // R\<^isub>1)"},
   which is finite by assumption. What remains to be shown is that @{text f} is injective
@@ -956,8 +958,8 @@
   @{text "x \<in> X"} and @{text "y \<in> Y"} such that they are @{text R\<^isub>2} related.
   We know there exists a @{text x} with @{term "X = R\<^isub>2 `` {x}"}
   and @{text "x \<in> X"}. From the latter fact we can infer that @{term "R\<^isub>1 ``{x} \<in> f X"}
-  and further @{term "R\<^isub>1 ``{x} \<in> f Y"}. From this we can obtain a @{text y}
-  such that @{term "R\<^isub>1 `` {x} = R\<^isub>1 `` {y}"} holds. This means @{text x} and @{text y}
+  and further @{term "R\<^isub>1 ``{x} \<in> f Y"}. This means we can obtain a @{text y}
+  such that @{term "R\<^isub>1 `` {x} = R\<^isub>1 `` {y}"} holds. Consequently @{text x} and @{text y}
   are @{text "R\<^isub>1"}-related. Since by assumption @{text "R\<^isub>1"} refines @{text "R\<^isub>2"},
   they must also be @{text "R\<^isub>2"}-related, as we need to show.\qed
   \end{proof}
@@ -966,7 +968,7 @@
   Chaining Lem.~\ref{finone} and \ref{fintwo} together, means in order to show
   that @{term "UNIV // \<approx>(L r)"} is finite, we have to find a tagging function whose
   range can be shown to be finite and whose tagging-relation refines @{term "\<approx>(L r)"}.
-  Let us attempt the @{const ALT}-case.
+  Let us attempt the @{const ALT}-case first.
 
   \begin{proof}[@{const "ALT"}-Case] 
   We take as tagging function 
@@ -979,7 +981,7 @@
   where @{text "A"} and @{text "B"} are some arbitrary languages.
   We can show in general, if @{term "finite (UNIV // \<approx>A)"} and @{term "finite (UNIV // \<approx>B)"}
   then @{term "finite ((UNIV // \<approx>A) \<times> (UNIV // \<approx>B))"} holds. The range of
-  @{term "tag_str_ALT A B"} is a subset of this product set. It remains to show
+  @{term "tag_str_ALT A B"} is a subset of this product set, so finite. It remains to be shown
   that @{text "=tag\<^isub>A\<^isub>L\<^isub>T A B="} refines @{term "\<approx>(A \<union> B)"}. This amounts to 
   showing
   %
@@ -996,15 +998,27 @@
 
   \noindent
   since both @{text "=tag\<^isub>A\<^isub>L\<^isub>T A B="} and @{term "\<approx>(A \<union> B)"} are symmetric. To solve
-  \eqref{pattern} we just have to unfold the tagging-relation and analyse 
-  in which set, @{text A} or @{text B}, the string @{term "x @ z"} is. Fianlly we
+  \eqref{pattern} we just have to unfold the definition of the tagging-relation and analyse 
+  in which set, @{text A} or @{text B}, the string @{term "x @ z"} is. 
+  The definition of the tagging-function will give us in each case the
+  information to infer that @{text "y @ z \<in> A \<union> B"}.
+  Finally we
   can discharge this case by setting @{text A} to @{term "L r\<^isub>1"} and @{text B} to @{term "L r\<^isub>2"}.\qed
   \end{proof}
 
 
   \noindent
   The pattern in \eqref{pattern} is repeated for the other two cases. Unfortunately,
-  they are slightly more complicated. 
+  they are slightly more complicated. In the @{const SEQ}-case we essentially have
+  to be able to infer that 
+
+  \begin{center}
+  @{term "x @ z \<in> A ;; B \<longrightarrow> y @ z \<in> A ;; B"}
+  \end{center}
+
+  \noindent
+  using the information given by the appropriate tagging function. The complication 
+  is to find out what the possible splits of @{text "x @ z"} are to be in @{term "A ;; B"}.
 
 
   \begin{proof}[@{const SEQ}-Case]
@@ -1059,14 +1073,17 @@
 
   While regular expressions are convenient in formalisations, they have some
   limitations. One is that there seems to be no method of calculating a
-  minimal regular expression (for example in terms of length), like there is
-  for automata. For an implementation of a simple regular expression matcher,
+  minimal regular expression (for example in terms of length) for a regular
+  language, like there is
+  for automata. On the other hand, efficient regular expression matching,
+  without using automata, poses no problem \cite{OwensReppyTuron09}.
+  For an implementation of a simple regular expression matcher,
   whose correctness has been formally established, we refer the reader to
   Owens and Slind \cite{OwensSlind08}.
 
 
   Our formalisation consists of 790 lines of Isabelle/Isar code for the first
-  direction and 475 for the second, plus 300 lines of standard material about
+  direction and 475 for the second, plus around 300 lines of standard material about
   regular languages. While this might be seen as too large to count as a
   concise proof pearl, this should be seen in the context of the work done by
   Constable at al \cite{Constable00} who formalised the Myhill-Nerode theorem
@@ -1082,7 +1099,7 @@
   Mercurial Repository at
 
   \begin{center}
-  \url{http://www4.in.tum.de/~urbanc/cgi-bin/repos.cgi/regexp/}
+  \url{http://www4.in.tum.de/~urbanc/regexp.html}
   \end{center}
 
 
@@ -1094,8 +1111,8 @@
   classes with the states of the automaton, then the most natural choice is to
   characterise each state with the set of strings starting from the initial
   state leading up to that state. Usually, however, the states are characterised as the
-  ones starting from that state leading to the terminal states.  The first
-  choice has consequences how the initial equational system is set up. We have
+  strings starting from that state leading to the terminal states.  The first
+  choice has consequences about how the initial equational system is set up. We have
   the $\lambda$-term on our `initial state', while Brzozowski has it on the
   terminal states. This means we also need to reverse the direction of Arden's
   lemma.
@@ -1106,17 +1123,18 @@
   expressions and shows that there can be only finitely many of them. We could
   have used as the tag of a string @{text s} the derivative of a regular expression
   generated with respect to @{text s}.  Using the fact that two strings are
-  Myhill-Nerode related whenever their derivative is the same together with
+  Myhill-Nerode related whenever their derivative is the same, together with
   the fact that there are only finitely many derivatives for a regular
   expression would give us a similar argument as ours. However it seems not so easy to
   calculate the derivatives and then to count them. Therefore we preferred our
-  direct method of using tagging-functions involving equivalence classes. This
+  direct method of using tagging-functions. This
   is also where our method shines, because we can completely side-step the
   standard argument \cite{Kozen97} where automata need to be composed, which
   as stated in the Introduction is not so convenient to formalise in a 
   HOL-based theorem prover. However, it is also the direction where we had to 
-  spend most of the `conceptual' time, as our proof-argument based on tagging functions
-  is new for establishing the Myhill-Nerode theorem.
+  spend most of the `conceptual' time, as our proof-argument based on tagging-functions
+  is new for establishing the Myhill-Nerode theorem. All standard proofs 
+  of this direction proceed by arguments over automata.
 
 *}
changeset 123	23c0e6f2929d
parent 122	ab6637008963
child 124	8233510cab6c