Paper/Paper.thy
changeset 156 fd39492b187c
parent 154 7c68b9ad4486
child 157 10d2d0cbe381
--- a/Paper/Paper.thy	Tue Apr 19 02:25:21 2011 +0000
+++ b/Paper/Paper.thy	Thu Apr 21 12:07:11 2011 +0000
@@ -265,7 +265,7 @@
 
   \noindent
   In @{text "(ii)"} we use the notation @{term "length s"} for the length of a
-  string; this property states that if @{term "[] \<notin> A"} then the lengths of
+  string; this property states that if \mbox{@{term "[] \<notin> A"}} then the lengths of
   the strings in @{term "A \<up> (Suc n)"} must be longer than @{text n}.  We omit
   the proofs for these properties, but invite the reader to consult our
   formalisation.\footnote{Available at \url{http://www4.in.tum.de/~urbanc/regexp.html}}
@@ -273,14 +273,14 @@
   The notation in Isabelle/HOL for the quotient of a language @{text A} according to an 
   equivalence relation @{term REL} is @{term "A // REL"}. We will write 
   @{text "\<lbrakk>x\<rbrakk>\<^isub>\<approx>"} for the equivalence class defined 
-  as @{text "{y | y \<approx> x}"}.
+  as \mbox{@{text "{y | y \<approx> x}"}}.
 
 
   Central to our proof will be the solution of equational systems
-  involving equivalence classes of languages. For this we will use Arden's lemma \cite{Brzozowski64},
+  involving equivalence classes of languages. For this we will use Arden's Lemma \cite{Brzozowski64},
   which solves equations of the form @{term "X = A ;; X \<union> B"} provided
   @{term "[] \<notin> A"}. However we will need the following `reverse' 
-  version of Arden's lemma (`reverse' in the sense of changing the order of @{term "A ;; X"} to
+  version of Arden's Lemma (`reverse' in the sense of changing the order of @{term "A ;; X"} to
   \mbox{@{term "X ;; A"}}).
 
   \begin{lemma}[Reverse Arden's Lemma]\label{arden}\mbox{}\\
@@ -446,8 +446,9 @@
   @{term "X\<^isub>1 \<Turnstile>c\<Rightarrow> X\<^isub>2"}, @{term "X\<^isub>1 \<Turnstile>d\<Rightarrow> X\<^isub>3"} with @{text d} being any 
   other character than @{text c}, and @{term "X\<^isub>3 \<Turnstile>d\<Rightarrow> X\<^isub>3"} for any @{text d}.
   
-  Next we build an \emph{initial equational system} that
-  contains an equation for each equivalence class. Suppose we have 
+  Next we construct an \emph{initial equational system} that
+  contains an equation for each equivalence class. We first give
+  an informal description of this construction. Suppose we have 
   the equivalence classes @{text "X\<^isub>1,\<dots>,X\<^isub>n"}, there must be one and only one that
   contains the empty string @{text "[]"} (since equivalence classes are disjoint).
   Let us assume @{text "[] \<in> X\<^isub>1"}. We build the following equational system
@@ -464,9 +465,13 @@
   \noindent
   where the terms @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"}
   stand for all transitions @{term "Y\<^isub>i\<^isub>j \<Turnstile>c\<^isub>i\<^isub>j\<Rightarrow>
-  X\<^isub>i"}.   There can only be
-  finitely many such terms in a right-hand side since by assumption there are only finitely many
-  equivalence classes and only finitely many characters.  The term @{text
+  X\<^isub>i"}. In terms of finite automata, every equation @{text "X\<^isub>i = rhs\<^isub>i"} in this system
+  corresponds very roughly to a state whose name is @{text X\<^isub>i} and its predecessor states 
+  are the @{text "Y\<^isub>i\<^isub>j"}; the @{text "c\<^isub>i\<^isub>j"} are the labels of the transitions from these 
+  predecessor states to @{text X\<^isub>i}. In our initial equation system there can only be
+  finitely many terms of the form @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"} in a right-hand side 
+  since by assumption there are only finitely many
+  equivalence classes and only finitely many characters. The term @{text
   "\<lambda>(EMPTY)"} in the first equation acts as a marker for the equivalence class
   containing @{text "[]"}.\footnote{Note that we mark, roughly speaking, the
   single `initial' state in the equational system, which is different from
@@ -475,7 +480,7 @@
   way, because the Myhill-Nerode relation determines the `direction' of the
   transitions---the successor `state' of an equivalence class @{text Y} can
   be reached by adding a character to the end of @{text Y}. This is also the
-  reason why we have to use our reverse version of Arden's lemma.}
+  reason why we have to use our reverse version of Arden's Lemma.}
   Overloading the function @{text \<calL>} for the two kinds of terms in the
   equational system, we have
   
@@ -571,8 +576,8 @@
   then we calculate the combined regular expressions for all @{text r} coming 
   from the deleted @{text "(X, r)"}, and take the @{const STAR} of it;
   finally we append this regular expression to @{text rhs'}. It can be easily seen 
-  that this operation mimics Arden's lemma on the level of equations. To ensure
-  the non-emptiness condition of Arden's lemma we say that a right-hand side is
+  that this operation mimics Arden's Lemma on the level of equations. To ensure
+  the non-emptiness condition of Arden's Lemma we say that a right-hand side is
   @{text ardenable} provided
 
   \begin{center}
@@ -580,7 +585,7 @@
   \end{center}
 
   \noindent
-  This allows us to prove a version of Arden's lemma on the level of equations.
+  This allows us to prove a version of Arden's Lemma on the level of equations.
 
   \begin{lemma}\label{ardenable}
   Given an equation @{text "X = rhs"}.
@@ -591,7 +596,9 @@
   \end{lemma}
   
   \noindent
-  The \emph{substitution-operation} takes an equation
+  Our @{text ardenable} condition is slightly stronger than needed for applying Arden's Lemma,
+  but we can still ensure that it holds troughout our algorithm of transforming equations
+  into solved form. The \emph{substitution-operation} takes an equation
   of the form @{text "X = xrhs"} and substitutes it into the right-hand side @{text rhs}.
 
   \begin{center}
@@ -710,7 +717,7 @@
   equations is preserved under our transformations. The other properties are a bit more
   technical, but are needed to get our proof through. Distinctness states that every
   equation in the system is distinct. @{text Ardenable} ensures that we can always
-  apply the arden operation. 
+  apply the @{text Arden} operation. 
   The last property states that every @{text rhs} can only contain equivalence classes
   for which there is an equation. Therefore @{text lhss} is just the set containing 
   the first components of an equational system,
@@ -742,7 +749,7 @@
   \end{lemma}
 
   \begin{proof} 
-  This boils down to choosing an equation @{text "Y = yrhs"} to be eliminated
+  The argument boils down to choosing an equation @{text "Y = yrhs"} to be eliminated
   and to show that @{term "Subst_all (ES - {(Y, yrhs)}) Y (Arden Y yrhs)"} 
   preserves the invariant.
   We prove this as follows:
@@ -753,7 +760,7 @@
   \end{center}
 
   \noindent
-  Finiteness is straightforward, as @{const Subst} and @{const Arden} operations 
+  Finiteness is straightforward, as the @{const Subst} and @{const Arden} operations 
   keep the equational system finite. These operations also preserve soundness 
   and distinctness (we proved soundness for @{const Arden} in Lem.~\ref{ardenable}).
   The property @{text ardenable} is clearly preserved because the append-operation
@@ -829,7 +836,7 @@
   know @{text "X = \<Union>\<calL> ` rhs"}. We further know that
   this is equal to \mbox{@{text "\<Union>\<calL> ` (Arden X rhs)"}} using the properties of the 
   invariant and Lem.~\ref{ardenable}. Using the validity property for the equation @{text "X = rhs"},
-  we can infer that @{term "rhss rhs \<subseteq> {X}"} and because the arden operation
+  we can infer that @{term "rhss rhs \<subseteq> {X}"} and because the @{text Arden} operation
   removes that @{text X} from @{text rhs}, that @{term "rhss (Arden X rhs) = {}"}.
   This means the right-hand side @{term "Arden X rhs"} can only consist of terms of the form @{term "Lam r"}.
   So we can collect those (finitely many) regular expressions @{text rs} and have @{term "X = L (\<Uplus>rs)"}.
@@ -1104,8 +1111,8 @@
   \end{center}
   %
   \noindent
-  Either there is a prefix of @{text x} in @{text A} and the rest is in @{text B},
-  or @{text x} and a prefix of @{text "z"} is in @{text A} and the rest in @{text B}.
+  Either there is a prefix of @{text x} in @{text A} and the rest is in @{text B} (first picture),
+  or @{text x} and a prefix of @{text "z"} is in @{text A} and the rest in @{text B} (second picture).
   In both cases we have to show that @{term "y @ z \<in> A ;; B"}. For this we use the 
   following tagging-function
   %
@@ -1347,19 +1354,19 @@
   choice has consequences about how the initial equational system is set up. We have
   the $\lambda$-term on our `initial state', while Brzozowski has it on the
   terminal states. This means we also need to reverse the direction of Arden's
-  lemma.
+  Lemma.
 
   We briefly considered using the method Brzozowski presented in the Appendix
   of~\cite{Brzozowski64} in order to prove the second direction of the
   Myhill-Nerode theorem. There he calculates the derivatives for regular
-  expressions and shows that there can be only finitely many of them (if regarded equal 
-  modulo ACI). We could
-  have used as the tag of a string @{text s} the derivative of a regular expression
-  generated with respect to @{text s}.  Using the fact that two strings are
+  expressions and shows that there can be only finitely many of them with 
+  respect to a language (if regarded equal modulo ACI). We could
+  have used as the tag of a string @{text s} the set of derivatives of a regular expression
+  generated by a language.  Using the fact that two strings are
   Myhill-Nerode related whenever their derivative is the same, together with
-  the fact that there are only finitely many derivatives for a regular
-  expression would give us a similar argument as ours. However it seems not so easy to
-  calculate the derivatives and then to count them. Therefore we preferred our
+  the fact that there are only finitely such derivatives
+  would give us a similar argument as ours. However it seems not so easy to
+  calculate the set of derivatives modulo ACI and then to count them. Therefore we preferred our
   direct method of using tagging-functions. This
   is also where our method shines, because we can completely side-step the
   standard argument \cite{Kozen97} where automata need to be composed, which