diff -r 3c9129f49846 -r 990c12ab1562 Paper/Paper.thy
--- a/Paper/Paper.thy	Thu Apr 28 03:24:20 2011 +0000
+++ b/Paper/Paper.thy	Wed May 04 07:05:59 2011 +0000
@@ -76,8 +76,8 @@
   HOLlight support them with libraries. Even worse, reasoning about graphs and
   matrices can be a real hassle in HOL-based theorem provers.  Consider for
   example the operation of sequencing two automata, say $A_1$ and $A_2$, by
-  connecting the accepting states of $A_1$ to the initial state of $A_2$:  
-  
+  connecting the accepting states of $A_1$ to the initial state of $A_2$:\\[-5.5mm]  
+  %  
   \begin{center}
   \begin{tabular}{ccc}
   \begin{tikzpicture}[scale=0.8]
@@ -199,7 +199,7 @@
   are carried out in Nuprl \cite{Constable00} and in Coq \cite{Filliatre97}.
   
   In this paper, we will not attempt to formalise automata theory in
-  Isabelle/HOL, but take a completely different approach to regular
+  Isabelle/HOL, but take a different approach to regular
   languages. Instead of defining a regular language as one where there exists
   an automaton that recognises all strings of the language, we define a
   regular language as:
@@ -225,7 +225,7 @@
   \noindent
   {\bf Contributions:} 
   There is an extensive literature on regular languages.
-  To our knowledge, our proof of the Myhill-Nerode theorem is the
+  To our best knowledge, our proof of the Myhill-Nerode theorem is the
   first that is based on regular expressions, only. We prove the part of this theorem 
   stating that a regular expression has only finitely many partitions using certain 
   tagging-functions. Again to our best knowledge, these tagging-functions have
@@ -465,14 +465,18 @@
   \noindent
   where the terms @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"}
   stand for all transitions @{term "Y\<^isub>i\<^isub>j \<Turnstile>c\<^isub>i\<^isub>j\<Rightarrow>
-  X\<^isub>i"}. If viewed as an automaton, then every equation @{text "X\<^isub>i = rhs\<^isub>i"} in this system
-  corresponds roughly to a state whose name is @{text X\<^isub>i} and its predecessor states 
-  are the @{text "Y\<^isub>i\<^isub>j"}; the @{text "c\<^isub>i\<^isub>j"} are the labels of the transitions from these 
-  predecessor states to @{text X\<^isub>i}. In our initial equation system there can only be
+  X\<^isub>i"}. 
+  %The intuition behind the equational system is that every 
+  %equation @{text "X\<^isub>i = rhs\<^isub>i"} in this system
+  %corresponds roughly to a state of an automaton whose name is @{text X\<^isub>i} and its predecessor states 
+  %are the @{text "Y\<^isub>i\<^isub>j"}; the @{text "c\<^isub>i\<^isub>j"} are the labels of the transitions from these 
+  %predecessor states to @{text X\<^isub>i}. 
+  There can only be
   finitely many terms of the form @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"} in a right-hand side 
   since by assumption there are only finitely many
-  equivalence classes and only finitely many characters. The term @{text
-  "\<lambda>(EMPTY)"} in the first equation acts as a marker for the equivalence class
+  equivalence classes and only finitely many characters.
+  The term @{text "\<lambda>(EMPTY)"} in the first equation acts as a marker for the initial state, that
+  is the equivalence class
   containing @{text "[]"}.\footnote{Note that we mark, roughly speaking, the
   single `initial' state in the equational system, which is different from
   the method by Brzozowski \cite{Brzozowski64}, where he marks the
@@ -481,6 +485,10 @@
   transitions---the successor `state' of an equivalence class @{text Y} can
   be reached by adding a character to the end of @{text Y}. This is also the
   reason why we have to use our reverse version of Arden's Lemma.}
+  %In our initial equation system there can only be
+  %finitely many terms of the form @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"} in a right-hand side 
+  %since by assumption there are only finitely many
+  %equivalence classes and only finitely many characters. 
   Overloading the function @{text \<calL>} for the two kinds of terms in the
   equational system, we have
   
@@ -501,7 +509,7 @@
   hold. Similarly for @{text "X\<^isub>1"} we can show the following equation
   %
   \begin{equation}\label{inv2}
-  @{text "X\<^isub>1 = \<calL>(Y\<^isub>i\<^isub>1, CHAR c\<^isub>i\<^isub>1) \<union> \<dots> \<union> \<calL>(Y\<^isub>i\<^isub>p, CHAR c\<^isub>i\<^isub>p) \<union> \<calL>(\<lambda>(EMPTY))"}.
+  @{text "X\<^isub>1 = \<calL>(Y\<^isub>1\<^isub>1, CHAR c\<^isub>1\<^isub>1) \<union> \<dots> \<union> \<calL>(Y\<^isub>1\<^isub>p, CHAR c\<^isub>1\<^isub>p) \<union> \<calL>(\<lambda>(EMPTY))"}.
   \end{equation}
 
   \noindent
@@ -653,7 +661,7 @@
 
   \noindent
   The last definition we need applies @{term Iter} over and over until a condition 
-  @{text Cond} is \emph{not} satisfied anymore. The condition states that there
+  @{text Cond} is \emph{not} satisfied anymore. This condition states that there
   are more than one equation left in the equational system @{text ES}. To solve
   an equational system we use Isabelle/HOL's @{text while}-operator as follows:
   
@@ -854,12 +862,7 @@
   in @{term "finals A"} there exists a regular expression. Moreover by assumption 
   we know that @{term "finals A"} must be finite, and therefore there must be a finite
   set of regular expressions @{text "rs"} such that
-
-  \begin{center}
-  @{term "\<Union>(finals A) = L (\<Uplus>rs)"}
-  \end{center}
-
-  \noindent
+  @{term "\<Union>(finals A) = L (\<Uplus>rs)"}.
   Since the left-hand side is equal to @{text A}, we can use @{term "\<Uplus>rs"} 
   as the regular expression that is needed in the theorem.\qed
   \end{proof}
@@ -917,7 +920,7 @@
   \begin{definition}[Tagging-Relation] Given a tagging-function @{text tag}, then two strings @{text x}
   and @{text y} are \emph{tag-related} provided
   \begin{center}
-  @{text "x =tag= y \<equiv> tag x = tag y"}.
+  @{text "x =tag= y \<equiv> tag x = tag y"}\;.
   \end{center}
   \end{definition}
 
@@ -1041,11 +1044,9 @@
   and \emph{string subtraction}:
   %
   \begin{center}
-  \begin{tabular}{r@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
-  @{text "[] - y"} & @{text "\<equiv>"} & @{text "[]"}\\
-  @{text "x - []"} & @{text "\<equiv>"} & @{text x}\\
-  @{text "cx - dy"} & @{text "\<equiv>"} & @{text "if c = d then x - y else cx"}\\
-  \end{tabular}
+  @{text "[] - y \<equiv> []"}\hspace{10mm}
+  @{text "x - [] \<equiv> x"}\hspace{10mm}
+  @{text "cx - dy \<equiv> if c = d then x - y else cx"}
   \end{center}
   %
   \noindent
@@ -1055,11 +1056,12 @@
   this string to be in @{term "A ;; B"}:
   %
   \begin{center}
+  \begin{tabular}{@ {}c@ {\hspace{10mm}}c@ {}}
   \scalebox{0.7}{
   \begin{tikzpicture}
-    \node[draw,minimum height=3.8ex] (xa) { $\hspace{4em}@{text "x'"}\hspace{4em}$ };
-    \node[draw,minimum height=3.8ex, right=-0.03em of xa] (xxa) { $\hspace{0.5em}@{text "x - x'"}\hspace{0.5em}$ };
-    \node[draw,minimum height=3.8ex, right=-0.03em of xxa] (z) { $\hspace{10.1em}@{text z}\hspace{10.1em}$ };
+    \node[draw,minimum height=3.8ex] (xa) { $\hspace{3em}@{text "x'"}\hspace{3em}$ };
+    \node[draw,minimum height=3.8ex, right=-0.03em of xa] (xxa) { $\hspace{0.2em}@{text "x - x'"}\hspace{0.2em}$ };
+    \node[draw,minimum height=3.8ex, right=-0.03em of xxa] (z) { $\hspace{5em}@{text z}\hspace{5em}$ };
 
     \draw[decoration={brace,transform={yscale=3}},decorate]
            (xa.north west) -- ($(xxa.north east)+(0em,0em)$)
@@ -1081,12 +1083,12 @@
            ($(xa.south east)+(0em,0ex)$) -- ($(xa.south west)+(0em,0ex)$)
                node[midway, below=0.5em]{@{term "x' \<in> A"}};
   \end{tikzpicture}}
-
+  &
   \scalebox{0.7}{
   \begin{tikzpicture}
-    \node[draw,minimum height=3.8ex] (x) { $\hspace{6.5em}@{text x}\hspace{6.5em}$ };
-    \node[draw,minimum height=3.8ex, right=-0.03em of x] (za) { $\hspace{2em}@{text "z'"}\hspace{2em}$ };
-    \node[draw,minimum height=3.8ex, right=-0.03em of za] (zza) { $\hspace{6.1em}@{text "z - z'"}\hspace{6.1em}$  };
+    \node[draw,minimum height=3.8ex] (x) { $\hspace{4.8em}@{text x}\hspace{4.8em}$ };
+    \node[draw,minimum height=3.8ex, right=-0.03em of x] (za) { $\hspace{0.6em}@{text "z'"}\hspace{0.6em}$ };
+    \node[draw,minimum height=3.8ex, right=-0.03em of za] (zza) { $\hspace{2.6em}@{text "z - z'"}\hspace{2.6em}$  };
 
     \draw[decoration={brace,transform={yscale=3}},decorate]
            (x.north west) -- ($(za.north west)+(0em,0em)$)
@@ -1108,6 +1110,7 @@
            ($(zza.south east)+(0em,0ex)$) -- ($(za.south east)+(0em,0ex)$)
                node[midway, below=0.5em]{@{text "(z - z') \<in> B"}};
   \end{tikzpicture}}
+  \end{tabular}
   \end{center}
   %
   \noindent
@@ -1307,7 +1310,8 @@
   \noindent
   holds for any strings @{text "s\<^isub>1"} and @{text
   "s\<^isub>2"}. Therefore @{text A} and the complement language @{term "-A"} give rise to the same
-  partitions.  Proving the existence of such a regular expression via automata would 
+  partitions.  Proving the existence of such a regular expression via automata 
+  using the standard method would 
   be quite involved. It includes the
   steps: regular expression @{text "\<Rightarrow>"} non-deterministic automaton @{text
   "\<Rightarrow>"} deterministic automaton @{text "\<Rightarrow>"} complement automaton @{text "\<Rightarrow>"}
@@ -1359,22 +1363,28 @@
   We briefly considered using the method Brzozowski presented in the Appendix
   of~\cite{Brzozowski64} in order to prove the second direction of the
   Myhill-Nerode theorem. There he calculates the derivatives for regular
-  expressions and shows that there can be only finitely many of them with 
-  respect to a language (if regarded equal modulo ACI). We could
-  have used as the tag of a string @{text s} the set of derivatives of a regular expression
-  generated by a language.  Using the fact that two strings are
+  expressions and shows that for every language there can be only 
+  finitely many of them %derivations
+  (if regarded equal modulo ACI). We could
+  have used as tagging-function the set of derivatives of a regular expression
+  with respect to a language.  Using the fact that two strings are
   Myhill-Nerode related whenever their derivative is the same, together with
   the fact that there are only finitely such derivatives
   would give us a similar argument as ours. However it seems not so easy to
-  calculate the set of derivatives modulo ACI and then to count them. Therefore we preferred our
+  calculate the set of derivatives modulo ACI. Therefore we preferred our
   direct method of using tagging-functions. This
   is also where our method shines, because we can completely side-step the
   standard argument \cite{Kozen97} where automata need to be composed, which
-  as stated in the Introduction is not so convenient to formalise in a 
+  as stated in the Introduction is not so easy to formalise in a 
   HOL-based theorem prover. However, it is also the direction where we had to 
   spend most of the `conceptual' time, as our proof-argument based on tagging-functions
   is new for establishing the Myhill-Nerode theorem. All standard proofs 
-  of this direction proceed by arguments over automata.
+  of this direction use %proceed by 
+  arguments over automata.\\[-6mm]%\medskip
+  %
+  %\noindent
+  %{\bf Acknowledgements:} We are grateful for the comments we received from Larry
+  %Paulson and the referees of the paper.
 
 *}