Journal/Paper.thy
changeset 379 8c4b6fb43ebe
parent 378 a0bcf886b8ef
child 381 99161cd17c0f
--- a/Journal/Paper.thy	Fri Jul 05 12:07:48 2013 +0100
+++ b/Journal/Paper.thy	Fri Jul 05 17:19:17 2013 +0100
@@ -3,6 +3,18 @@
 imports "../Closures2" "../Attic/Prefix_subtract" 
 begin
 
+lemma nullable_iff:
+  shows "nullable r \<longleftrightarrow> [] \<in> lang r"
+by (induct r) (auto simp add: conc_def split: if_splits)
+
+lemma Deriv_deriv:
+  shows "Deriv c (lang r) = lang (deriv c r)"
+by (induct r) (simp_all add: nullable_iff)
+
+lemma Derivs_derivs:
+  shows "Derivs s (lang r) = lang (derivs s r)"
+by (induct s arbitrary: r) (simp_all add: Deriv_deriv)
+
 declare [[show_question_marks = false]]
 
 consts
@@ -218,7 +230,7 @@
 
   A popular choice for a theorem prover would be one based on Higher-Order
   Logic (HOL), for example HOL4, HOLlight or Isabelle/HOL. For the development
-  presented in this paper we will use the Isabelle/HOL. HOL is a predicate calculus
+  presented in this paper we will use the Isabelle/HOL system. HOL is a predicate calculus
   that allows quantification over predicate variables. Its type system is
   based on the Simple Theory of Types by \citeN{Church40}.  Although many
   mathematical concepts can be conveniently expressed in HOL, there are some
@@ -231,10 +243,10 @@
   define most notions in terms of them.  For example, a regular language is
   normally defined as:
 
-  \begin{dfntn}\label{baddef}
+  \begin{definition}\label{baddef}
   A language @{text A} is \emph{regular}, provided there is a 
   finite deterministic automaton that recognises all strings of @{text "A"}.
-  \end{dfntn}
+  \end{definition}
 
   \noindent  
   This approach has many benefits. Among them is the fact that it is easy to
@@ -388,9 +400,9 @@
   the link between regular expressions and automata in the context of
   lexing. \citeN{BerghoferReiter09} use functions as well for formalising automata
   working over bit strings in the context of Presburger arithmetic.  
-  A Larger formalisation of automata theory, including the Myhill-Nerode theorem, 
+  A larger formalisation of automata theory, including the Myhill-Nerode theorem, 
   was carried out in Nuprl by \citeN{Constable00} which also uses functions.
-  Other large formailsations of automata theory in Coq are by 
+  Other large formalisations of automata theory in Coq are by 
   \citeN{Filliatre97} who essentially uses graphs and by \citeN{Almeidaetal10}
   and \citeN{Braibant12}, who both use matrices. Many of these works,
   like \citeN{Nipkow98} or \citeN{Braibant12}, mention intrinsic problems with 
@@ -423,24 +435,45 @@
   exists an automaton that recognises all its strings, we define a
   regular language as:
 
-  \begin{dfntn}\label{regular}
+  \begin{definition}\label{regular}
   A language @{text A} is \emph{regular}, provided there is a regular expression 
   that matches all strings of @{text "A"}.
-  \end{dfntn}
+  \end{definition}
   
   \noindent
-  And then `forget' automata completely.
-  The reason is that regular expressions %, unlike graphs, matrices and
-  %functions, 
-  can be defined as an inductive datatype and a reasoning
-  infrastructure for them (like induction and recursion) comes for free in
-  HOL. %Moreover, no side-conditions will be needed for regular expressions,
+  We then want to `forget' automata completely and see how far we come
+  with formalising results from regular language theory.  The reason 
+  is that regular expressions, unlike graphs, matrices and
+  functions, can be defined as an inductive datatype and a reasoning
+  infrastructure for them (like induction and recursion) comes for
+  free in HOL. But this question whether formal language theory can be
+  done without automata crops up also in non-theorem prover
+  contexts. For example, Gasarch asked in the Computational Complexity
+  blog by \citeN{GasarchBlog} whether the complementation of
+  regular-expression languages can be proved without using
+  automata. He concluded
+
+  \begin{quote}
+  ``{\it \ldots it can't be done}''
+  \end{quote} 
+
+  \noindent
+  and even pondered
+
+  \begin{quote}
+  ``{\it \ldots [b]ut is there a rigorous way to even state that?}'' 
+  \end{quote} 
+
+  %Moreover, no side-conditions will be needed for regular expressions,
   %like we need for automata. 
-  This convenience of regular expressions has
+  \noindent
+  We will give an answer to these questions in this paper.
+
+  The convenience of regular expressions has
   recently been exploited in HOL4 with a formalisation of regular expression
   matching based on derivatives by \citeN{OwensSlind08}, and with an equivalence
   checker for regular expressions in Isabelle/HOL by \citeN{KraussNipkow11}
-  and in Matida by \citeN{Asperti12} and in Coq by \citeN{CoquandSiles12}.  The
+  and in Matita by \citeN{Asperti12} and in Coq by \citeN{CoquandSiles12}.  The
   main purpose of this paper is to show that a central result about regular
   languages---the Myhill-Nerode Theorem---can be recreated by only using
   regular expressions. This theorem gives necessary and sufficient conditions
@@ -458,13 +491,13 @@
   an argument about solving equational systems.  This argument appears to be
   folklore. For the other part, we give two proofs: one direct proof using
   certain tagging-functions, and another indirect proof using Antimirov's
-  partial derivatives \citeyear{Antimirov95}. Again to our best knowledge, the
+  partial derivatives (\citeyear{Antimirov95}). Again to our best knowledge, the
   tagging-functions have not been used before for establishing the Myhill-Nerode
   Theorem. Derivatives of regular expressions have been used recently quite
   widely in the literature; partial derivatives, in contrast, attract much
   less attention. However, partial derivatives are more suitable in the
-  context of the Myhill-Nerode Theorem, since it is easier to establish
-  formally their finiteness result. We are not aware of any proof that uses
+  context of the Myhill-Nerode Theorem, since it is easier to 
+  formally establish their finiteness result. We are not aware of any proof that uses
   either of them for proving the Myhill-Nerode Theorem.
 *}
 
@@ -499,16 +532,16 @@
   In the paper
   we will make use of the following properties of these constructions.
   
-  \begin{prpstn}\label{langprops}\mbox{}\\
+  \begin{proposition}\label{langprops}\mbox{}\\
   \begin{tabular}{@ {}lp{10cm}}
   (i)   & @{thm star_unfold_left}     \\ 
   (ii)  & @{thm[mode=IfThen] pow_length}\\
   (iii) & @{thm conc_Union_left} \\ 
   (iv)  & If @{thm (prem 1) star_decom} and @{thm (prem 2) star_decom} then
           there exists an @{text "x\<^isub>p"} and @{text "x\<^isub>s"} with @{text "x = x\<^isub>p @ x\<^isub>s"} 
-          and @{term "x\<^isub>p \<noteq> []"} such that @{term "x\<^isub>p \<in> A"} and @{term "x\<^isub>s \<in> A\<star>"}.
+          and \mbox{@{term "x\<^isub>p \<noteq> []"}} such that @{term "x\<^isub>p \<in> A"} and @{term "x\<^isub>s \<in> A\<star>"}.
   \end{tabular}
-  \end{prpstn}
+  \end{proposition}
 
   \noindent
   In @{text "(ii)"} we use the notation @{term "length s"} for the length of a
@@ -534,16 +567,19 @@
   which solves equations of the form @{term "X = A \<cdot> X \<union> B"} provided
   @{term "[] \<notin> A"}. However we will need the following `reversed' 
   version of Arden's Lemma (`reversed' in the sense of changing the order of @{term "A \<cdot> X"} to
-  \mbox{@{term "X \<cdot> A"}}).\footnote{The details of the proof for the reversed Arden's Lemma
-  are given in the Appendix.}
+  \mbox{@{term "X \<cdot> A"}}).
+  %\footnote{The details of the proof for the reversed Arden's Lemma
+  %are given in the Appendix.}
 
-  \begin{lmm}[Reversed Arden's Lemma]\label{arden}\mbox{}\\
+  \begin{lemma}[Reversed Arden's Lemma]\label{arden}\mbox{}\\
   If @{thm (prem 1) reversed_Arden} then
   @{thm (lhs) reversed_Arden} if and only if
   @{thm (rhs) reversed_Arden}.
-  \end{lmm}
+  \end{lemma}
 
   \noindent
+  The proof of this reversed version of Arden's lemma follows the proof of the
+  original version.
   Regular expressions are defined as the inductive datatype
 
   \begin{center}
@@ -576,7 +612,7 @@
 
   Given a finite set of regular expressions @{text rs}, we will make use of the operation of generating 
   a regular expression that matches the union of all languages of @{text rs}. 
-  This definion is not trivial in a theorem prover, because @{text rs} (being a set) is unordered, 
+  This definition is not trivial in a theorem prover, because @{text rs} (being a set) is unordered, 
   but the regular expression needs an order. Since 
   we only need to know the 
   existence
@@ -607,17 +643,17 @@
 text {*
   \noindent
   The key definition in the Myhill-Nerode Theorem is the
-  \emph{Myhill-Nerode Relation}, which states that w.r.t.~a language two 
-  strings are related, provided there is no distinguishing extension in this
+  \emph{Myhill-Nerode Relation}, which states that  two 
+  strings are related w.r.t.~a language, provided there is no distinguishing extension in this
   language. This can be defined as a ternary relation.
 
-  \begin{dfntn}[Myhill-Nerode Relation]\label{myhillneroderel} 
+  \begin{definition}[Myhill-Nerode Relation]\label{myhillneroderel} 
   Given a language @{text A}, two strings @{text x} and
   @{text y} are Myhill-Nerode related provided
   \begin{center}
   @{thm str_eq_def'}
   \end{center}
-  \end{dfntn}
+  \end{definition}
 
   \noindent
   It is easy to see that @{term "\<approx>A"} is an equivalence relation, which
@@ -640,13 +676,13 @@
   that if there are finitely many equivalence classes, like in the example above, then 
   the language is regular. In our setting we therefore have to show:
   
-  \begin{thrm}\label{myhillnerodeone}
+  \begin{theorem}\label{myhillnerodeone}
   @{thm[mode=IfThen] Myhill_Nerode1}
-  \end{thrm}
+  \end{theorem}
 
   \noindent
   To prove this theorem, we first define the set @{term "finals A"} as those equivalence
-  classes from @{term "UNIV // \<approx>A"} that contain strings of @{text A}, namely
+  classes from \mbox{@{term "UNIV // \<approx>A"}} that contain strings of @{text A}, namely
   %
   \begin{equation} 
   @{thm finals_def}
@@ -683,8 +719,11 @@
   \noindent
   which means that if we append the character @{text c} to the end of all 
   strings in the equivalence class @{text Y}, we obtain a subset of 
-  @{text X}. Note that we do not define an automaton here, we merely relate two sets
-  (with the help of a character). In our concrete example we have 
+  @{text X}. 
+  %Note that we do not define formally an automaton here, 
+  %we merely relate two sets
+  %(with the help of a character). 
+  In our concrete example we have 
   @{term "X\<^isub>1 \<Turnstile>c\<Rightarrow> X\<^isub>2"}, @{term "X\<^isub>1 \<Turnstile>d\<^isub>i\<Rightarrow> X\<^isub>3"} with @{text "d\<^isub>i"} being any 
   other character than @{text c}, and @{term "X\<^isub>3 \<Turnstile>c\<^isub>j\<Rightarrow> X\<^isub>3"} for any 
   character @{text "c\<^isub>j"}.
@@ -706,7 +745,7 @@
   \end{center}
 
   \noindent
-  where the terms @{text "(Y\<^isub>i\<^isub>j, ATOM c\<^isub>i\<^isub>j)"} are pairs consiting of an equivalence class and
+  where the terms @{text "(Y\<^isub>i\<^isub>j, ATOM c\<^isub>i\<^isub>j)"} are pairs consisting of an equivalence class and
   a regular expression. In the initial equational system, they
   stand for all transitions @{term "Y\<^isub>i\<^isub>j \<Turnstile>c\<^isub>i\<^isub>j\<Rightarrow>
   X\<^isub>i"}. 
@@ -796,9 +835,9 @@
   for representing the right-hand sides of equations, we can 
   prove \eqref{inv1} and \eqref{inv2} more concisely as
   %
-  \begin{lmm}\label{inv}
+  \begin{lemma}\label{inv}
   If @{thm (prem 1) test} then @{text "X = \<Union> \<calL> ` rhs"}.
-  \end{lmm}
+  \end{lemma}
 
   \noindent
   Our proof of Theorem~\ref{myhillnerodeone} will proceed by transforming the
@@ -876,13 +915,13 @@
   \noindent
   This allows us to prove a version of Arden's Lemma on the level of equations.
 
-  \begin{lmm}\label{ardenable}
+  \begin{lemma}\label{ardenable}
   Given an equation @{text "X = rhs"}.
   If @{text "X = \<Union>\<calL> ` rhs"},
   @{thm (prem 2) Arden_preserves_soundness}, and
   @{thm (prem 3) Arden_preserves_soundness}, then
   @{text "X = \<Union>\<calL> ` (Arden X rhs)"}.
-  \end{lmm}
+  \end{lemma}
   
   \noindent
   Our @{text ardenable} condition is slightly stronger than needed for applying Arden's Lemma,
@@ -1027,9 +1066,9 @@
   It is straightforward to prove that the initial equational system satisfies the
   invariant.
 
-  \begin{lmm}\label{invzero}
+  \begin{lemma}\label{invzero}
   @{thm[mode=IfThen] Init_ES_satisfies_invariant}
-  \end{lmm}
+  \end{lemma}
 
   \begin{proof}
   Finiteness is given by the assumption and the way how we set up the 
@@ -1042,12 +1081,12 @@
   \noindent
   Next we show that @{text Iter} preserves the invariant.
 
-  \begin{lmm}\label{iterone} If
+  \begin{lemma}\label{iterone} If
   @{thm (prem 1) iteration_step_invariant[where xrhs="rhs"]},
   @{thm (prem 2) iteration_step_invariant[where xrhs="rhs"]} and
   @{thm (prem 3) iteration_step_invariant[where xrhs="rhs"]}, then
   @{thm (concl) iteration_step_invariant[where xrhs="rhs"]}.
-  \end{lmm}
+  \end{lemma}
 
   \begin{proof} 
   The argument boils down to choosing an equation @{text "Y = yrhs"} to be eliminated
@@ -1079,12 +1118,12 @@
   \noindent
   We also need the fact that @{text Iter} decreases the termination measure.
 
-  \begin{lmm}\label{itertwo} If
+  \begin{lemma}\label{itertwo} If
   @{thm (prem 1) iteration_step_measure[simplified (no_asm), where xrhs="rhs"]},
   @{thm (prem 2) iteration_step_measure[simplified (no_asm), where xrhs="rhs"]} and
   @{thm (prem 3) iteration_step_measure[simplified (no_asm), where xrhs="rhs"]}, then\\
   \mbox{@{thm (concl) iteration_step_measure[simplified (no_asm), where xrhs="rhs"]}}.
-  \end{lmm}
+  \end{lemma}
 
   \begin{proof}
   By assumption we know that @{text "ES"} is finite and has more than one element.
@@ -1099,11 +1138,11 @@
   This brings us to our property we want to establish for @{text Solve}.
 
 
-  \begin{lmm}
+  \begin{lemma}
   If @{thm (prem 1) Solve} and @{thm (prem 2) Solve} then there exists
   a @{text rhs} such that  @{term "Solve X (Init (UNIV // \<approx>A)) = {(X, rhs)}"}
   and @{term "invariant {(X, rhs)}"}.
-  \end{lmm}
+  \end{lemma}
 
   \begin{proof} 
   In order to prove this lemma using \eqref{whileprinciple}, we have to use a slightly
@@ -1119,7 +1158,7 @@
   Premise 3 of~\eqref{whileprinciple} is by Lemma~\ref{itertwo}. Now in premise 4
   we like to show that there exists a @{text rhs} such that @{term "ES = {(X, rhs)}"}
   and that @{text "invariant {(X, rhs)}"} holds, provided the condition @{text "Cond"}
-  does not holds. By the stronger invariant we know there exists such a @{text "rhs"}
+  does not hold. By the stronger invariant we know there exists such a @{text "rhs"}
   with @{term "(X, rhs) \<in> ES"}. Because @{text Cond} is not true, we know the cardinality
   of @{text ES} is @{text 1}. This means @{text "ES"} must actually be the set @{text "{(X, rhs)}"},
   for which the invariant holds. This allows us to conclude that 
@@ -1131,9 +1170,9 @@
   With this lemma in place we can show that for every equivalence class in @{term "UNIV // \<approx>A"}
   there exists a regular expression.
 
-  \begin{lmm}\label{every_eqcl_has_reg}
+  \begin{lemma}\label{every_eqcl_has_reg}
   @{thm[mode=IfThen] every_eqcl_has_reg}
-  \end{lmm}
+  \end{lemma}
 
   \begin{proof}
   By the preceding lemma, we know that there exists a @{text "rhs"} such
@@ -1181,9 +1220,9 @@
   In this section we will give a proof for establishing the second 
   part of the Myhill-Nerode Theorem. It can be formulated in our setting as follows:
 
-  \begin{thrm}\label{myhillnerodetwo}
+  \begin{theorem}\label{myhillnerodetwo}
   Given @{text "r"} is a regular expression, then @{thm Myhill_Nerode2}.
-  \end{thrm}  
+  \end{theorem}  
 
   \noindent
   The proof will be by induction on the structure of @{text r}. It turns out
@@ -1209,7 +1248,7 @@
   \noindent
   Much more interesting, however, are the inductive cases. They seem hard to be solved 
   directly. The reader is invited to try.\footnote{The induction hypothesis is not strong enough 
-  to make any progress with the TIME and STAR cases.} 
+  to make any progress with the @{text TIMES} and @{text STAR} cases.} 
 
   In order to see how our proof proceeds consider the following suggestive picture 
   given by \citeN{Constable00}:
@@ -1250,10 +1289,10 @@
   suffices to show in each induction step that another relation, say @{text
   R}, has finitely many equivalence classes and refines @{term "\<approx>(lang r)"}. 
 
-  \begin{dfntn}
+  \begin{definition}
   A relation @{text "R\<^isub>1"} \emph{refines} @{text "R\<^isub>2"}
   provided @{text "R\<^isub>1 \<subseteq> R\<^isub>2"}. 
-  \end{dfntn}
+  \end{definition}
 
   \noindent
   For constructing @{text R}, we will rely on some \emph{tagging-functions}
@@ -1275,12 +1314,12 @@
   as follows.
 
 
-  \begin{dfntn}[Tagging-Relation] Given a tagging-function @{text tag}, then two strings @{text x}
+  \begin{definition}[Tagging-Relation] Given a tagging-function @{text tag}, then two strings @{text x}
   and @{text y} are \emph{tag-related} provided
   \begin{center}
   @{text "x \<^raw:$\threesim$>\<^bsub>tag\<^esub> y \<equiv> tag x = tag y"}\;.
   \end{center}
-  \end{dfntn}
+  \end{definition}
 
 
   In order to establish finiteness of a set @{text A}, we shall use the following powerful
@@ -1295,9 +1334,9 @@
   is finite, then the set @{text A} itself must be finite. We can use it to establish the following 
   two lemmas.
 
-  \begin{lmm}\label{finone}
+  \begin{lemma}\label{finone}
   @{thm[mode=IfThen] finite_eq_tag_rel}
-  \end{lmm}
+  \end{lemma}
 
   \begin{proof}
   We set in \eqref{finiteimageD}, @{text f} to be @{text "X \<mapsto> tag ` X"}. We have
@@ -1312,12 +1351,12 @@
   with @{thm (concl) finite_eq_tag_rel}.
   \end{proof}
 
-  \begin{lmm}\label{fintwo} 
+  \begin{lemma}\label{fintwo} 
   Given two equivalence relations @{text "R\<^isub>1"} and @{text "R\<^isub>2"}, whereby
   @{text "R\<^isub>1"} refines @{text "R\<^isub>2"}.
   If @{thm (prem 1) refined_partition_finite[where ?R1.0="R\<^isub>1" and ?R2.0="R\<^isub>2"]}
   then @{thm (concl) refined_partition_finite[where ?R1.0="R\<^isub>1" and ?R2.0="R\<^isub>2"]}.
-  \end{lmm}
+  \end{lemma}
 
   \begin{proof}
   We prove this lemma again using \eqref{finiteimageD}. This time we set @{text f} to
@@ -1368,11 +1407,11 @@
   The @{const TIMES}-case is slightly more complicated. We first prove the
   following lemma, which will aid the proof about refinement.
 
-  \begin{lmm}\label{refinement}
+  \begin{lemma}\label{refinement}
   The relation @{text "\<^raw:$\threesim$>\<^bsub>tag\<^esub>"} refines @{term "\<approx>A"}, provided for
   all strings @{text x}, @{text y} and @{text z} we have that \mbox{@{text "x \<^raw:$\threesim$>\<^bsub>tag\<^esub> y"}}
   and @{term "x @ z \<in> A"} imply @{text "y @ z \<in> A"}.
-  \end{lmm}
+  \end{lemma}
 
 
   \noindent
@@ -1390,7 +1429,9 @@
   refer to @{text "x\<^isub>p"} as the \emph{prefix} of the string @{text x},
   and respectively to @{text "x\<^isub>s"} as the \emph{suffix}.
 
+*}
 
+text {*
   Now assuming  @{term "x @ z \<in> A \<cdot> B"}, there are only two possible ways of how to `split' 
   this string to be in @{term "A \<cdot> B"}:
   %
@@ -1452,18 +1493,20 @@
   \end{tabular}
   \end{center}
   %
+
   \noindent
   Either @{text x} and a prefix of @{text "z"} is in @{text A} and the rest in @{text B} 
   (first picture) or there is a prefix of @{text x} in @{text A} and the rest is in @{text B} 
   (second picture). In both cases we have to show that @{term "y @ z \<in> A \<cdot> B"}. The first case
-  we will only go through if we know that  @{term "x \<approx>A y"} holds @{text "(*)"}. Because then 
+  we will only go through if we know that  @{term "x \<approx>A y"} holds @{text "("}@{text "*"}@{text ")"}.  
+  Because then 
   we can infer from @{term "x @ z\<^isub>p \<in> A"} that @{term "y @ z\<^isub>p \<in> A"} holds for all @{text "z\<^isub>p"}.
   In the second case we only know that @{text "x\<^isub>p"} and @{text "x\<^isub>s"} is one possible partition
   of the string @{text x}. We have to know that both @{text "x\<^isub>p"} and the
   corresponding partition @{text "y\<^isub>p"} are in @{text "A"}, and that @{text "x\<^isub>s"} is `@{text B}-related' 
-  to @{text "y\<^isub>s"} @{text "(**)"}. From the latter fact we can infer that @{text "y\<^isub>s @ z \<in> B"}.
+  to @{text "y\<^isub>s"} @{text "("}@{text "**"}@{text ")"}. From the latter fact we can infer that @{text "y\<^isub>s @ z \<in> B"}.
   This will solve the second case.
-  Taking the two requirements, @{text "(*)"} and @{text "(**)"}, together we define the
+  Taking the two requirements, @{text "("}@{text "*"}@{text ")"} and @{text "(**)"}, together we define the
   tagging-function in the @{const Times}-case as:
 
   \begin{center}
@@ -1476,7 +1519,6 @@
   not know anything about how the string @{term x} is partitioned.
   With this definition in place, let us prove the @{const "Times"}-case.
 
-
   \begin{proof}[@{const TIMES}-Case]
   If @{term "finite (UNIV // \<approx>A)"} and @{term "finite (UNIV // \<approx>B)"}
   then @{term "finite ((UNIV // \<approx>A) \<times> (Pow (UNIV // \<approx>B)))"} holds. The range of
@@ -1884,7 +1926,7 @@
   induction hypothesis is
 
   \begin{center}
-  @{text "(IH)"}\hspace{3mm}@{term "\<forall>r. Derivs s (lang r) = \<Union> lang ` (pderivs s r)"}
+  @{text "(IH)"}\hspace{3mm}@{term "\<forall>r. Derivs s (lang r) = \<Union> (lang ` (pderivs s r))"}
   \end{center}
 
   \noindent
@@ -1894,10 +1936,10 @@
   \begin{tabular}{r@ {\hspace{1.5mm}}c@ {\hspace{1.5mm}}ll}
   @{term "Derivs (c # s) (lang r)"} 
     & @{text "="} & @{term "Derivs s (Deriv c (lang r))"} & by def.\\
-    & @{text "="} & @{term "Derivs s (\<Union> lang ` (pderiv c r))"} & by @{text "("}\ref{Derspders}@{text ".i)"}\\
-    & @{text "="} & @{term "\<Union> (Derivs s) ` (lang ` (pderiv c r))"} & by def.~of @{text "Ders"}\\
-    & @{text "="} & @{term "\<Union> lang ` (\<Union> pderivs s ` (pderiv c r))"} & by IH\\
-    & @{text "="} & @{term "\<Union> lang ` (pderivs (c # s) r)"} & by def.\\
+    & @{text "="} & @{term "Derivs s (\<Union> (lang ` (pderiv c r)))"} & by @{text "("}\ref{Derspders}@{text ".i)"}\\
+    & @{text "="} & @{term "\<Union> ((Derivs s) ` (lang ` (pderiv c r)))"} & by def.~of @{text "Ders"}\\
+    & @{text "="} & @{term "\<Union> (lang ` (\<Union> (pderivs s ` (pderiv c r))))"} & by IH\\
+    & @{text "="} & @{term "\<Union> (lang ` (pderivs (c # s) r))"} & by def.\\
   \end{tabular}
   \end{center}
   
@@ -1931,10 +1973,10 @@
   @{thm pderivs_lang_def}
   \end{equation}
 
-  \begin{thrm}[\cite{Antimirov95}]\label{antimirov}
+  \begin{theorem}[\cite{Antimirov95}]\label{antimirov}
   For every language @{text A} and every regular expression @{text r}, 
   \mbox{@{thm finite_pderivs_lang}}.
-  \end{thrm}
+  \end{theorem}
 
   \noindent
   Antimirov's proof first establishes this theorem for the language @{term UNIV1}, 
@@ -2101,7 +2143,7 @@
   and \eqref{Derspders} we have
   %
   \begin{equation}\label{eqq}
-  @{term "Deriv_lang B (lang r) = (\<Union> lang ` (pderivs_lang B r))"}
+  @{term "Deriv_lang B (lang r) = (\<Union> (lang ` (pderivs_lang B r)))"}
   \end{equation}
 
   \noindent
@@ -2145,11 +2187,11 @@
   \noindent
   We like to establish
 
-  \begin{thrm}[\cite{Haines69}]\label{subseqreg}
+  \begin{theorem}[\cite{Haines69}]\label{subseqreg}
   For every language @{text A}, the languages @{text "(i)"} @{term "SUBSEQ A"} and 
   @{text "(ii)"} @{term "SUPSEQ A"}
   are regular.
-  \end{thrm}
+  \end{theorem}
 
   \noindent
   Our proof follows the one given by \citeN[Pages 92--95]{Shallit08}, except that we use
@@ -2182,9 +2224,9 @@
   whereby the last equation follows from the fact that @{term "A\<star>"} contains the
   empty string. With these properties at our disposal we can establish the lemma
 
-  \begin{lmm}
+  \begin{lemma}
   If @{text A} is regular, then also @{term "SUPSEQ A"}.
-  \end{lmm}
+  \end{lemma}
 
   \begin{proof}
   Since our alphabet is finite, we have a regular expression, written @{text ALL}, that
@@ -2212,12 +2254,12 @@
   \noindent
   Now we can prove the main lemma w.r.t.~@{const "SUPSEQ"}, namely
 
-  \begin{lmm}\label{mset}
+  \begin{lemma}\label{mset}
   For every language @{text A}, there exists a finite language @{text M} such that
   \begin{center}
   \mbox{@{term "SUPSEQ M = SUPSEQ A"}}\;.
   \end{center}
-  \end{lmm}
+  \end{lemma}
 
   \begin{proof}
   For @{text M} we take the set of all minimal elements of @{text A}. An element @{text x} 
@@ -2275,11 +2317,11 @@
   the non-regularity of languages. For this we use the following version of the Continuation
   Lemma (see for example~\cite{Rosenberg06}).
 
-  \begin{lmm}[Continuation Lemma]
+  \begin{lemma}[Continuation Lemma]
   If a language @{text A} is regular and a set of strings @{text B} is infinite,
   then there exist two distinct strings @{text x} and @{text y} in @{text B} 
   such that @{term "x \<approx>A y"}.
-  \end{lmm}
+  \end{lemma}
 
   \noindent
   This lemma can be easily deduced from the Myhill-Nerode Theorem and the Pigeonhole
@@ -2293,9 +2335,9 @@
   for the strings consisting of @{text n} times the character a; similarly for
   @{text "b\<^isup>n"}). For this consider the infinite set @{text "B \<equiv> \<Union>\<^isub>n a\<^sup>n"}.
 
-  \begin{lmm}
+  \begin{lemma}
   No two distinct strings in set @{text "B"} are Myhill-Nerode related by language @{text A}.
-  \end{lmm} 
+  \end{lemma} 
 
   \begin{proof}
   After unfolding the definition of @{text "B"}, we need to establish that given @{term "i \<noteq> j"},
@@ -2344,9 +2386,9 @@
   We have established in Isabelle/HOL both directions 
   of the Myhill-Nerode Theorem.
   %
-  \begin{thrm}[Myhill-Nerode Theorem]\mbox{}\\
+  \begin{theorem}[Myhill-Nerode Theorem]\mbox{}\\
   A language @{text A} is regular if and only if @{thm (rhs) Myhill_Nerode}.
-  \end{thrm}
+  \end{theorem}
   
   \noindent
   Having formalised this theorem means we pushed our point of view quite
@@ -2452,8 +2494,8 @@
   algorithm is still executable. Given the infrastructure for
   executable sets  introduced by \citeN{Haftmann09} in Isabelle/HOL, it should.
 
-  We started out by claiming that in a theorem prover it is eaiser to
-  reason about regular expressions than about automta. Here are some
+  We started out by claiming that in a theorem prover it is easier to
+  reason about regular expressions than about automata. Here are some
   numbers: Our formalisation of the Myhill-Nerode Theorem consists of
   780 lines of Isabelle/Isar code for the first direction and 460 for
   the second (the one based on tagging-functions), plus around 300
@@ -2536,50 +2578,50 @@
   We are grateful for the comments we received from Larry Paulson.  Tobias
   Nipkow made us aware of the properties in Theorem~\ref{subseqreg} and Tjark
   Weber helped us with proving them. Christian Sternagel provided us with a
-  version of Higman's Lemma that applies to arbtrary, but finite alphabets. 
+  version of Higman's Lemma that applies to arbitrary, but finite alphabets. 
 
   \bibliographystyle{acmtrans}
   \bibliography{root}
 
-  \newpage
-  \begin{appendix}
-  \section{Appendix$^\star$}
+  %\newpage
+  %\begin{appendix}
+  %\section{Appendix$^\star$}
 
-  \renewcommand{\thefootnote}{\mbox{$\star$}}
-  \footnotetext{If the reviewers deem more suitable, the authors are
-  prepared to drop material or move it to an electronic appendix.}
+  %\renewcommand{\thefootnote}{\mbox{$\star$}}
+  %\footnotetext{If the reviewers deem more suitable, the authors are
+  %prepared to drop material or move it to an electronic appendix.}
   
-  \begin{proof}[of Lemma~\ref{arden}]
-  For the right-to-left direction we assume @{thm (rhs) reversed_Arden} and show
-  that @{thm (lhs) reversed_Arden} holds. From Property~\ref{langprops}@{text "(i)"} 
-  we have @{term "A\<star> = A \<cdot> A\<star> \<union> {[]}"},
-  which is equal to @{term "A\<star> = A\<star> \<cdot> A \<union> {[]}"}. Adding @{text B} to both 
-  sides gives @{term "B \<cdot> A\<star> = B \<cdot> (A\<star> \<cdot> A \<union> {[]})"}, whose right-hand side
-  is equal to @{term "(B \<cdot> A\<star>) \<cdot> A \<union> B"}. Applying the assumed equation 
-  completes this direction. 
+  %\begin{proof}[of Lemma~\ref{arden}]
+  %For the right-to-left direction we assume @{thm (rhs) reversed_Arden} and show
+  %that @{thm (lhs) reversed_Arden} holds. From Property~\ref{langprops}@{text "(i)"} 
+  %we have @{term "A\<star> = A \<cdot> A\<star> \<union> {[]}"},
+  %which is equal to @{term "A\<star> = A\<star> \<cdot> A \<union> {[]}"}. Adding @{text B} to both 
+  %sides gives @{term "B \<cdot> A\<star> = B \<cdot> (A\<star> \<cdot> A \<union> {[]})"}, whose right-hand side
+  %is equal to @{term "(B \<cdot> A\<star>) \<cdot> A \<union> B"}. Applying the assumed equation 
+  %completes this direction. 
 
-  For the other direction we assume @{thm (lhs) reversed_Arden}. By a simple induction
-  on @{text n}, we can establish the property
+  %For the other direction we assume @{thm (lhs) reversed_Arden}. By a simple induction
+  %on @{text n}, we can establish the property
 
-  \begin{center}
-  @{text "(*)"}\hspace{5mm} @{thm (concl) reversed_arden_helper}
-  \end{center}
+  %\begin{center}
+  %@{text "("}@{text "*"}@{text ")"}\hspace{5mm} @{thm (concl) reversed_arden_helper}
+  %\end{center}
   
-  \noindent
-  Using this property we can show that @{term "B \<cdot> (A \<up> n) \<subseteq> X"} holds for
-  all @{text n}. From this we can infer @{term "B \<cdot> A\<star> \<subseteq> X"} using the definition
-  of @{text "\<star>"}.
-  For the inclusion in the other direction we assume a string @{text s}
-  with length @{text k} is an element in @{text X}. Since @{thm (prem 1) reversed_Arden}
-  we know by Property~\ref{langprops}@{text "(ii)"} that 
-  @{term "s \<notin> X \<cdot> (A \<up> Suc k)"} since its length is only @{text k}
-  (the strings in @{term "X \<cdot> (A \<up> Suc k)"} are all longer). 
-  From @{text "(*)"} it follows then that
-  @{term s} must be an element in @{term "(\<Union>m\<le>k. B \<cdot> (A \<up> m))"}. This in turn
-  implies that @{term s} is in @{term "(\<Union>n. B \<cdot> (A \<up> n))"}. Using Property~\ref{langprops}@{text "(iii)"} 
-  this is equal to @{term "B \<cdot> A\<star>"}, as we needed to show.
-  \end{proof}
-  \end{appendix}
+  %\noindent
+  %Using this property we can show that @{term "B \<cdot> (A \<up> n) \<subseteq> X"} holds for
+  %all @{text n}. From this we can infer @{term "B \<cdot> A\<star> \<subseteq> X"} using the definition
+  %of @{text "\<star>"}.
+  %For the inclusion in the other direction we assume a string @{text s}
+  %with length @{text k} is an element in @{text X}. Since @{thm (prem 1) reversed_Arden}
+  %we know by Property~\ref{langprops}@{text "(ii)"} that 
+  %@{term "s \<notin> X \<cdot> (A \<up> Suc k)"} since its length is only @{text k}
+  %(the strings in @{term "X \<cdot> (A \<up> Suc k)"} are all longer). 
+  %From @{text "("}@{text "*"}@{text ")"} it follows then that
+  %@{term s} must be an element in @{term "(\<Union>m\<le>k. B \<cdot> (A \<up> m))"}. This in turn
+  %implies that @{term s} is in @{term "(\<Union>n. B \<cdot> (A \<up> n))"}. Using Property~\ref{langprops}@{text "(iii)"} 
+  %this is equal to @{term "B \<cdot> A\<star>"}, as we needed to show.
+  %\end{proof}
+  % \end{appendix}
 *}