regexp: comparison Journal/Paper.thy

equal deleted inserted replaced

-:a0bcf886b8ef
+:8c4b6fb43ebe
 (*<*)
 theory Paper
 imports "../Closures2" "../Attic/Prefix_subtract"
 begin
+lemma nullable_iff:
+shows "nullable r \<longleftrightarrow> [] \<in> lang r"
+by (induct r) (auto simp add: conc_def split: if_splits)
+lemma Deriv_deriv:
+shows "Deriv c (lang r) = lang (deriv c r)"
+by (induct r) (simp_all add: nullable_iff)
+lemma Derivs_derivs:
+shows "Derivs s (lang r) = lang (derivs s r)"
+by (induct s arbitrary: r) (simp_all add: Deriv_deriv)
 declare [[show_question_marks = false]]
 consts
 REL :: "(string \<times> string) set"
 It seems natural to exercise theorem provers by
 formalising the theorems and by verifying formally the algorithms.
 A popular choice for a theorem prover would be one based on Higher-Order
 Logic (HOL), for example HOL4, HOLlight or Isabelle/HOL. For the development
-presented in this paper we will use the Isabelle/HOL. HOL is a predicate calculus
+presented in this paper we will use the Isabelle/HOL system. HOL is a predicate calculus
 that allows quantification over predicate variables. Its type system is
 based on the Simple Theory of Types by \citeN{Church40}.  Although many
 mathematical concepts can be conveniently expressed in HOL, there are some
 limitations that hurt when attempting a simple-minded formalisation
 of regular languages in it.
 regular languages, taken for example by \citeN{HopcroftUllman69}
 and by \citeN{Kozen97},  is to introduce finite deterministic automata and then
 define most notions in terms of them.  For example, a regular language is
 normally defined as:
-\begin{dfntn}\label{baddef}
+\begin{definition}\label{baddef}
 A language @{text A} is \emph{regular}, provided there is a
 finite deterministic automaton that recognises all strings of @{text "A"}.
-\end{dfntn}
+\end{definition}
 \noindent
 This approach has many benefits. Among them is the fact that it is easy to
 convince oneself that regular languages are closed under complementation:
 one just has to exchange the accepting and non-accepting states in the
 Isabelle/HOL. There they use matrices for representing automata.
 Functions are used by \citeN{Nipkow98}, who establishes
 the link between regular expressions and automata in the context of
 lexing. \citeN{BerghoferReiter09} use functions as well for formalising automata
 working over bit strings in the context of Presburger arithmetic.
-A Larger formalisation of automata theory, including the Myhill-Nerode theorem,
+A larger formalisation of automata theory, including the Myhill-Nerode theorem,
 was carried out in Nuprl by \citeN{Constable00} which also uses functions.
-Other large formailsations of automata theory in Coq are by
+Other large formalisations of automata theory in Coq are by
 \citeN{Filliatre97} who essentially uses graphs and by \citeN{Almeidaetal10}
 and \citeN{Braibant12}, who both use matrices. Many of these works,
 like \citeN{Nipkow98} or \citeN{Braibant12}, mention intrinsic problems with
 their representation of
 automata which made them `fight' their respective  theorem prover.
 literature, but take a different approach to regular languages than is
 usually taken. Instead of defining a regular language as one where there
 exists an automaton that recognises all its strings, we define a
 regular language as:
-\begin{dfntn}\label{regular}
+\begin{definition}\label{regular}
 A language @{text A} is \emph{regular}, provided there is a regular expression
 that matches all strings of @{text "A"}.
-\end{dfntn}
+\end{definition}
 \noindent
-And then `forget' automata completely.
+We then want to `forget' automata completely and see how far we come
-The reason is that regular expressions %, unlike graphs, matrices and
+with formalising results from regular language theory.  The reason
-%functions,
+is that regular expressions, unlike graphs, matrices and
-can be defined as an inductive datatype and a reasoning
+functions, can be defined as an inductive datatype and a reasoning
-infrastructure for them (like induction and recursion) comes for free in
+infrastructure for them (like induction and recursion) comes for
-HOL. %Moreover, no side-conditions will be needed for regular expressions,
+free in HOL. But this question whether formal language theory can be
+done without automata crops up also in non-theorem prover
+contexts. For example, Gasarch asked in the Computational Complexity
+blog by \citeN{GasarchBlog} whether the complementation of
+regular-expression languages can be proved without using
+automata. He concluded
+\begin{quote}
+``{\it \ldots it can't be done}''
+\end{quote}
+\noindent
+and even pondered
+\begin{quote}
+``{\it \ldots [b]ut is there a rigorous way to even state that?}''
+\end{quote}
+%Moreover, no side-conditions will be needed for regular expressions,
 %like we need for automata.
-This convenience of regular expressions has
+\noindent
+We will give an answer to these questions in this paper.
+The convenience of regular expressions has
 recently been exploited in HOL4 with a formalisation of regular expression
 matching based on derivatives by \citeN{OwensSlind08}, and with an equivalence
 checker for regular expressions in Isabelle/HOL by \citeN{KraussNipkow11}
-and in Matida by \citeN{Asperti12} and in Coq by \citeN{CoquandSiles12}.  The
+and in Matita by \citeN{Asperti12} and in Coq by \citeN{CoquandSiles12}.  The
 main purpose of this paper is to show that a central result about regular
 languages---the Myhill-Nerode Theorem---can be recreated by only using
 regular expressions. This theorem gives necessary and sufficient conditions
 for when a language is regular. As a corollary of this theorem we can easily
 establish the usual closure properties, including complementation, for
 that is based on regular expressions, only. The part of this theorem stating
 that finitely many partitions imply regularity of the language is proved by
 an argument about solving equational systems.  This argument appears to be
 folklore. For the other part, we give two proofs: one direct proof using
 certain tagging-functions, and another indirect proof using Antimirov's
-partial derivatives \citeyear{Antimirov95}. Again to our best knowledge, the
+partial derivatives (\citeyear{Antimirov95}). Again to our best knowledge, the
 tagging-functions have not been used before for establishing the Myhill-Nerode
 Theorem. Derivatives of regular expressions have been used recently quite
 widely in the literature; partial derivatives, in contrast, attract much
 less attention. However, partial derivatives are more suitable in the
-context of the Myhill-Nerode Theorem, since it is easier to establish
+context of the Myhill-Nerode Theorem, since it is easier to
-formally their finiteness result. We are not aware of any proof that uses
+formally establish their finiteness result. We are not aware of any proof that uses
 either of them for proving the Myhill-Nerode Theorem.
 *}
 section {* Preliminaries *}
 where @{text "@"} is the list-append operation. The Kleene-star of a language @{text A}
 is defined as the union over all powers, namely @{thm star_def[where A="A", THEN eq_reflection]}.
 In the paper
 we will make use of the following properties of these constructions.
-\begin{prpstn}\label{langprops}\mbox{}\\
+\begin{proposition}\label{langprops}\mbox{}\\
 \begin{tabular}{@ {}lp{10cm}}
 (i)   & @{thm star_unfold_left}     \\
 (ii)  & @{thm[mode=IfThen] pow_length}\\
 (iii) & @{thm conc_Union_left} \\
 (iv)  & If @{thm (prem 1) star_decom} and @{thm (prem 2) star_decom} then
 there exists an @{text "x\<^isub>p"} and @{text "x\<^isub>s"} with @{text "x = x\<^isub>p @ x\<^isub>s"}
-and @{term "x\<^isub>p \<noteq> []"} such that @{term "x\<^isub>p \<in> A"} and @{term "x\<^isub>s \<in> A\<star>"}.
+and \mbox{@{term "x\<^isub>p \<noteq> []"}} such that @{term "x\<^isub>p \<in> A"} and @{term "x\<^isub>s \<in> A\<star>"}.
 \end{tabular}
-\end{prpstn}
+\end{proposition}
 \noindent
 In @{text "(ii)"} we use the notation @{term "length s"} for the length of a
 string; this property states that if \mbox{@{term "[] \<notin> A"}} then the lengths of
 the strings in @{term "A \<up> (Suc n)"} must be longer than @{text n}.
 involving equivalence classes of languages. For this we will use Arden's Lemma
 (see for example \cite[Page 100]{Sakarovitch09}),
 which solves equations of the form @{term "X = A \<cdot> X \<union> B"} provided
 @{term "[] \<notin> A"}. However we will need the following `reversed'
 version of Arden's Lemma (`reversed' in the sense of changing the order of @{term "A \<cdot> X"} to
-\mbox{@{term "X \<cdot> A"}}).\footnote{The details of the proof for the reversed Arden's Lemma
+\mbox{@{term "X \<cdot> A"}}).
-are given in the Appendix.}
+%\footnote{The details of the proof for the reversed Arden's Lemma
+%are given in the Appendix.}
-\begin{lmm}[Reversed Arden's Lemma]\label{arden}\mbox{}\\
+\begin{lemma}[Reversed Arden's Lemma]\label{arden}\mbox{}\\
 If @{thm (prem 1) reversed_Arden} then
 @{thm (lhs) reversed_Arden} if and only if
 @{thm (rhs) reversed_Arden}.
-\end{lmm}
+\end{lemma}
 \noindent
+The proof of this reversed version of Arden's lemma follows the proof of the
+original version.
 Regular expressions are defined as the inductive datatype
 \begin{center}
 \begin{tabular}{rcl}
 @{text r} & @{text "::="} & @{term ZERO}\\
 \end{tabular}
 \end{center}
 Given a finite set of regular expressions @{text rs}, we will make use of the operation of generating
 a regular expression that matches the union of all languages of @{text rs}.
-This definion is not trivial in a theorem prover, because @{text rs} (being a set) is unordered,
+This definition is not trivial in a theorem prover, because @{text rs} (being a set) is unordered,
 but the regular expression needs an order. Since
 we only need to know the
 existence
 of such a regular expression, we can use Isabelle/HOL's @{const "fold_graph"} and Hilbert's
 choice operator, written @{text "SOME"} in Isabelle/HOL, for defining @{term "\<Uplus>rs"}.
 section {* The Myhill-Nerode Theorem, First Part *}
 text {*
 \noindent
 The key definition in the Myhill-Nerode Theorem is the
-\emph{Myhill-Nerode Relation}, which states that w.r.t.~a language two
+\emph{Myhill-Nerode Relation}, which states that  two
-strings are related, provided there is no distinguishing extension in this
+strings are related w.r.t.~a language, provided there is no distinguishing extension in this
 language. This can be defined as a ternary relation.
-\begin{dfntn}[Myhill-Nerode Relation]\label{myhillneroderel}
+\begin{definition}[Myhill-Nerode Relation]\label{myhillneroderel}
 Given a language @{text A}, two strings @{text x} and
 @{text y} are Myhill-Nerode related provided
 \begin{center}
 @{thm str_eq_def'}
 \end{center}
-\end{dfntn}
+\end{definition}
 \noindent
 It is easy to see that @{term "\<approx>A"} is an equivalence relation, which
 partitions the set of all strings, @{text "UNIV"}, into a set of disjoint
 equivalence classes. To illustrate this quotient construction, let us give a simple
 One direction of the Myhill-Nerode Theorem establishes
 that if there are finitely many equivalence classes, like in the example above, then
 the language is regular. In our setting we therefore have to show:
-\begin{thrm}\label{myhillnerodeone}
+\begin{theorem}\label{myhillnerodeone}
 @{thm[mode=IfThen] Myhill_Nerode1}
-\end{thrm}
+\end{theorem}
 \noindent
 To prove this theorem, we first define the set @{term "finals A"} as those equivalence
-classes from @{term "UNIV // \<approx>A"} that contain strings of @{text A}, namely
+classes from \mbox{@{term "UNIV // \<approx>A"}} that contain strings of @{text A}, namely
 %
 \begin{equation}
 @{thm finals_def}
 \end{equation}
 \end{equation}
 \noindent
 which means that if we append the character @{text c} to the end of all
 strings in the equivalence class @{text Y}, we obtain a subset of
-@{text X}. Note that we do not define an automaton here, we merely relate two sets
+@{text X}.
-(with the help of a character). In our concrete example we have
+%Note that we do not define formally an automaton here,
+%we merely relate two sets
+%(with the help of a character).
+In our concrete example we have
 @{term "X\<^isub>1 \<Turnstile>c\<Rightarrow> X\<^isub>2"}, @{term "X\<^isub>1 \<Turnstile>d\<^isub>i\<Rightarrow> X\<^isub>3"} with @{text "d\<^isub>i"} being any
 other character than @{text c}, and @{term "X\<^isub>3 \<Turnstile>c\<^isub>j\<Rightarrow> X\<^isub>3"} for any
 character @{text "c\<^isub>j"}.
 Next we construct an \emph{initial equational system} that
 @{text "X\<^isub>n"} & @{text "="} & @{text "(Y\<^isub>n\<^isub>1, ATOM c\<^isub>n\<^isub>1) + \<dots> + (Y\<^isub>n\<^isub>q, ATOM c\<^isub>n\<^isub>q)"}\\
 \end{tabular}
 \end{center}
 \noindent
-where the terms @{text "(Y\<^isub>i\<^isub>j, ATOM c\<^isub>i\<^isub>j)"} are pairs consiting of an equivalence class and
+where the terms @{text "(Y\<^isub>i\<^isub>j, ATOM c\<^isub>i\<^isub>j)"} are pairs consisting of an equivalence class and
 a regular expression. In the initial equational system, they
 stand for all transitions @{term "Y\<^isub>i\<^isub>j \<Turnstile>c\<^isub>i\<^isub>j\<Rightarrow>
 X\<^isub>i"}.
 %The intuition behind the equational system is that every
 %equation @{text "X\<^isub>i = rhs\<^isub>i"} in this system
 \noindent
 Because we use sets of terms
 for representing the right-hand sides of equations, we can
 prove \eqref{inv1} and \eqref{inv2} more concisely as
 %
-\begin{lmm}\label{inv}
+\begin{lemma}\label{inv}
 If @{thm (prem 1) test} then @{text "X = \<Union> \<calL> ` rhs"}.
-\end{lmm}
+\end{lemma}
 \noindent
 Our proof of Theorem~\ref{myhillnerodeone} will proceed by transforming the
 initial equational system into one in \emph{solved form} maintaining the invariant
 in Lemma~\ref{inv}. From the solved form we will be able to read
 \end{center}
 \noindent
 This allows us to prove a version of Arden's Lemma on the level of equations.
-\begin{lmm}\label{ardenable}
+\begin{lemma}\label{ardenable}
 Given an equation @{text "X = rhs"}.
 If @{text "X = \<Union>\<calL> ` rhs"},
 @{thm (prem 2) Arden_preserves_soundness}, and
 @{thm (prem 3) Arden_preserves_soundness}, then
 @{text "X = \<Union>\<calL> ` (Arden X rhs)"}.
-\end{lmm}
+\end{lemma}
 \noindent
 Our @{text ardenable} condition is slightly stronger than needed for applying Arden's Lemma,
 but we can still ensure that it holds throughout our algorithm of transforming equations
 into solved form.
 It is straightforward to prove that the initial equational system satisfies the
 invariant.
-\begin{lmm}\label{invzero}
+\begin{lemma}\label{invzero}
 @{thm[mode=IfThen] Init_ES_satisfies_invariant}
-\end{lmm}
+\end{lemma}
 \begin{proof}
 Finiteness is given by the assumption and the way how we set up the
 initial equational system. Soundness is proved in Lemma~\ref{inv}. Distinctness
 follows from the fact that the equivalence classes are disjoint. The @{text ardenable}
 \end{proof}
 \noindent
 Next we show that @{text Iter} preserves the invariant.
-\begin{lmm}\label{iterone} If
+\begin{lemma}\label{iterone} If
 @{thm (prem 1) iteration_step_invariant[where xrhs="rhs"]},
 @{thm (prem 2) iteration_step_invariant[where xrhs="rhs"]} and
 @{thm (prem 3) iteration_step_invariant[where xrhs="rhs"]}, then
 @{thm (concl) iteration_step_invariant[where xrhs="rhs"]}.
-\end{lmm}
+\end{lemma}
 \begin{proof}
 The argument boils down to choosing an equation @{text "Y = yrhs"} to be eliminated
 and to show that @{term "Subst_all (ES - {(Y, yrhs)}) Y (Arden Y yrhs)"}
 preserves the invariant.
 \end{proof}
 \noindent
 We also need the fact that @{text Iter} decreases the termination measure.
-\begin{lmm}\label{itertwo} If
+\begin{lemma}\label{itertwo} If
 @{thm (prem 1) iteration_step_measure[simplified (no_asm), where xrhs="rhs"]},
 @{thm (prem 2) iteration_step_measure[simplified (no_asm), where xrhs="rhs"]} and
 @{thm (prem 3) iteration_step_measure[simplified (no_asm), where xrhs="rhs"]}, then\\
 \mbox{@{thm (concl) iteration_step_measure[simplified (no_asm), where xrhs="rhs"]}}.
-\end{lmm}
+\end{lemma}
 \begin{proof}
 By assumption we know that @{text "ES"} is finite and has more than one element.
 Therefore there must be an element @{term "(Y, yrhs) \<in> ES"} with
 @{term "(Y, yrhs) \<noteq> (X, rhs)"}. Using the distinctness property we can infer
 \noindent
 This brings us to our property we want to establish for @{text Solve}.
-\begin{lmm}
+\begin{lemma}
 If @{thm (prem 1) Solve} and @{thm (prem 2) Solve} then there exists
 a @{text rhs} such that  @{term "Solve X (Init (UNIV // \<approx>A)) = {(X, rhs)}"}
 and @{term "invariant {(X, rhs)}"}.
-\end{lmm}
+\end{lemma}
 \begin{proof}
 In order to prove this lemma using \eqref{whileprinciple}, we have to use a slightly
 stronger invariant since Lemma~\ref{iterone} and \ref{itertwo} have the precondition
 that @{term "(X, rhs) \<in> ES"} for some @{text rhs}. This precondition is needed
 Premise 2 is given by Lemma~\ref{iterone} and the fact that @{const Iter} might
 modify the @{text rhs} in the equation @{term "X = rhs"}, but does not remove it.
 Premise 3 of~\eqref{whileprinciple} is by Lemma~\ref{itertwo}. Now in premise 4
 we like to show that there exists a @{text rhs} such that @{term "ES = {(X, rhs)}"}
 and that @{text "invariant {(X, rhs)}"} holds, provided the condition @{text "Cond"}
-does not holds. By the stronger invariant we know there exists such a @{text "rhs"}
+does not hold. By the stronger invariant we know there exists such a @{text "rhs"}
 with @{term "(X, rhs) \<in> ES"}. Because @{text Cond} is not true, we know the cardinality
 of @{text ES} is @{text 1}. This means @{text "ES"} must actually be the set @{text "{(X, rhs)}"},
 for which the invariant holds. This allows us to conclude that
 @{term "Solve X (Init (UNIV // \<approx>A)) = {(X, rhs)}"} and @{term "invariant {(X, rhs)}"} hold,
 as needed.
 \noindent
 With this lemma in place we can show that for every equivalence class in @{term "UNIV // \<approx>A"}
 there exists a regular expression.
-\begin{lmm}\label{every_eqcl_has_reg}
+\begin{lemma}\label{every_eqcl_has_reg}
 @{thm[mode=IfThen] every_eqcl_has_reg}
-\end{lmm}
+\end{lemma}
 \begin{proof}
 By the preceding lemma, we know that there exists a @{text "rhs"} such
 that @{term "Solve X (Init (UNIV // \<approx>A))"} returns the equation @{text "X = rhs"},
 and that the invariant holds for this equation. That means we
 text {*
 \noindent
 In this section we will give a proof for establishing the second
 part of the Myhill-Nerode Theorem. It can be formulated in our setting as follows:
-\begin{thrm}\label{myhillnerodetwo}
+\begin{theorem}\label{myhillnerodetwo}
 Given @{text "r"} is a regular expression, then @{thm Myhill_Nerode2}.
-\end{thrm}
+\end{theorem}
 \noindent
 The proof will be by induction on the structure of @{text r}. It turns out
 the base cases are straightforward.
 \end{proof}
 \noindent
 Much more interesting, however, are the inductive cases. They seem hard to be solved
 directly. The reader is invited to try.\footnote{The induction hypothesis is not strong enough
-to make any progress with the TIME and STAR cases.}
+to make any progress with the @{text TIMES} and @{text STAR} cases.}
 In order to see how our proof proceeds consider the following suggestive picture
 given by \citeN{Constable00}:
 %
 \begin{equation}\label{pics}
 The relation @{term "\<approx>(lang r)"} partitions the set of all strings, @{term UNIV}, into some
 equivalence classes. To show that there are only finitely many of them, it
 suffices to show in each induction step that another relation, say @{text
 R}, has finitely many equivalence classes and refines @{term "\<approx>(lang r)"}.
-\begin{dfntn}
+\begin{definition}
 A relation @{text "R\<^isub>1"} \emph{refines} @{text "R\<^isub>2"}
 provided @{text "R\<^isub>1 \<subseteq> R\<^isub>2"}.
-\end{dfntn}
+\end{definition}
 \noindent
 For constructing @{text R}, we will rely on some \emph{tagging-functions}
 defined over strings. Given the inductive hypothesis, it will be easy to
 prove that the \emph{range} of these tagging-functions is finite. The range
 @{term "\<approx>(lang r)"}, which implies that @{term "UNIV // \<approx>(lang r)"} must
 also be finite.  We formally define the notion of a \emph{tagging-relation}
 as follows.
-\begin{dfntn}[Tagging-Relation] Given a tagging-function @{text tag}, then two strings @{text x}
+\begin{definition}[Tagging-Relation] Given a tagging-function @{text tag}, then two strings @{text x}
 and @{text y} are \emph{tag-related} provided
 \begin{center}
 @{text "x \<^raw:$\threesim$>\<^bsub>tag\<^esub> y \<equiv> tag x = tag y"}\;.
 \end{center}
-\end{dfntn}
+\end{definition}
 In order to establish finiteness of a set @{text A}, we shall use the following powerful
 principle from Isabelle/HOL's library.
 %
 \noindent
 It states that if an image of a set under an injective function @{text f} (injective over this set)
 is finite, then the set @{text A} itself must be finite. We can use it to establish the following
 two lemmas.
-\begin{lmm}\label{finone}
+\begin{lemma}\label{finone}
 @{thm[mode=IfThen] finite_eq_tag_rel}
-\end{lmm}
+\end{lemma}
 \begin{proof}
 We set in \eqref{finiteimageD}, @{text f} to be @{text "X \<mapsto> tag ` X"}. We have
 @{text "range f"} to be a subset of @{term "Pow (range tag)"}, which we know must be
 finite by assumption. Now @{term "f ` (UNIV // =tag=)"} is a subset of @{text "range f"},
 turn means that the equivalence classes @{text X}
 and @{text Y} must be equal. Therefore \eqref{finiteimageD} allows us to conclude
 with @{thm (concl) finite_eq_tag_rel}.
 \end{proof}
-\begin{lmm}\label{fintwo}
+\begin{lemma}\label{fintwo}
 Given two equivalence relations @{text "R\<^isub>1"} and @{text "R\<^isub>2"}, whereby
 @{text "R\<^isub>1"} refines @{text "R\<^isub>2"}.
 If @{thm (prem 1) refined_partition_finite[where ?R1.0="R\<^isub>1" and ?R2.0="R\<^isub>2"]}
 then @{thm (concl) refined_partition_finite[where ?R1.0="R\<^isub>1" and ?R2.0="R\<^isub>2"]}.
-\end{lmm}
+\end{lemma}
 \begin{proof}
 We prove this lemma again using \eqref{finiteimageD}. This time we set @{text f} to
 be @{text "X \<mapsto>"}~@{term "{R\<^isub>1 `` {x} | x. x \<in> X}"}. It is easy to see that
 @{term "finite (f ` (UNIV // R\<^isub>2))"} because it is a subset of @{term "Pow (UNIV // R\<^isub>1)"},
 \noindent
 The @{const TIMES}-case is slightly more complicated. We first prove the
 following lemma, which will aid the proof about refinement.
-\begin{lmm}\label{refinement}
+\begin{lemma}\label{refinement}
 The relation @{text "\<^raw:$\threesim$>\<^bsub>tag\<^esub>"} refines @{term "\<approx>A"}, provided for
 all strings @{text x}, @{text y} and @{text z} we have that \mbox{@{text "x \<^raw:$\threesim$>\<^bsub>tag\<^esub> y"}}
 and @{term "x @ z \<in> A"} imply @{text "y @ z \<in> A"}.
-\end{lmm}
+\end{lemma}
 \noindent
 We therefore can analyse how the strings @{text "x @ z"} are in the language
 @{text A} and then construct an appropriate tagging-function to infer that
 \noindent
 If we know that @{text "(x\<^isub>p, x\<^isub>s) \<in> Partitions x"}, we will
 refer to @{text "x\<^isub>p"} as the \emph{prefix} of the string @{text x},
 and respectively to @{text "x\<^isub>s"} as the \emph{suffix}.
+*}
+text {*
 Now assuming  @{term "x @ z \<in> A \<cdot> B"}, there are only two possible ways of how to `split'
 this string to be in @{term "A \<cdot> B"}:
 %
 \begin{center}
 \begin{tabular}{c}
 node[midway, below=0.5em]{@{term "x\<^isub>p \<in> A"}};
 \end{tikzpicture}}
 \end{tabular}
 \end{center}
 %
 \noindent
 Either @{text x} and a prefix of @{text "z"} is in @{text A} and the rest in @{text B}
 (first picture) or there is a prefix of @{text x} in @{text A} and the rest is in @{text B}
 (second picture). In both cases we have to show that @{term "y @ z \<in> A \<cdot> B"}. The first case
-we will only go through if we know that  @{term "x \<approx>A y"} holds @{text "(*)"}. Because then
+we will only go through if we know that  @{term "x \<approx>A y"} holds @{text "("}@{text "*"}@{text ")"}.
+Because then
 we can infer from @{term "x @ z\<^isub>p \<in> A"} that @{term "y @ z\<^isub>p \<in> A"} holds for all @{text "z\<^isub>p"}.
 In the second case we only know that @{text "x\<^isub>p"} and @{text "x\<^isub>s"} is one possible partition
 of the string @{text x}. We have to know that both @{text "x\<^isub>p"} and the
 corresponding partition @{text "y\<^isub>p"} are in @{text "A"}, and that @{text "x\<^isub>s"} is `@{text B}-related'
-to @{text "y\<^isub>s"} @{text "(**)"}. From the latter fact we can infer that @{text "y\<^isub>s @ z \<in> B"}.
+to @{text "y\<^isub>s"} @{text "("}@{text "**"}@{text ")"}. From the latter fact we can infer that @{text "y\<^isub>s @ z \<in> B"}.
 This will solve the second case.
-Taking the two requirements, @{text "(*)"} and @{text "(**)"}, together we define the
+Taking the two requirements, @{text "("}@{text "*"}@{text ")"} and @{text "(**)"}, together we define the
 tagging-function in the @{const Times}-case as:
 \begin{center}
 @{thm (lhs) tag_Times_def[where ?A="A" and ?B="B"]}~@{text "\<equiv>"}~
 @{text "(\<lbrakk>x\<rbrakk>\<^bsub>\<approx>A\<^esub>, {\<lbrakk>x\<^isub>s\<rbrakk>\<^bsub>\<approx>B\<^esub> | x\<^isub>p \<in> A \<and> (x\<^isub>p, x\<^isub>s) \<in> Partitions x})"}
 \noindent
 Note that we have to make the assumption for all suffixes @{text "x\<^isub>s"}, since we do
 not know anything about how the string @{term x} is partitioned.
 With this definition in place, let us prove the @{const "Times"}-case.
 \begin{proof}[@{const TIMES}-Case]
 If @{term "finite (UNIV // \<approx>A)"} and @{term "finite (UNIV // \<approx>B)"}
 then @{term "finite ((UNIV // \<approx>A) \<times> (Pow (UNIV // \<approx>B)))"} holds. The range of
 @{term "tag_Times A B"} is a subset of this product set, and therefore finite.
 modify Antimirov's proof by performing an induction on @{text s} where we
 generalise over all @{text r}. That means in the @{text "cons"}-case the
 induction hypothesis is
 \begin{center}
-@{text "(IH)"}\hspace{3mm}@{term "\<forall>r. Derivs s (lang r) = \<Union> lang ` (pderivs s r)"}
+@{text "(IH)"}\hspace{3mm}@{term "\<forall>r. Derivs s (lang r) = \<Union> (lang ` (pderivs s r))"}
 \end{center}
 \noindent
 With this we can establish
 \begin{center}
 \begin{tabular}{r@ {\hspace{1.5mm}}c@ {\hspace{1.5mm}}ll}
 @{term "Derivs (c # s) (lang r)"}
 & @{text "="} & @{term "Derivs s (Deriv c (lang r))"} & by def.\\
-& @{text "="} & @{term "Derivs s (\<Union> lang ` (pderiv c r))"} & by @{text "("}\ref{Derspders}@{text ".i)"}\\
+& @{text "="} & @{term "Derivs s (\<Union> (lang ` (pderiv c r)))"} & by @{text "("}\ref{Derspders}@{text ".i)"}\\
-& @{text "="} & @{term "\<Union> (Derivs s) ` (lang ` (pderiv c r))"} & by def.~of @{text "Ders"}\\
+& @{text "="} & @{term "\<Union> ((Derivs s) ` (lang ` (pderiv c r)))"} & by def.~of @{text "Ders"}\\
-& @{text "="} & @{term "\<Union> lang ` (\<Union> pderivs s ` (pderiv c r))"} & by IH\\
+& @{text "="} & @{term "\<Union> (lang ` (\<Union> (pderivs s ` (pderiv c r))))"} & by IH\\
-& @{text "="} & @{term "\<Union> lang ` (pderivs (c # s) r)"} & by def.\\
+& @{text "="} & @{term "\<Union> (lang ` (pderivs (c # s) r))"} & by def.\\
 \end{tabular}
 \end{center}
 \noindent
 Note that in order to apply the induction hypothesis in the fourth equation, we
 %
 \begin{equation}\label{Pdersdef}
 @{thm pderivs_lang_def}
 \end{equation}
-\begin{thrm}[\cite{Antimirov95}]\label{antimirov}
+\begin{theorem}[\cite{Antimirov95}]\label{antimirov}
 For every language @{text A} and every regular expression @{text r},
 \mbox{@{thm finite_pderivs_lang}}.
-\end{thrm}
+\end{theorem}
 \noindent
 Antimirov's proof first establishes this theorem for the language @{term UNIV1},
 which is the set of all non-empty strings. For this he proves:
 %
 that @{term "pderivs_lang B r"} is finite for every language @{text B} and
 regular expression @{text r} (recall Theorem~\ref{antimirov}). By definition
 and \eqref{Derspders} we have
 %
 \begin{equation}\label{eqq}
-@{term "Deriv_lang B (lang r) = (\<Union> lang ` (pderivs_lang B r))"}
+@{term "Deriv_lang B (lang r) = (\<Union> (lang ` (pderivs_lang B r)))"}
 \end{equation}
 \noindent
 Since there are only finitely many regular expressions in @{term "pderivs_lang
 B r"}, we know by \eqref{uplus} that there exists a regular expression so that
 \end{center}
 \noindent
 We like to establish
-\begin{thrm}[\cite{Haines69}]\label{subseqreg}
+\begin{theorem}[\cite{Haines69}]\label{subseqreg}
 For every language @{text A}, the languages @{text "(i)"} @{term "SUBSEQ A"} and
 @{text "(ii)"} @{term "SUPSEQ A"}
 are regular.
-\end{thrm}
+\end{theorem}
 \noindent
 Our proof follows the one given by \citeN[Pages 92--95]{Shallit08}, except that we use
 Higman's Lemma, which is already proved in the Isabelle/HOL library by
 Sternagel.
 \noindent
 whereby the last equation follows from the fact that @{term "A\<star>"} contains the
 empty string. With these properties at our disposal we can establish the lemma
-\begin{lmm}
+\begin{lemma}
 If @{text A} is regular, then also @{term "SUPSEQ A"}.
-\end{lmm}
+\end{lemma}
 \begin{proof}
 Since our alphabet is finite, we have a regular expression, written @{text ALL}, that
 matches every string. Using this regular expression we can inductively define
 the operation @{text "r\<up>"}
 \end{proof}
 \noindent
 Now we can prove the main lemma w.r.t.~@{const "SUPSEQ"}, namely
-\begin{lmm}\label{mset}
+\begin{lemma}\label{mset}
 For every language @{text A}, there exists a finite language @{text M} such that
 \begin{center}
 \mbox{@{term "SUPSEQ M = SUPSEQ A"}}\;.
 \end{center}
-\end{lmm}
+\end{lemma}
 \begin{proof}
 For @{text M} we take the set of all minimal elements of @{text A}. An element @{text x}
 is said to be \emph{minimal} in @{text A} provided
 Finally we like to show that the Myhill-Nerode Theorem is also convenient for establishing
 the non-regularity of languages. For this we use the following version of the Continuation
 Lemma (see for example~\cite{Rosenberg06}).
-\begin{lmm}[Continuation Lemma]
+\begin{lemma}[Continuation Lemma]
 If a language @{text A} is regular and a set of strings @{text B} is infinite,
 then there exist two distinct strings @{text x} and @{text y} in @{text B}
 such that @{term "x \<approx>A y"}.
-\end{lmm}
+\end{lemma}
 \noindent
 This lemma can be easily deduced from the Myhill-Nerode Theorem and the Pigeonhole
 Principle: Since @{text A} is regular, there can be only finitely many
 equivalence classes. Hence an infinite set must contain
 Using this lemma, it is straightforward to establish that the language
 \mbox{@{text "A \<equiv> \<Union>\<^isub>n a\<^sup>n @ b\<^sup>n"}} is not regular (@{text "a\<^sup>n"} stands
 for the strings consisting of @{text n} times the character a; similarly for
 @{text "b\<^isup>n"}). For this consider the infinite set @{text "B \<equiv> \<Union>\<^isub>n a\<^sup>n"}.
-\begin{lmm}
+\begin{lemma}
 No two distinct strings in set @{text "B"} are Myhill-Nerode related by language @{text A}.
-\end{lmm}
+\end{lemma}
 \begin{proof}
 After unfolding the definition of @{text "B"}, we need to establish that given @{term "i \<noteq> j"},
 the strings @{text "a\<^sup>i"} and @{text "a\<^sup>j"} are not Myhill-Nerode related by @{text "A"}.
 That means we have to show that @{text "\<forall>z. a\<^sup>i @ z \<in> A = a\<^sup>j @ z \<in> A"} leads to
 \noindent
 We have established in Isabelle/HOL both directions
 of the Myhill-Nerode Theorem.
 %
-\begin{thrm}[Myhill-Nerode Theorem]\mbox{}\\
+\begin{theorem}[Myhill-Nerode Theorem]\mbox{}\\
 A language @{text A} is regular if and only if @{thm (rhs) Myhill_Nerode}.
-\end{thrm}
+\end{theorem}
 \noindent
 Having formalised this theorem means we pushed our point of view quite
 far. Using this theorem we can obviously prove when a language is \emph{not}
 regular---by establishing that it has infinitely many equivalence classes
 Theorem~\ref{antimirov}). However, since partial derivatives use sets of
 regular expressions, one needs to carefully analyse whether the resulting
 algorithm is still executable. Given the infrastructure for
 executable sets  introduced by \citeN{Haftmann09} in Isabelle/HOL, it should.
-We started out by claiming that in a theorem prover it is eaiser to
+We started out by claiming that in a theorem prover it is easier to
-reason about regular expressions than about automta. Here are some
+reason about regular expressions than about automata. Here are some
 numbers: Our formalisation of the Myhill-Nerode Theorem consists of
 780 lines of Isabelle/Isar code for the first direction and 460 for
 the second (the one based on tagging-functions), plus around 300
 lines of standard material about regular languages. The
 formalisation of derivatives and partial derivatives shown in
 \noindent
 {\bf Acknowledgements:}
 We are grateful for the comments we received from Larry Paulson.  Tobias
 Nipkow made us aware of the properties in Theorem~\ref{subseqreg} and Tjark
 Weber helped us with proving them. Christian Sternagel provided us with a
-version of Higman's Lemma that applies to arbtrary, but finite alphabets.
+version of Higman's Lemma that applies to arbitrary, but finite alphabets.
 \bibliographystyle{acmtrans}
 \bibliography{root}
-\newpage
+%\newpage
-\begin{appendix}
+%\begin{appendix}
-\section{Appendix$^\star$}
+%\section{Appendix$^\star$}
-\renewcommand{\thefootnote}{\mbox{$\star$}}
+%\renewcommand{\thefootnote}{\mbox{$\star$}}
-\footnotetext{If the reviewers deem more suitable, the authors are
+%\footnotetext{If the reviewers deem more suitable, the authors are
-prepared to drop material or move it to an electronic appendix.}
+%prepared to drop material or move it to an electronic appendix.}
-\begin{proof}[of Lemma~\ref{arden}]
+%\begin{proof}[of Lemma~\ref{arden}]
-For the right-to-left direction we assume @{thm (rhs) reversed_Arden} and show
+%For the right-to-left direction we assume @{thm (rhs) reversed_Arden} and show
-that @{thm (lhs) reversed_Arden} holds. From Property~\ref{langprops}@{text "(i)"}
+%that @{thm (lhs) reversed_Arden} holds. From Property~\ref{langprops}@{text "(i)"}
-we have @{term "A\<star> = A \<cdot> A\<star> \<union> {[]}"},
+%we have @{term "A\<star> = A \<cdot> A\<star> \<union> {[]}"},
-which is equal to @{term "A\<star> = A\<star> \<cdot> A \<union> {[]}"}. Adding @{text B} to both
+%which is equal to @{term "A\<star> = A\<star> \<cdot> A \<union> {[]}"}. Adding @{text B} to both
-sides gives @{term "B \<cdot> A\<star> = B \<cdot> (A\<star> \<cdot> A \<union> {[]})"}, whose right-hand side
+%sides gives @{term "B \<cdot> A\<star> = B \<cdot> (A\<star> \<cdot> A \<union> {[]})"}, whose right-hand side
-is equal to @{term "(B \<cdot> A\<star>) \<cdot> A \<union> B"}. Applying the assumed equation
+%is equal to @{term "(B \<cdot> A\<star>) \<cdot> A \<union> B"}. Applying the assumed equation
-completes this direction.
+%completes this direction.
-For the other direction we assume @{thm (lhs) reversed_Arden}. By a simple induction
+%For the other direction we assume @{thm (lhs) reversed_Arden}. By a simple induction
-on @{text n}, we can establish the property
+%on @{text n}, we can establish the property
-\begin{center}
+%\begin{center}
-@{text "(*)"}\hspace{5mm} @{thm (concl) reversed_arden_helper}
+%@{text "("}@{text "*"}@{text ")"}\hspace{5mm} @{thm (concl) reversed_arden_helper}
-\end{center}
+%\end{center}
-\noindent
+%\noindent
-Using this property we can show that @{term "B \<cdot> (A \<up> n) \<subseteq> X"} holds for
+%Using this property we can show that @{term "B \<cdot> (A \<up> n) \<subseteq> X"} holds for
-all @{text n}. From this we can infer @{term "B \<cdot> A\<star> \<subseteq> X"} using the definition
+%all @{text n}. From this we can infer @{term "B \<cdot> A\<star> \<subseteq> X"} using the definition
-of @{text "\<star>"}.
+%of @{text "\<star>"}.
-For the inclusion in the other direction we assume a string @{text s}
+%For the inclusion in the other direction we assume a string @{text s}
-with length @{text k} is an element in @{text X}. Since @{thm (prem 1) reversed_Arden}
+%with length @{text k} is an element in @{text X}. Since @{thm (prem 1) reversed_Arden}
-we know by Property~\ref{langprops}@{text "(ii)"} that
+%we know by Property~\ref{langprops}@{text "(ii)"} that
-@{term "s \<notin> X \<cdot> (A \<up> Suc k)"} since its length is only @{text k}
+%@{term "s \<notin> X \<cdot> (A \<up> Suc k)"} since its length is only @{text k}
-(the strings in @{term "X \<cdot> (A \<up> Suc k)"} are all longer).
+%(the strings in @{term "X \<cdot> (A \<up> Suc k)"} are all longer).
-From @{text "(*)"} it follows then that
+%From @{text "("}@{text "*"}@{text ")"} it follows then that
-@{term s} must be an element in @{term "(\<Union>m\<le>k. B \<cdot> (A \<up> m))"}. This in turn
+%@{term s} must be an element in @{term "(\<Union>m\<le>k. B \<cdot> (A \<up> m))"}. This in turn
-implies that @{term s} is in @{term "(\<Union>n. B \<cdot> (A \<up> n))"}. Using Property~\ref{langprops}@{text "(iii)"}
+%implies that @{term s} is in @{term "(\<Union>n. B \<cdot> (A \<up> n))"}. Using Property~\ref{langprops}@{text "(iii)"}
-this is equal to @{term "B \<cdot> A\<star>"}, as we needed to show.
+%this is equal to @{term "B \<cdot> A\<star>"}, as we needed to show.
-\end{proof}
+%\end{proof}
-\end{appendix}
+% \end{appendix}
 *}
 (*<*)
 end

changeset 379	8c4b6fb43ebe
parent 378	a0bcf886b8ef
child 381	99161cd17c0f