Journal/Paper.thy
changeset 381 99161cd17c0f
parent 379 8c4b6fb43ebe
child 383 c8cb6914f4c8
   561   "\<lbrakk>x\<rbrakk>\<^isub>\<approx> = \<lbrakk>y\<rbrakk>\<^isub>\<approx>"}.
   561   "\<lbrakk>x\<rbrakk>\<^isub>\<approx> = \<lbrakk>y\<rbrakk>\<^isub>\<approx>"}.
   562 
   562 
   563 
   563 
   564   Central to our proof will be the solution of equational systems
   564   Central to our proof will be the solution of equational systems
   565   involving equivalence classes of languages. For this we will use Arden's Lemma 
   565   involving equivalence classes of languages. For this we will use Arden's Lemma 
   566   (see for example \cite[Page 100]{Sakarovitch09}),
   566   (see for example \citet[Page 100]{Sakarovitch09}),
   567   which solves equations of the form @{term "X = A \<cdot> X \<union> B"} provided
   567   which solves equations of the form @{term "X = A \<cdot> X \<union> B"} provided
   568   @{term "[] \<notin> A"}. However we will need the following `reversed' 
   568   @{term "[] \<notin> A"}. However we will need the following `reversed' 
   569   version of Arden's Lemma (`reversed' in the sense of changing the order of @{term "A \<cdot> X"} to
   569   version of Arden's Lemma (`reversed' in the sense of changing the order of @{term "A \<cdot> X"} to
   570   \mbox{@{term "X \<cdot> A"}}).
   570   \mbox{@{term "X \<cdot> A"}}).
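  For orientation, both versions can be summarised in their usual textbook
  form as follows (stated here only as a reminder; the formal statement of the
  reversed lemma is the one we actually prove and use): provided @{term "[] \<notin> A"},

  \begin{center}
  $X = A\cdot X \cup B$~~iff~~$X = A^\star\cdot B$
  \hspace{10mm}
  $X = X\cdot A \cup B$~~iff~~$X = B\cdot A^\star$
  \end{center}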
  %\footnote{The details of the proof for the reversed Arden's Lemma
  [...]

  \noindent
  It is easy to see that @{term "\<approx>A"} is an equivalence relation, which
  partitions the set of all strings, @{text "UNIV"}, into a set of disjoint
  equivalence classes. To illustrate this quotient construction, let us give a simple
  example: consider the regular language built up over the alphabet @{term "{a, b}"} and
  containing just the two strings @{text "[a]"} and @{text "[a, b]"}. The
  relation @{term "\<approx>({[a], [a, b]})"} partitions @{text UNIV}
  into four equivalence classes @{text "X\<^isub>1"}, @{text "X\<^isub>2"}, @{text "X\<^isub>3"} and @{text "X\<^isub>4"}
  as follows

  \begin{center}
  \begin{tabular}{l}
  @{text "X\<^isub>1 = {[]}"}\\
  @{text "X\<^isub>2 = {[a]}"}\\
  @{text "X\<^isub>3 = {[a, b]}"}\\
  @{text "X\<^isub>4 = UNIV - {[], [a], [a, b]}"}
  \end{tabular}
  \end{center}
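  As a quick cross-check outside the formalisation, this partition can also be
  computed by brute force: two strings are related by the Myhill-Nerode
  relation exactly when they have the same residual, that is the same set of
  extensions leading into the language, and for a finite language this
  residual is directly computable. The following Python sketch (illustrative
  only, not part of the formal development) reproduces the four classes.

  \begin{verbatim}
  from itertools import product

  A = {"a", "ab"}                      # the running example over {a, b}

  def residual(x):
      # {z | x @ z in A}; it determines the equivalence class of x
      return frozenset(w[len(x):] for w in A if w.startswith(x))

  strings = ["".join(p) for n in range(4) for p in product("ab", repeat=n)]
  classes = {}
  for x in strings:
      classes.setdefault(residual(x), []).append(x)

  for members in classes.values():
      print(sorted(members, key=len)[:4])
  # four classes: [''], ['a'], ['ab'] and all remaining strings
  \end{verbatim}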

  One direction of the Myhill-Nerode Theorem establishes
  that if there are finitely many equivalence classes, like in the example above, then
  [...]
  \begin{equation}
  @{thm finals_def}
  \end{equation}

  \noindent
  In our running example, @{text "X\<^isub>2"} and @{text "X\<^isub>3"} are the only
  equivalence classes in @{term "finals {[a], [a, b]}"}.
  It is straightforward to show that in general
  %
  \begin{equation}\label{finalprops}
  @{thm lang_is_union_of_finals}\hspace{15mm}
  @{thm finals_in_partitions}
  [...]
  strings in the equivalence class @{text Y}, we obtain a subset of
  @{text X}.
  %Note that we do not define formally an automaton here,
  %we merely relate two sets
  %(with the help of a character).
  In our concrete example we have

  \begin{center}
  \begin{tabular}{l}
  @{term "X\<^isub>1 \<Turnstile>a\<Rightarrow> X\<^isub>2"},\; @{term "X\<^isub>1 \<Turnstile>b\<Rightarrow> X\<^isub>4"};\\
  @{term "X\<^isub>2 \<Turnstile>b\<Rightarrow> X\<^isub>3"},\; @{term "X\<^isub>2 \<Turnstile>a\<Rightarrow> X\<^isub>4"};\\
  @{term "X\<^isub>3 \<Turnstile>a\<Rightarrow> X\<^isub>4"},\; @{term "X\<^isub>3 \<Turnstile>b\<Rightarrow> X\<^isub>4"} and\\
  @{term "X\<^isub>4 \<Turnstile>a\<Rightarrow> X\<^isub>4"},\; @{term "X\<^isub>4 \<Turnstile>b\<Rightarrow> X\<^isub>4"}.
  \end{tabular}
  \end{center}
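  These transitions can be read off mechanically from class representatives:
  since the Myhill-Nerode relation is right-invariant, appending a character
  to any representative of a class determines the target class. Continuing the
  illustrative Python sketch from above (again not part of the formal
  development):

  \begin{verbatim}
  A = {"a", "ab"}

  def residual(x):
      return frozenset(w[len(x):] for w in A if w.startswith(x))

  reps = {"X1": "", "X2": "a", "X3": "ab", "X4": "b"}   # representatives
  name = {residual(r): n for n, r in reps.items()}

  for n, r in reps.items():
      for c in "ab":
          print(n, "|=" + c + "=>", name[residual(r + c)])
  # X1 |=a=> X2, X1 |=b=> X4, X2 |=a=> X4, X2 |=b=> X3, and so on,
  # matching the table above (up to the order of the transitions)
  \end{verbatim}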

  Next we construct an \emph{initial equational system} that
  contains an equation for each equivalence class. We first give
  an informal description of this construction. Suppose we have
  the equivalence classes @{text "X\<^isub>1,\<dots>,X\<^isub>n"}; then there must be one and only one that
  [...]
  In our running example we have the initial equational system
  %
  \begin{equation}\label{exmpcs}
  \mbox{\begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
  @{term "X\<^isub>1"} & @{text "="} & @{text "\<lambda>(ONE)"}\\
  @{term "X\<^isub>2"} & @{text "="} & @{text "(X\<^isub>1, ATOM a)"}\\
  @{term "X\<^isub>3"} & @{text "="} & @{text "(X\<^isub>2, ATOM b)"}\\
  @{term "X\<^isub>4"} & @{text "="} & @{text "(X\<^isub>1, ATOM b) + (X\<^isub>2, ATOM a) + (X\<^isub>3, ATOM a)"}\\
          & & \mbox{}\hspace{0mm}@{text "+ (X\<^isub>3, ATOM b) + (X\<^isub>4, ATOM a) + (X\<^isub>4, ATOM b)"}\\
  \end{tabular}}
  \end{equation}
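  For the running example this system can be generated mechanically from the
  transitions shown earlier: every transition from a class @{text "Y"} to a class
  @{text "X"} under a character @{text "c"} contributes the term @{text "(Y, ATOM c)"}
  to the equation of @{text "X"}, and the class containing the empty string, here
  @{text "X\<^isub>1"}, additionally receives the term @{text "\<lambda>(ONE)"}. A continuation of the
  illustrative Python sketch (outside the formalisation, with @{text "lam"}
  standing in for @{text "\<lambda>"}):

  \begin{verbatim}
  A = {"a", "ab"}

  def residual(x):
      return frozenset(w[len(x):] for w in A if w.startswith(x))

  reps = {"X1": "", "X2": "a", "X3": "ab", "X4": "b"}
  name = {residual(r): n for n, r in reps.items()}

  eqs = {n: (["lam(ONE)"] if r == "" else []) for n, r in reps.items()}
  for n, r in reps.items():               # every transition Y |=c=> X ...
      for c in "ab":
          eqs[name[residual(r + c)]].append("(" + n + ", ATOM " + c + ")")

  for n in reps:                          # ... contributes a term to X
      print(n, "=", " + ".join(eqs[n]))
  # reproduces the four equations displayed above
  \end{verbatim}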

  \noindent
  Overloading the function @{text \<calL>} for the two kinds of terms in the
  equational system, we have

  \begin{center}
  @{text "\<calL>(Y, r) \<equiv>"} %
  [...]

  \noindent
  In this definition, we first delete all terms of the form @{text "(X, r)"} from @{text rhs};
  then we calculate the combined regular expressions for all @{text r} coming
  from the deleted @{text "(X, r)"}, and take the @{const Star} of it;
  finally we append this regular expression to @{text rhs'}. If we apply this
  operation to the right-hand side of @{text "X\<^isub>4"} in \eqref{exmpcs}, we obtain
  the equation:

  \begin{center}
  \begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
  @{term "X\<^isub>4"} & @{text "="} &
    @{text "(X\<^isub>1, TIMES (ATOM b) (STAR \<^raw:\ensuremath{\bigplus}>{ATOM a, ATOM b})) +"}\\
  & & @{text "(X\<^isub>2, TIMES (ATOM a) (STAR \<^raw:\ensuremath{\bigplus}>{ATOM a, ATOM b})) +"}\\
  & & @{text "(X\<^isub>3, TIMES (ATOM a) (STAR \<^raw:\ensuremath{\bigplus}>{ATOM a, ATOM b})) +"}\\
  & & @{text "(X\<^isub>3, TIMES (ATOM b) (STAR \<^raw:\ensuremath{\bigplus}>{ATOM a, ATOM b}))"}
  \end{tabular}
  \end{center}


  \noindent
  That means we eliminated the recursive occurrence of @{text "X\<^isub>4"} on the
  right-hand side.
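  The effect of this operation can also be phrased in a few lines of Python on
  an ad hoc string representation of terms (a sketch only, not the formal
  definition; @{text "BIGPLUS"} stands in for the big-plus operator used
  above): the terms of the form @{text "(X, r)"} are removed, their regular
  expressions are combined and put under a @{const Star}, and the result is
  appended to every remaining term.

  \begin{verbatim}
  def arden(X, rhs):                    # rhs: a list of (class, regex) terms
      deleted = [r for (Y, r) in rhs if Y == X]
      kept    = [(Y, r) for (Y, r) in rhs if Y != X]
      star    = "STAR (BIGPLUS {" + ", ".join(deleted) + "})"
      return [(Y, "TIMES (" + r + ") (" + star + ")") for (Y, r) in kept]

  X4 = [("X1", "ATOM b"), ("X2", "ATOM a"), ("X3", "ATOM a"),
        ("X3", "ATOM b"), ("X4", "ATOM a"), ("X4", "ATOM b")]

  for Y, r in arden("X4", X4):
      print("(" + Y + ", " + r + ")")
  # (X1, TIMES (ATOM b) (STAR (BIGPLUS {ATOM a, ATOM b}))) and so on, as in
  # the right-hand side displayed above; lambda-terms, not present in this
  # example, would have the regular expression appended in the same way.
  \end{verbatim}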

  It can easily be seen that the @{text "Arden"}-operation mimics Arden's
  Lemma on the level of equations. To ensure the non-emptiness condition of
  Arden's Lemma we say that a right-hand side is @{text ardenable} provided

  [...]
  \eqref{exmpcs} into the right-hand side of the second, thus eliminating the equivalence
  class @{text "X\<^isub>1"}, gives us the equation
  %
  \begin{equation}\label{exmpresult}
  \mbox{\begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
  @{term "X\<^isub>2"} & @{text "="} & @{text "\<lambda>(TIMES ONE (ATOM a))"}
  \end{tabular}}
  \end{equation}
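  The substitution step is equally mechanical. Continuing the ad hoc Python
  sketch (a much-simplified illustration of the operations defined in the
  formalisation), with a term being either a pair of a class name and a
  regular expression or a lambda-term, marked here by the string LAM:

  \begin{verbatim}
  def subst(rhs, X, x_rhs):
      # replace every term (X, r) by X's right-hand side, with r appended
      out = []
      for (Y, r) in rhs:
          if Y == X:
              out += [(Z, "TIMES (" + s + ") (" + r + ")") for (Z, s) in x_rhs]
          else:
              out.append((Y, r))
      return out

  X1_rhs = [("LAM", "ONE")]             # X1 = lam(ONE)
  X2_rhs = [("X1", "ATOM a")]           # X2 = (X1, ATOM a)
  print(subst(X2_rhs, "X1", X1_rhs))
  # [('LAM', 'TIMES (ONE) (ATOM a)')], i.e. the equation displayed above
  \end{verbatim}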

  With these two operations in place, we can define the operation that removes one equation
  from an equational system @{text ES}. The operation @{const Subst_all}
  [...]

  Lemma~\ref{every_eqcl_has_reg} allows us to finally give a proof for the first direction
  of the Myhill-Nerode Theorem.

  \begin{proof}[of Theorem~\ref{myhillnerodeone}]
  By Lemma~\ref{every_eqcl_has_reg} we know that there exists a regular expression for
  every equivalence class in @{term "UNIV // \<approx>A"}:

  \[
  \mbox{if}\;@{term "X \<in> (UNIV // \<approx>A)"}\;\mbox{then there exists a}\;
  @{text "r"}\;\mbox{such that}\;@{term "X = lang r"}
  \]\smallskip

  \noindent
  Since @{text "finals A"} is
  a subset of @{term "UNIV // \<approx>A"}, we also know that for every equivalence class
  in @{term "finals A"} there exists a regular expression. Moreover by assumption
  we know that @{term "finals A"} must be finite, and therefore there must be a finite
  set of regular expressions @{text "rs"} such that
  @{term "\<Union>(finals A) = lang (\<Uplus>rs)"}.
  Since the left-hand side is equal to @{text A}, we can use @{term "\<Uplus>rs"}
  as the regular expression that is needed in the theorem.
  \end{proof}

  \noindent
  Solving the equational system of our running example gives the following solutions for the
  two final equivalence classes:

  \begin{center}
  \begin{tabular}{r@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
  @{text "X\<^isub>2"} & @{text "="} & @{text "TIMES ONE (ATOM a)"}\\
  @{text "X\<^isub>3"} & @{text "="} & @{text "TIMES (TIMES ONE (ATOM a)) (ATOM b)"}
  \end{tabular}
  \end{center}

  \noindent
  Combining them with $\bigplus$ gives us a regular expression for the language @{text "{[a], [a, b]}"}.

  Note that our algorithm for solving equational systems also provides a method for
  calculating a regular expression for the complement of a regular language:
  if we combine all regular
  expressions corresponding to equivalence classes not in @{term "finals A"}
  (in the running example @{text "X\<^isub>1"} and @{text "X\<^isub>4"}),
  then we obtain a regular expression for the complement language @{term "- A"}.
  This is similar to the usual construction of a `complement automaton'.
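  As a small sanity check outside the formalisation, one can verify by brute
  force that the two non-final classes together are indeed the complement of
  the running-example language, here on all strings over the alphabet up to
  length three:

  \begin{verbatim}
  from itertools import product

  A = {"a", "ab"}

  def residual(x):
      return frozenset(w[len(x):] for w in A if w.startswith(x))

  X1, X4 = residual(""), residual("b")   # representatives [] and [b]
  for n in range(4):
      for p in product("ab", repeat=n):
          x = "".join(p)
          assert (residual(x) in {X1, X4}) == (x not in A)
  print("X1 and X4 together form the complement of A on all tested strings")
  \end{verbatim}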
       

*}

section {* Myhill-Nerode, Second Part *}

text {*
  [...]
  provided @{text "R\<^isub>1 \<subseteq> R\<^isub>2"}.
  \end{definition}

  \noindent
  For constructing @{text R}, we will rely on some \emph{tagging-functions}
  defined over strings, see Fig.~\ref{tagfig}. They are parameterised by sets
  of strings @{text A} and @{text B} standing for the induction hypotheses.
  Given the inductive hypotheses, it will be easy to
  prove that the \emph{range} of these tagging-functions is finite. The range
  of a function @{text f} is defined as

  \begin{center}
  @{text "range f \<equiv> f ` UNIV"}
  \end{center}

  \noindent
  [...]

  \noindent
  It states that if an image of a set under an injective function @{text f} (injective over this set)
  is finite, then the set @{text A} itself must be finite. We can use it to establish the following
  two lemmas.

  \begin{figure}[t]
  \normalsize
  \begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
  @{thm (lhs) tag_Plus_def[where A="A" and B="B", THEN meta_eq_app]} &
  @{text "\<equiv>"} &
  @{thm (rhs) tag_Plus_def[where A="A" and B="B", THEN meta_eq_app]} \\
  @{thm (lhs) tag_Times_def[where ?A="A" and ?B="B"]}\;@{text "x"} & @{text "\<equiv>"} &
  @{text "(\<lbrakk>x\<rbrakk>\<^bsub>\<approx>A\<^esub>, {\<lbrakk>x\<^isub>s\<rbrakk>\<^bsub>\<approx>B\<^esub> | x\<^isub>p \<in> A \<and> (x\<^isub>p, x\<^isub>s) \<in> Partitions x})"}\\
  @{thm (lhs) tag_Star_def[where ?A="A", THEN meta_eq_app]} & @{text "\<equiv>"} &
  @{text "{\<lbrakk>x\<^isub>s\<rbrakk>\<^bsub>\<approx>A\<^esub> | x\<^isub>p < x \<and> x\<^isub>p \<in> A\<^isup>\<star> \<and> (x\<^isub>p, x\<^isub>s) \<in> Partitions x}"}
  \end{tabular}
  \caption{Three tagging functions used in the cases for @{term "PLUS"}, @{term "TIMES"} and @{term "STAR"}
  regular expressions. The sets @{text A} and @{text B} are arbitrary languages instantiated
  in the proof with the induction hypotheses. \emph{Partitions} is a function
  that generates all possible partitions of a string.\label{tagfig}}
  \end{figure}


  \begin{lemma}\label{finone}
  @{thm[mode=IfThen] finite_eq_tag_rel}
  \end{lemma}

  [...]

  \noindent
  Chaining Lemmas~\ref{finone} and \ref{fintwo} together means that in order to show
  that @{term "UNIV // \<approx>(lang r)"} is finite, we have to construct a tagging-function whose
  range can be shown to be finite and whose tagging-relation refines @{term "\<approx>(lang r)"}.
  Let us attempt the @{const PLUS}-case first. We take as tagging-function the one from Fig.~\ref{tagfig}

  \begin{center}
  @{thm tag_Plus_def[where A="A" and B="B", THEN meta_eq_app]}
  \end{center}
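  Before going through the formal argument, here is an informal sanity check
  (a Python sketch on two small finite languages, not part of the proof) that
  this tagging-function behaves as intended: whenever two strings receive the
  same tag, they have the same residual with respect to the union of
  @{text A} and @{text B}, which is exactly what it means for the
  tagging-relation to refine the Myhill-Nerode relation of the union.

  \begin{verbatim}
  from itertools import product

  A, B = {"a", "ab"}, {"b", "ba"}        # two small finite languages

  def residual(lang, x):
      return frozenset(w[len(x):] for w in lang if w.startswith(x))

  def tag_plus(x):                       # (class of x wrt A, class of x wrt B)
      return (residual(A, x), residual(B, x))

  strings = ["".join(p) for n in range(4) for p in product("ab", repeat=n)]
  for x in strings:
      for y in strings:
          if tag_plus(x) == tag_plus(y):
              assert residual(A | B, x) == residual(A | B, y)
  print("equal tags imply Myhill-Nerode equivalence wrt the union")
  \end{verbatim}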
  [...]
  This will solve the second case.
  Taking the two requirements, @{text "("}@{text "*"}@{text ")"} and @{text "(**)"}, together we define the
  tagging-function in the @{const Times}-case as:

  \begin{center}
  @{thm (lhs) tag_Times_def[where ?A="A" and ?B="B"]}\;@{text "x"}~@{text "\<equiv>"}~
  @{text "(\<lbrakk>x\<rbrakk>\<^bsub>\<approx>A\<^esub>, {\<lbrakk>x\<^isub>s\<rbrakk>\<^bsub>\<approx>B\<^esub> | x\<^isub>p \<in> A \<and> (x\<^isub>p, x\<^isub>s) \<in> Partitions x})"}
  \end{center}
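  The same kind of informal check as in the @{const PLUS}-case can be run for
  this definition (again a Python sketch on small finite languages, not part
  of the proof): equal tags imply equal residuals with respect to the
  concatenation of @{text A} and @{text B}.

  \begin{verbatim}
  from itertools import product

  A, B = {"a", "ab"}, {"b", "ba"}
  AB   = {u + v for u in A for v in B}   # the concatenation of A and B

  def residual(lang, x):
      return frozenset(w[len(x):] for w in lang if w.startswith(x))

  def tag_times(x):
      # class of x wrt A, plus the B-classes of those suffixes of x
      # whose corresponding prefix lies in A
      suffixes = frozenset(residual(B, x[i:])
                           for i in range(len(x) + 1) if x[:i] in A)
      return (residual(A, x), suffixes)

  strings = ["".join(p) for n in range(5) for p in product("ab", repeat=n)]
  for x in strings:
      for y in strings:
          if tag_times(x) == tag_times(y):
              assert residual(AB, x) == residual(AB, y)
  print("equal tags imply Myhill-Nerode equivalence wrt the concatenation")
  \end{verbatim}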

  \noindent
  Note that we have to make the assumption for all suffixes @{text "x\<^isub>s"}, since we do
  1677   "y\<^isub>s @ z\<^isub>a \<in> A"}. We also know that @{term "z\<^isub>b \<in> A\<star>"}.
  1723   "y\<^isub>s @ z\<^isub>a \<in> A"}. We also know that @{term "z\<^isub>b \<in> A\<star>"}.
  1678   Therefore @{term "y\<^isub>p @ (y\<^isub>s @ z\<^isub>a) @ z\<^isub>b \<in>
  1724   Therefore @{term "y\<^isub>p @ (y\<^isub>s @ z\<^isub>a) @ z\<^isub>b \<in>
  1679   A\<star>"}, which means @{term "y @ z \<in> A\<star>"}. The last step is to set
  1725   A\<star>"}, which means @{term "y @ z \<in> A\<star>"}. The last step is to set
  1680   @{text "A"} to @{term "lang r"} and thus complete the proof.
  1726   @{text "A"} to @{term "lang r"} and thus complete the proof.
  1681   \end{proof}
  1727   \end{proof}
       
  1728 
       
  1729   \begin{rmk}
       
  1730   While our proof using tagging functions might seem like a rabbit pulled 
       
  1731   out of a hat, the intuition behind can be rationalised taking the 
       
  1732   view that the equivalence classes @{term "UNIV // \<approx>(lang r)"} stand for the 
       
  1733   states of the minimal automaton for the language @{term "lang r"}. The theorem 
       
  1734   amounts to showing that this minimal automaton has finitely many states. 
       
  1735   However, by showing that our @{text "\<^raw:$\threesim$>\<^bsub>tag\<^esub>"} relation 
       
  1736   refines @{term "\<approx>A"} we do not actually have to show that the minimal automata
       
  1737   has finitely many states, but only that we can show finiteness for an 
       
  1738   automaton with at least as many states and which can recognise the same 
       
  1739   language. By performing the induction over the regular expression @{text r},
       
  1740   this means we have to construct inductively an automaton in
       
  1741   the cases for @{term PLUS}, @{term TIMES} and @{term STAR}.
       
  1742 
       
  1743   In the @{text PLUS}-case, we can assume finitely many equivalence classes of the form
       
  1744   @{text "\<lbrakk>_\<rbrakk>\<^bsub>\<approx>A\<^esub>"} and @{text "\<lbrakk>_\<rbrakk>\<^bsub>\<approx>B\<^esub>"}, each standing for an automaton recognising the 
       
  1745   languages @{text A} and @{text B} respectively. 
       
  1746   The @{text "+tag"} function constructs pairs of the form @{text "(\<lbrakk>_\<rbrakk>\<^bsub>\<approx>A\<^esub>, \<lbrakk>_\<rbrakk>\<^bsub>\<approx>B\<^esub>)"} 
       
  1747   which can be seen as the states of an automaton recognising the language 
       
  1748   \mbox{@{term "A \<union> B"}}. This is the usual product construction of automata: 
       
  1749   Given a string @{text x}, the first automata will be in the state  @{text "\<lbrakk>x\<rbrakk>\<^bsub>\<approx>A\<^esub>"}
       
  1750   after recognising @{text x} (similarly @{text "\<lbrakk>x\<rbrakk>\<^bsub>\<approx>B\<^esub>"} for the other automaton). The product
       
  1751   automaton will be in the state \mbox{@{text "(\<lbrakk>x\<rbrakk>\<^bsub>\<approx>A\<^esub>, \<lbrakk>x\<rbrakk>\<^bsub>\<approx>B\<^esub>)"}}, which is accepting
       
  1752   if at least one component is an accepting state.
       
  1753 
       
  1754   The @{text "TIMES"}-case is a bit more complicated---essentially we 
       
  1755   need to sequentially compose the two automata with the states @{text "\<lbrakk>_\<rbrakk>\<^bsub>\<approx>A\<^esub>"}, 
       
  1756   respectively @{text "\<lbrakk>_\<rbrakk>\<^bsub>\<approx>B\<^esub>"}. We achieve this sequential composition by
       
  1757   taking the first automaton @{text "\<lbrakk>_\<rbrakk>\<^bsub>\<approx>A\<^esub>"} and append on each of its accepting
       
  1758   state the automaton @{text "\<lbrakk>_\<rbrakk>\<^bsub>\<approx>B\<^esub>"}. Unfortunately, this does not lead directly
       
  1759   to a correct composition, since the automaton @{text "\<lbrakk>_\<rbrakk>\<^bsub>\<approx>A\<^esub>"} might have consumed
       
  1760   some of the input needed for the automaton @{text "\<lbrakk>_\<rbrakk>\<^bsub>\<approx>B\<^esub>"} to succeed. The 
       
  1761   textbook construction for sequentially composing two automata circumvents this
       
  1762   problem by connecting the terminating states of the first automaton via
       
  1763   an epsilon-transition to the initial state of the second automaton (see for
       
  1764   example \citeN{Kozen97}). In the absence of any epsilon-transition analogue in our work, 
       
  1765   we let the second automaton
       
  1766   start in all possible states that may have been reached if the first
       
  1767   automaton had not consumed the input meant for the second. We achieve this
       
  1768   by having a set of states as the second component generated by our 
       
  1769   @{text "\<times>tag"} function (see second clause in Fig.~\ref{tagfig}).
       
  1770   A state of this sequentially composed automaton is accepting, if the first 
       
  1771   component is accepting and at least one state in the set is also accepting.
       
  1772 
       
  1773   The idea behind the @{text "STAR"}-case is similar to the @{text "TIMES"}-case.
       
  1774   We assume some automaton has consumed some strictly smaller part of the input;
       
  1775   we need to check that from the state we ended up in a terminal state in the 
       
  1776   automaton @{text "\<lbrakk>_\<rbrakk>\<^bsub>\<approx>A\<^esub>"}. Since we do not know from which state this will 
       
  1777   succeed, we need to run the automaton from all possible states we could have 
       
  1778   ended up in. Therefore the @{text "\<star>tag"} function generates a set of states.
       
  1779   \end{rmk}
  1682 *}
  1780 *}
  1683 
  1781 
  1684 section {* Second Part proved using Partial Derivatives *}
  1782 section {* Second Part proved using Partial Derivatives *}
  1685 
  1783 
  1686 text {*
  1784 text {*
  [...]
  The first step in our proof of Theorem~\ref{subseqreg} is to establish the
  following simple properties for @{term SUPSEQ}
  %
  \begin{equation}\label{supseqprops}
  \mbox{\begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
  @{thm (lhs) SUPSEQ_simps(1)} & @{text "="} & @{thm (rhs) SUPSEQ_simps(1)}\\
  @{thm (lhs) SUPSEQ_simps(2)} & @{text "="} & @{thm (rhs) SUPSEQ_simps(2)}\\
  @{thm (lhs) SUPSEQ_atom} & @{text "="} & @{thm (rhs) SUPSEQ_atom}\\
  @{thm (lhs) SUPSEQ_union} & @{text "="} & @{thm (rhs) SUPSEQ_union}\\
  @{thm (lhs) SUPSEQ_conc} & @{text "="} & @{thm (rhs) SUPSEQ_conc}\\
  @{thm (lhs) SUPSEQ_star} & @{text "="} & @{thm (rhs) SUPSEQ_star}
  \end{tabular}}
  \end{equation}

  \noindent
  whereby the last equation follows from the fact that @{term "A\<star>"} contains the
  [...]
  \noindent
  In this paper we took the view that a regular language is one where there
  exists a regular expression that matches all of its strings. Regular
  expressions can conveniently be defined as a datatype in theorem
  provers. For us it was therefore interesting to find out how far we can push
  this point of view. But this question of whether regular language theory can
  be done without automata also crops up in non-theorem-prover contexts. Recall
  Gasarch's speculation cited in the Introduction.
  We have established in Isabelle/HOL both directions
  of the Myhill-Nerode Theorem.
  %
  \begin{theorem}[Myhill-Nerode Theorem]\mbox{}\\
  A language @{text A} is regular if and only if @{thm (rhs) Myhill_Nerode}.
  [...]
  regular---by establishing that it has infinitely many equivalence classes
  generated by the Myhill-Nerode Relation (this is usually the purpose of the
  Pumping Lemma).  We can also use it to establish the standard
  textbook results about closure properties of regular languages. The case of
  closure under complement follows easily from the Myhill-Nerode Theorem.
  So our answer to Gasarch is ``{\it yes we can!}''

  %Our insistence on regular expressions for proving the Myhill-Nerode Theorem
  %arose from the problem of defining formally the regularity of a language.
  %In order to guarantee consistency,
  %formalisations in HOL can only extend the logic with definitions that introduce a new concept in
  [...]
  state', while Brzozowski has it on the terminal states. This means we also
  need to reverse the direction of Arden's Lemma. We have not found anything
  in the `pencil-and-paper-reasoning' literature about our way of proving the
  first direction of the Myhill-Nerode Theorem, but it appears to be folklore.

  We presented two proofs for the second direction of the
  Myhill-Nerode Theorem: one direct proof using tagging-functions and
  another using partial derivatives. This part of our work is where
  our method using regular expressions shines, because we can perform
  an induction on the structure of regular expressions. However, it is
  also the direction where we had to spend most of the `conceptual'
  time, as our first proof based on tagging-functions is new for
  establishing the Myhill-Nerode Theorem. All standard proofs of this
  direction proceed by arguments over automata.

  The indirect proof for the second direction arose from our interest in
  Brzozowski's derivatives for regular expression matching.  While \citeN{Brzozowski64}
  already established that there are only
  finitely many dissimilar derivatives for every regular expression, this
  [...]