regexp: comparison Paper/Paper.thy

equal deleted inserted replaced

-:1589bf5c1ad8
+:63bc9f9d96ba
 different approach to regular languages. Instead of defining a regular language as one
 where there exists an automaton that recognises all strings of the language, we define
 a regular language as:
 \begin{definition}[A Regular Language]
-A language @{text A} is regular, provided there is a regular expression that matches all
+A language @{text A} is \emph{regular}, provided there is a regular expression that matches all
 strings of @{text "A"}.
 \end{definition}
 \noindent
 The reason is that regular expressions, unlike graphs and matrices, can
 @{text "\<lbrakk>x\<rbrakk>\<^isub>\<approx>"} for the equivalence class defined
 as @{text "{y | y \<approx> x}"}.
 Central to our proof will be the solution of equational systems
-involving regular expressions. For this we will use Arden's lemma \cite{Brzozowski64}
+involving sets of languages. For this we will use Arden's lemma \cite{Brzozowski64}
 which solves equations of the form @{term "X = A ;; X \<union> B"} provided
 @{term "[] \<notin> A"}. However we will need the following ``reverse''
 version of Arden's lemma.
 \begin{lemma}[Reverse Arden's Lemma]\label{arden}\mbox{}\\
 *}
 section {* Finite Partitions Imply Regularity of a Language *}
 text {*
-The central definition in the Myhill-Nerode theorem is the
+The key definition in the Myhill-Nerode theorem is the
 \emph{Myhill-Nerode relation}, which states that w.r.t.~a language two
 strings are related, provided there is no distinguishing extension in this
 language. This can be defined as:
 \begin{definition}[Myhill-Nerode Relation]\mbox{}\\
 \noindent
 It is easy to see that @{term "\<approx>A"} is an equivalence relation, which
 partitions the set of all strings, @{text "UNIV"}, into a set of disjoint
 equivalence classes. One direction of the Myhill-Nerode theorem establishes
 that if there are finitely many equivalence classes, then the language is
-regular. In our setting we have therefore
+regular. In our setting we therefore have to show:
 \begin{theorem}\label{myhillnerodeone}
 @{thm[mode=IfThen] hard_direction}
 \end{theorem}
 \begin{equation}
 @{thm finals_def}
 \end{equation}
 \noindent
-It is straightforward to show that @{thm lang_is_union_of_finals} holds.
+It is straightforward to show that @{thm lang_is_union_of_finals} and
+@{thm finals_included_in_UNIV} hold.
 Therefore if we know that there exists a regular expression for every
 equivalence class in @{term "finals A"} (which by assumption must be
-a finite set), then we can just combine these regular expressions with @{const ALT}
+a finite set), then we can combine these regular expressions with @{const ALT}
 and obtain a regular expression that matches every string in @{text A}.
-We prove Thm.~\ref{myhillnerodeone} by calculating a regular expression for
+We prove Thm.~\ref{myhillnerodeone} by giving a method that can calculate a
-\emph{all} equivalence classes, not just the ones in @{term "finals A"}. We
+regular expression for \emph{every} equivalence classe, not just the ones
+in @{term "finals A"}. We
 first define a notion of \emph{transition} between equivalence classes
 %
 \begin{equation}
 @{thm transition_def}
 \end{equation}
 \noindent
 which means that if we concatenate all strings matching the regular expression @{text r}
 to the end of all strings in the equivalence class @{text Y}, we obtain a subset of
-@{text X}. Note that we do not define any automaton here, we merely relate two sets
+@{text X}. Note that we do not define an automaton here, we merely relate two sets
-w.r.t.~a regular expression.
+(w.r.t.~a regular expression).
 Next we build an equational system that
 contains an equation for each equivalence class. Suppose we have
 the equivalence classes @{text "X\<^isub>1,\<dots>,X\<^isub>n"}, there must be one and only one that
 contains the empty string @{text "[]"} (since equivalence classes are disjoint).
-Let us assume @{text "[] \<in> X\<^isub>1"}. We can build the following equational system
+Let us assume @{text "[] \<in> X\<^isub>1"}. We build the following equational system
 \begin{center}
 \begin{tabular}{rcl}
 @{text "X\<^isub>1"} & @{text "="} & @{text "(Y\<^isub>1\<^isub>1, CHAR c\<^isub>1\<^isub>1) + \<dots> + (Y\<^isub>1\<^isub>p, CHAR c\<^isub>1\<^isub>p) + \<lambda>(EMPTY)"} \\
 @{text "X\<^isub>2"} & @{text "="} & @{text "(Y\<^isub>2\<^isub>1, CHAR c\<^isub>2\<^isub>1) + \<dots> + (Y\<^isub>2\<^isub>o, CHAR c\<^isub>2\<^isub>o)"} \\
 \noindent
 where the pairs @{text "(Y\<^isub>i\<^isub>j, r\<^isub>i\<^isub>j)"} stand for all transitions
 @{term "Y\<^isub>i\<^isub>j \<Turnstile>r\<^isub>i\<^isub>j\<Rightarrow> X\<^isub>i"}.  The term @{text "\<lambda>(EMPTY)"} acts as a marker for the equivalence
 class containing @{text "[]"}. (Note that we mark, roughly speaking, the
 single ``initial'' state in the equational system, which is different from
-the method by Brzozowski \cite{Brzozowski64}, which marks the ``terminal''
+the method by Brzozowski \cite{Brzozowski64}, since for his purposes he needs to mark
-states.) Overloading the function @{text L} for the two kinds of terms in the
+the ``terminal'' states.) Overloading the function @{text L} for the two kinds of terms in the
 equational system as follows
 \begin{center}
-@{thm L_rhs_e.simps(1)[where r="r", THEN eq_reflection]}\hspace{20mm}
+@{thm L_rhs_e.simps(2)[where X="Y" and r="r", THEN eq_reflection]}\hspace{10mm}
-@{thm L_rhs_e.simps(2)[where X="X" and r="r", THEN eq_reflection]}
+@{thm L_rhs_e.simps(1)[where r="r", THEN eq_reflection]}
 \end{center}
 \noindent
 we can prove for @{text "X\<^isub>2\<^isub>.\<^isub>.\<^isub>n"} that the following equations
 %
 \begin{equation}\label{inv2}
 @{text "X\<^isub>1 = L(Y\<^isub>i\<^isub>1, CHAR c\<^isub>i\<^isub>1) \<union> \<dots> \<union> L(Y\<^isub>i\<^isub>p, CHAR c\<^isub>i\<^isub>p) \<union> L(\<lambda>(EMPTY))"}.
 \end{equation}
 \noindent
-hold. The reason for adding the @{text \<lambda>}-marker to our equational system is
+The reason for adding the @{text \<lambda>}-marker to our equational system is
-to obtain this equations, which only holds in this form since none of
+to obtain this equation, which only holds in this form since none of
-the other terms contain the empty string. Our proof of Thm.~\ref{myhillnerodeone}
+the other terms contain the empty string.
+Our proof of Thm.~\ref{myhillnerodeone}
 will be by transforming the equational system into a \emph{solved form}
 maintaining the invariants \eqref{inv1} and \eqref{inv2}. From the solved
 form we will be able to read off the regular expressions using our
 variant of Arden's Lemma (Lem.~\ref{arden}).

changeset 77	63bc9f9d96ba
parent 75	d63baacbdb16
child 79	bba9c80735f9