regexp: comparison Journal/Paper.thy

equal deleted inserted replaced

-:204856ef5573
+:9fbf6d9f85ae
 HopcroftUllman69}). It seems natural to exercise theorem provers by
 formalising the theorems and by verifying formally the algorithms.
 A popular choice for a theorem prover would be one based on Higher-Order
 Logic (HOL), for example HOL4, HOLlight or Isabelle/HOL. For the development
-presented in this paper we will use the latter. HOL is a predicate calculus
+presented in this paper we will use the Isabelle/HOL. HOL is a predicate calculus
 that allows quantification over predicate variables. Its type system is
 based on Church's Simple Theory of Types \cite{Church40}.  Although many
 mathematical concepts can be conveniently expressed in HOL, there are some
 limitations that hurt badly, if one attempts a simple-minded formalisation
 of regular languages in it.
-The typical approach to regular languages is to introduce finite
+The typical approach (for example \cite{HopcroftUllman69,Kozen97}) to
-deterministic automata and then define everything in terms of them \cite{
+regular languages is to introduce finite deterministic automata and then
-HopcroftUllman69,Kozen97}.  For example, a regular language is normally
+define everything in terms of them.  For example, a regular language is
-defined as:
+normally defined as:
 \begin{dfntn}\label{baddef}
 A language @{text A} is \emph{regular}, provided there is a
 finite deterministic automaton that recognises all strings of @{text "A"}.
 \end{dfntn}
 lexing. Berghofer and Reiter \cite{BerghoferReiter09} formalise automata
 working over bit strings in the context of Presburger arithmetic.  The only
 larger formalisations of automata theory are carried out in Nuprl
 \cite{Constable00} and in Coq \cite{Filliatre97}.
-Also one might consider automata theory and regular languages as a well-worn
+Also one might consider automata theory and regular languages as a meticulously
-stock subject where everything is crystal clear. However, paper proofs about
+researched subject where everything is crystal clear. However, paper proofs about
 automata often involve subtle side-conditions which are easily overlooked,
 but which make formal reasoning rather painful. For example Kozen's proof of
 the Myhill-Nerode theorem requires that automata do not have inaccessible
 states \cite{Kozen97}. Another subtle side-condition is completeness of
 automata, that is automata need to have total transition functions and at
 Central to our proof will be the solution of equational systems
 involving equivalence classes of languages. For this we will use Arden's Lemma
 (see for example \cite[Page 100]{Sakarovitch09}),
 which solves equations of the form @{term "X = A \<cdot> X \<union> B"} provided
-@{term "[] \<notin> A"}. However we will need the following `reverse'
+@{term "[] \<notin> A"}. However we will need the following `reversed'
-version of Arden's Lemma (`reverse' in the sense of changing the order of @{term "A \<cdot> X"} to
+version of Arden's Lemma (`reversed' in the sense of changing the order of @{term "A \<cdot> X"} to
 \mbox{@{term "X \<cdot> A"}}).
-\begin{lmm}[Reverse Arden's Lemma]\label{arden}\mbox{}\\
+\begin{lmm}[Reversed Arden's Lemma]\label{arden}\mbox{}\\
 If @{thm (prem 1) arden} then
 @{thm (lhs) arden} if and only if
 @{thm (rhs) arden}.
 \end{lmm}
 For the right-to-left direction we assume @{thm (rhs) arden} and show
 that @{thm (lhs) arden} holds. From Property~\ref{langprops}@{text "(i)"}
 we have @{term "A\<star> = A \<cdot> A\<star> \<union> {[]}"},
 which is equal to @{term "A\<star> = A\<star> \<cdot> A \<union> {[]}"}. Adding @{text B} to both
 sides gives @{term "B \<cdot> A\<star> = B \<cdot> (A\<star> \<cdot> A \<union> {[]})"}, whose right-hand side
-is equal to @{term "(B \<cdot> A\<star>) \<cdot> A \<union> B"}. This completes this direction.
+is equal to @{term "(B \<cdot> A\<star>) \<cdot> A \<union> B"}. Applying the assumed equation
+completes this direction.
 For the other direction we assume @{thm (lhs) arden}. By a simple induction
 on @{text n}, we can establish the property
 \begin{center}
 section {* The Myhill-Nerode Theorem, First Part *}
 text {*
 \noindent
-\footnote{Folklore: Henzinger (arden-DFA-regexp.pdf); Hofmann}
 The key definition in the Myhill-Nerode theorem is the
 \emph{Myhill-Nerode relation}, which states that w.r.t.~a language two
 strings are related, provided there is no distinguishing extension in this
 language. This can be defined as a tertiary relation.
 the method by Brzozowski \cite{Brzozowski64}, where he marks the
 `terminal' states. We are forced to set up the equational system in our
 way, because the Myhill-Nerode relation determines the `direction' of the
 transitions---the successor `state' of an equivalence class @{text Y} can
 be reached by adding a character to the end of @{text Y}. This is also the
-reason why we have to use our reverse version of Arden's Lemma.}
+reason why we have to use our reversed version of Arden's Lemma.}
 In our running example we have the initial equational system
 \begin{equation}\label{exmpcs}
 \mbox{\begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
 @{term "X\<^isub>1"} & @{text "="} & @{text "\<lambda>(ONE)"}\\
 \end{tabular}
 \end{center}
 \noindent
-That means we eliminated the dependency of @{text "X\<^isub>3"} on the
+That means we eliminated the recursive occurrence of @{text "X\<^isub>3"} on the
 right-hand side.  Note we used the abbreviation
 @{text "\<^raw:\ensuremath{\bigplus}>{ATOM c\<^isub>1,\<dots>, ATOM c\<^isub>m}"}
 to stand for a regular expression that matches with every character. In
 our algorithm we are only interested in the existence of such a regular expression
 and do not specify it any further.
 \end{lmm}
 \begin{proof}
 We set in \eqref{finiteimageD}, @{text f} to be @{text "X \<mapsto> tag ` X"}. We have
 @{text "range f"} to be a subset of @{term "Pow (range tag)"}, which we know must be
-finite by assumption. Now @{term "f (UNIV // =tag=)"} is a subset of @{text "range f"},
+finite by assumption. Now @{term "f ` (UNIV // =tag=)"} is a subset of @{text "range f"},
 and so also finite. Injectivity amounts to showing that @{text "X = Y"} under the
 assumptions that @{text "X, Y \<in> "}~@{term "UNIV // =tag="} and @{text "f X = f Y"}.
 From the assumptions we obtain \mbox{@{text "x \<in> X"}} and @{text "y \<in> Y"} with
 @{text "tag x = tag y"}. Since @{text x} and @{text y} are tag-related, this in
 turn means that the equivalence classes @{text X}
 \eqref{pdersunivone} involve some non-routine induction arguments), we omit
 the details.
 Let us now return to our proof for the second direction in the Myhill-Nerode
 theorem. The point of the above calculations is to use
-@{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"} as tagging-relation.
+@{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. pders x r)\<^esub>"} as tagging-relation.
 \begin{proof}[Proof of Theorem~\ref{myhillnerodetwo} (second version)]
 Using \eqref{mhders}
 and \eqref{Derspders} we can easily infer that
 \begin{center}
 @{term "x \<approx>(lang r) y"}\hspace{4mm}\mbox{provided}\hspace{4mm}@{term "pders x r = pders y r"}
 \end{center}
 \noindent
-which means the tagging-relation @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"} refines @{term "\<approx>(lang r)"}.
+which means the tagging-relation @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. pders x r)\<^esub>"} refines @{term "\<approx>(lang r)"}.
 So we know by Lemma~\ref{fintwo}, \mbox{@{term "finite (UNIV // (\<approx>(lang r)))"}}
 holds if @{term "finite (UNIV // (=(\<lambda>x. pders x r)=))"}. In order to establish
 the latter, we can use Lemma~\ref{finone} and show that the range of the
 tagging-function \mbox{@{term "\<lambda>x. pders x r"}} is finite. For this recall Definition
 \ref{Pdersdef}, which gives us that
 \end{center}
 \noindent
 Now the range of @{term "\<lambda>x. pders x r"} is a subset of @{term "Pow (pders_lang UNIV r)"},
 which we know is finite by Theorem~\ref{antimirov}. Consequently there
-are only finitely many equivalence classes of @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"},
+are only finitely many equivalence classes of @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. pders x r)\<^esub>"},
 which refines @{term "\<approx>(lang r)"}, and therefore we can again conclude the
 second part of the Myhill-Nerode theorem.
 \end{proof}
 *}
 Our proof of the first direction is very much inspired by \emph{Brzozowski's
 algebraic method} used to convert a finite automaton to a regular expression
 \cite{Brzozowski64}. The close connection can be seen by considering the
 equivalence classes as the states of the minimal automaton for the regular
-language.  However there are some subtle differences. Since we identify
+language.  However there are some subtle differences. Because our equivalence
-equivalence classes with the states of the automaton, then the most natural
+classes (or correspondingly states) arise from the Myhil-Nerode Relation, the most natural
 choice is to characterise each state with the set of strings starting from
 the initial state leading up to that state. Usually, however, the states are
 characterised as the strings starting from that state leading to the
 terminal states.  The first choice has consequences about how the initial
 equational system is set up. We have the $\lambda$-term on our `initial
 state', while Brzozowski has it on the terminal states. This means we also
 need to reverse the direction of Arden's Lemma. We have not found anything
-in the literature about this way of proving the first direction of the
+in the literature about our way of proving the first direction of the
 Myhill-Nerode theorem, but it appears to be folklore.
 We presented two proofs for the second direction of the Myhill-Nerode
 theorem. One direct proof using tagging-functions and another using partial
 derivatives. This part of our work is where our method using regular
 instead of derivatives, and that termination of the algorithm can be
 formally established (the main incredience is
 Theorem~\ref{antimirov}). However, since partial derivatives use sets of
 regular expressions, one needs to carefully analyse whether the resulting
 algorithm is still executable. Given the existing infrastructure for
-executable sets in Isabelle/HOL, it should.
+executable sets in Isabelle/HOL \cite{Haftmann09}, it should.
 Our formalisation of the Myhill-Nerode theorem consists of 780 lines of
 Isabelle/Isar code for the first direction and 460 for the second (the one
 based on tagging-functions), plus around 300 lines of standard material
 about regular languages. The formalisation of derivatives and partial

changeset 201	9fbf6d9f85ae
parent 200	204856ef5573
child 203	5d724fe0e096