--- a/Journal/Paper.thy Wed Aug 17 17:36:19 2011 +0000
+++ b/Journal/Paper.thy Fri Aug 19 06:57:57 2011 +0000
@@ -150,17 +150,17 @@
A popular choice for a theorem prover would be one based on Higher-Order
Logic (HOL), for example HOL4, HOLlight or Isabelle/HOL. For the development
- presented in this paper we will use the latter. HOL is a predicate calculus
+ presented in this paper we will use the Isabelle/HOL. HOL is a predicate calculus
that allows quantification over predicate variables. Its type system is
based on Church's Simple Theory of Types \cite{Church40}. Although many
mathematical concepts can be conveniently expressed in HOL, there are some
limitations that hurt badly, if one attempts a simple-minded formalisation
of regular languages in it.
- The typical approach to regular languages is to introduce finite
- deterministic automata and then define everything in terms of them \cite{
- HopcroftUllman69,Kozen97}. For example, a regular language is normally
- defined as:
+ The typical approach (for example \cite{HopcroftUllman69,Kozen97}) to
+ regular languages is to introduce finite deterministic automata and then
+ define everything in terms of them. For example, a regular language is
+ normally defined as:
\begin{dfntn}\label{baddef}
A language @{text A} is \emph{regular}, provided there is a
@@ -315,8 +315,8 @@
larger formalisations of automata theory are carried out in Nuprl
\cite{Constable00} and in Coq \cite{Filliatre97}.
- Also one might consider automata theory and regular languages as a well-worn
- stock subject where everything is crystal clear. However, paper proofs about
+ Also one might consider automata theory and regular languages as a meticulously
+ researched subject where everything is crystal clear. However, paper proofs about
automata often involve subtle side-conditions which are easily overlooked,
but which make formal reasoning rather painful. For example Kozen's proof of
the Myhill-Nerode theorem requires that automata do not have inaccessible
@@ -440,11 +440,11 @@
involving equivalence classes of languages. For this we will use Arden's Lemma
(see for example \cite[Page 100]{Sakarovitch09}),
which solves equations of the form @{term "X = A \<cdot> X \<union> B"} provided
- @{term "[] \<notin> A"}. However we will need the following `reverse'
- version of Arden's Lemma (`reverse' in the sense of changing the order of @{term "A \<cdot> X"} to
+ @{term "[] \<notin> A"}. However we will need the following `reversed'
+ version of Arden's Lemma (`reversed' in the sense of changing the order of @{term "A \<cdot> X"} to
\mbox{@{term "X \<cdot> A"}}).
- \begin{lmm}[Reverse Arden's Lemma]\label{arden}\mbox{}\\
+ \begin{lmm}[Reversed Arden's Lemma]\label{arden}\mbox{}\\
If @{thm (prem 1) arden} then
@{thm (lhs) arden} if and only if
@{thm (rhs) arden}.
@@ -456,7 +456,8 @@
we have @{term "A\<star> = A \<cdot> A\<star> \<union> {[]}"},
which is equal to @{term "A\<star> = A\<star> \<cdot> A \<union> {[]}"}. Adding @{text B} to both
sides gives @{term "B \<cdot> A\<star> = B \<cdot> (A\<star> \<cdot> A \<union> {[]})"}, whose right-hand side
- is equal to @{term "(B \<cdot> A\<star>) \<cdot> A \<union> B"}. This completes this direction.
+ is equal to @{term "(B \<cdot> A\<star>) \<cdot> A \<union> B"}. Applying the assumed equation
+ completes this direction.
For the other direction we assume @{thm (lhs) arden}. By a simple induction
on @{text n}, we can establish the property
@@ -540,7 +541,6 @@
text {*
\noindent
- \footnote{Folklore: Henzinger (arden-DFA-regexp.pdf); Hofmann}
The key definition in the Myhill-Nerode theorem is the
\emph{Myhill-Nerode relation}, which states that w.r.t.~a language two
strings are related, provided there is no distinguishing extension in this
@@ -662,7 +662,7 @@
way, because the Myhill-Nerode relation determines the `direction' of the
transitions---the successor `state' of an equivalence class @{text Y} can
be reached by adding a character to the end of @{text Y}. This is also the
- reason why we have to use our reverse version of Arden's Lemma.}
+ reason why we have to use our reversed version of Arden's Lemma.}
In our running example we have the initial equational system
\begin{equation}\label{exmpcs}
@@ -794,7 +794,7 @@
\noindent
- That means we eliminated the dependency of @{text "X\<^isub>3"} on the
+ That means we eliminated the recursive occurrence of @{text "X\<^isub>3"} on the
right-hand side. Note we used the abbreviation
@{text "\<^raw:\ensuremath{\bigplus}>{ATOM c\<^isub>1,\<dots>, ATOM c\<^isub>m}"}
to stand for a regular expression that matches with every character. In
@@ -1224,7 +1224,7 @@
\begin{proof}
We set in \eqref{finiteimageD}, @{text f} to be @{text "X \<mapsto> tag ` X"}. We have
@{text "range f"} to be a subset of @{term "Pow (range tag)"}, which we know must be
- finite by assumption. Now @{term "f (UNIV // =tag=)"} is a subset of @{text "range f"},
+ finite by assumption. Now @{term "f ` (UNIV // =tag=)"} is a subset of @{text "range f"},
and so also finite. Injectivity amounts to showing that @{text "X = Y"} under the
assumptions that @{text "X, Y \<in> "}~@{term "UNIV // =tag="} and @{text "f X = f Y"}.
From the assumptions we obtain \mbox{@{text "x \<in> X"}} and @{text "y \<in> Y"} with
@@ -1899,7 +1899,7 @@
Let us now return to our proof for the second direction in the Myhill-Nerode
theorem. The point of the above calculations is to use
- @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"} as tagging-relation.
+ @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. pders x r)\<^esub>"} as tagging-relation.
\begin{proof}[Proof of Theorem~\ref{myhillnerodetwo} (second version)]
@@ -1911,7 +1911,7 @@
\end{center}
\noindent
- which means the tagging-relation @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"} refines @{term "\<approx>(lang r)"}.
+ which means the tagging-relation @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. pders x r)\<^esub>"} refines @{term "\<approx>(lang r)"}.
So we know by Lemma~\ref{fintwo}, \mbox{@{term "finite (UNIV // (\<approx>(lang r)))"}}
holds if @{term "finite (UNIV // (=(\<lambda>x. pders x r)=))"}. In order to establish
the latter, we can use Lemma~\ref{finone} and show that the range of the
@@ -1925,7 +1925,7 @@
\noindent
Now the range of @{term "\<lambda>x. pders x r"} is a subset of @{term "Pow (pders_lang UNIV r)"},
which we know is finite by Theorem~\ref{antimirov}. Consequently there
- are only finitely many equivalence classes of @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"},
+ are only finitely many equivalence classes of @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. pders x r)\<^esub>"},
which refines @{term "\<approx>(lang r)"}, and therefore we can again conclude the
second part of the Myhill-Nerode theorem.
\end{proof}
@@ -2107,8 +2107,8 @@
algebraic method} used to convert a finite automaton to a regular expression
\cite{Brzozowski64}. The close connection can be seen by considering the
equivalence classes as the states of the minimal automaton for the regular
- language. However there are some subtle differences. Since we identify
- equivalence classes with the states of the automaton, then the most natural
+ language. However there are some subtle differences. Because our equivalence
+ classes (or correspondingly states) arise from the Myhil-Nerode Relation, the most natural
choice is to characterise each state with the set of strings starting from
the initial state leading up to that state. Usually, however, the states are
characterised as the strings starting from that state leading to the
@@ -2116,7 +2116,7 @@
equational system is set up. We have the $\lambda$-term on our `initial
state', while Brzozowski has it on the terminal states. This means we also
need to reverse the direction of Arden's Lemma. We have not found anything
- in the literature about this way of proving the first direction of the
+ in the literature about our way of proving the first direction of the
Myhill-Nerode theorem, but it appears to be folklore.
We presented two proofs for the second direction of the Myhill-Nerode
@@ -2151,7 +2151,7 @@
Theorem~\ref{antimirov}). However, since partial derivatives use sets of
regular expressions, one needs to carefully analyse whether the resulting
algorithm is still executable. Given the existing infrastructure for
- executable sets in Isabelle/HOL, it should.
+ executable sets in Isabelle/HOL \cite{Haftmann09}, it should.
Our formalisation of the Myhill-Nerode theorem consists of 780 lines of
Isabelle/Isar code for the first direction and 460 for the second (the one
--- a/Journal/document/root.bib Wed Aug 17 17:36:19 2011 +0000
+++ b/Journal/document/root.bib Fri Aug 19 06:57:57 2011 +0000
@@ -1,3 +1,13 @@
+
+
+
+@PhdThesis{Haftmann09,
+ author = {F.~Haftmann},
+ title = {{C}ode {G}eneration from {S}pecifications in {H}igher-{O}rder {L}ogic},
+ school = {Technical University of Munich},
+ year = {2009}
+}
+
@article{Harper99,
author = {R.~Harper},
title = {{P}roof-{D}irected {D}ebugging},