--- a/Journal/Paper.thy Mon Mar 04 21:01:55 2013 +0000
+++ b/Journal/Paper.thy Fri Jul 05 12:07:48 2013 +0100
@@ -2452,62 +2452,82 @@
algorithm is still executable. Given the infrastructure for
executable sets introduced by \citeN{Haftmann09} in Isabelle/HOL, it should.
- We started out by claiming that in a theorem prover it is eaiser to reason
- about regular expressions than about automta. Here are some numbers:
- Our formalisation of the Myhill-Nerode Theorem consists of 780 lines of
- Isabelle/Isar code for the first direction and 460 for the second (the one
- based on tagging-functions), plus around 300 lines of standard material
- about regular languages. The formalisation of derivatives and partial
- derivatives shown in Section~\ref{derivatives} consists of 390 lines of
- code. The closure properties in Section~\ref{closure} (except Theorem~\ref{subseqreg})
- can be established in
- 100 lines of code. The Continuation Lemma and the non-regularity of @{text "a\<^sup>n b\<^sup>n"}
- require 70 lines of code.
- The algorithm for solving equational systems, which we
- used in the first direction, is conceptually relatively simple. Still the
- use of sets over which the algorithm operates means it is not as easy to
- formalise as one might wish. However, it seems sets cannot be avoided since
- the `input' of the algorithm consists of equivalence classes and we cannot
- see how to reformulate the theory so that we can use lists or matrices. Lists would be
- much easier to reason about, since we can define functions over them by
- recursion. For sets we have to use set-comprehensions, which is slightly
- unwieldy. Matrices would allow us to use the slick formalisation by
- \citeN{Nipkow11} of the Gauss-Jordan algorithm.
+ We started out by claiming that in a theorem prover it is eaiser to
+ reason about regular expressions than about automta. Here are some
+ numbers: Our formalisation of the Myhill-Nerode Theorem consists of
+ 780 lines of Isabelle/Isar code for the first direction and 460 for
+ the second (the one based on tagging-functions), plus around 300
+ lines of standard material about regular languages. The
+ formalisation of derivatives and partial derivatives shown in
+ Section~\ref{derivatives} consists of 390 lines of code. The
+ closure properties in Section~\ref{closure} (except
+ Theorem~\ref{subseqreg}) can be established in 100 lines of
+ code. The Continuation Lemma and the non-regularity of @{text "a\<^sup>n
+ b\<^sup>n"} require 70 lines of code. The algorithm for solving equational
+ systems, which we used in the first direction, is conceptually
+ relatively simple. Still the use of sets over which the algorithm
+ operates means it is not as easy to formalise as one might
+ wish. However, it seems sets cannot be avoided since the `input' of
+ the algorithm consists of equivalence classes and we cannot see how
+ to reformulate the theory so that we can use lists or
+ matrices. Lists would be much easier to reason about, since we can
+ define functions over them by recursion. For sets we have to use
+ set-comprehensions, which is slightly unwieldy. Matrices would allow
+ us to use the slick formalisation by \citeN{Nipkow11} of the
+ Gauss-Jordan algorithm.
- While our formalisation might appear large, it should be seen
- in the context of the work done by \citeN{Constable00} who
- formalised the Myhill-Nerode Theorem in Nuprl using automata. They write
- that their four-member team would need something on the magnitude of 18 months
- for their formalisation of the first eleven chapters of the textbook by \citeN{HopcroftUllman69},
- which includes the Myhill-Nerode theorem. It is hard to gauge the size of a
- formalisation in Nurpl, but from what is shown in the Nuprl Math Library
- about their development it seems \emph{substantially} larger than ours. We attribute
- this to our use of regular expressions, which meant we did not need to `fight'
- the theorem prover. Recently, \citeN{LammichTuerk12} formalised Hopcroft's
- algorithm in Isabelle/HOL (in 7000 lines of code) using an automata
- library of 27000 lines of code.
- Also, \citeN{Filliatre97} reports that his formalisation in
- Coq of automata theory and Kleene's theorem is ``rather big''.
- \citeN{Almeidaetal10} reported about another
- formalisation of regular languages in Coq. Their
- main result is the
+ % OLD
+ %While our formalisation might appear large, it should be seen in the
+ %context of the work done by \citeN{Constable00} who formalised the
+ %Myhill-Nerode Theorem in Nuprl using automata. They write that their
+ %four-member team would need something on the magnitude of 18 months
+ %for their formalisation of the first eleven chapters of the textbook
+ %by \citeN{HopcroftUllman69}, which includes the Myhill-Nerode
+ %theorem. It is hard to gauge the size of a formalisation in Nurpl,
+ %but from what is shown in the Nuprl Math Library about their
+ %development it seems \emph{substantially} larger than ours. We
+ %attribute this to our use of regular expressions, which meant we did
+ %not need to `fight' the theorem prover.
+
+ %%% NEW
+ While our formalisation might appear large, it should be seen in the
+ context of the work done by \citeN{Constable00} who formalised the
+ Myhill-Nerode Theorem in Nuprl using automata. They choose to formalise the
+ this theorem, because it gives them state minimization of automata
+ as a corollary. It is hard to gauge the size of a formalisation in Nurpl,
+ but from what is shown in the Nuprl Math Library about this
+ development it seems \emph{substantially} larger than ours. We
+ attribute this to our use of regular expressions, which meant we did
+ not need to `fight' the theorem prover.
+ %
+ Recently,
+ \citeN{LammichTuerk12} formalised Hopcroft's algorithm in
+ Isabelle/HOL (in 7000 lines of code) using an automata library of
+ 27000 lines of code. Also, \citeN{Filliatre97} reports that his
+ formalisation in Coq of automata theory and Kleene's theorem is
+ ``rather big''. \citeN{Almeidaetal10} reported about another
+ formalisation of regular languages in Coq. Their main result is the
correctness of Mirkin's construction of an automaton from a regular
- expression using partial derivatives. This took approximately 10600 lines
- of code. \citeN{Braibant12} formalised a large part of regular language
- theory and Kleene algebras in Coq. While he is mainly interested
- in implementing decision procedures for Kleene algebras, his library
- includes a proof of the Myhill-Nerode theorem. He reckons that our
- ``development is more concise'' than his one based on matrices \cite[Page 67]{Braibant12}.
- He writes that there is no conceptual problems with formally reasoning
- about matrices for automata, but notes ``intrinsic difficult[ies]'' when working
- with matrices in Coq, which is the sort of `fighting' one would encounter also
- in other theorem provers.
-
+ expression using partial derivatives. This took approximately 10600
+ lines of code. \citeN{Braibant12} formalised a large part of
+ regular language theory and Kleene algebras in Coq. While he is
+ mainly interested in implementing decision procedures for Kleene
+ algebras, his library includes a proof of the Myhill-Nerode
+ theorem. He reckons that our ``development is more concise'' than
+ his one based on matrices \cite[Page 67]{Braibant12}. He writes
+ that there is no conceptual problems with formally reasoning about
+ matrices for automata, but notes ``intrinsic difficult[ies]'' when
+ working with matrices in Coq, which is the sort of `fighting' one
+ would encounter also in other theorem provers.
In terms of time, the estimate for our formalisation is that we
needed approximately 3 months and this included the time to find our proof
arguments. Unlike \citeN{Constable00}, who were able to follow the Myhill-Nerode
- proof by \citeN{HopcroftUllman69}, we had to find our own arguments. So for us the
- formalisation was not the bottleneck. The code of
+ proof by \citeN{HopcroftUllman69}, we had to find our own arguments.
+ So for us the formalisation was not the bottleneck. The conclusion we draw
+ from all these comparisons is that if one is interested in formalising
+ results from regular language theory, not results from automata theory,
+ then regular expressions are easier to work with formally.
+ The code of
our formalisation \cite{myhillnerodeafp11} can be found in the Archive of Formal Proofs at
\mbox{\url{http://afp.sourceforge.net/entries/Myhill-Nerode.shtml}}.\smallskip