Journal/Paper.thy
changeset 378 a0bcf886b8ef
parent 377 4f303da0cd2a
child 379 8c4b6fb43ebe
--- a/Journal/Paper.thy	Mon Mar 04 21:01:55 2013 +0000
+++ b/Journal/Paper.thy	Fri Jul 05 12:07:48 2013 +0100
@@ -2452,62 +2452,82 @@
   algorithm is still executable. Given the infrastructure for
   executable sets  introduced by \citeN{Haftmann09} in Isabelle/HOL, it should.
 
-  We started out by claiming that in a theorem prover it is eaiser to reason 
-  about regular expressions than about automta. Here are some numbers:
-  Our formalisation of the Myhill-Nerode Theorem consists of 780 lines of
-  Isabelle/Isar code for the first direction and 460 for the second (the one
-  based on tagging-functions), plus around 300 lines of standard material
-  about regular languages. The formalisation of derivatives and partial
-  derivatives shown in Section~\ref{derivatives} consists of 390 lines of
-  code.  The closure properties in Section~\ref{closure} (except Theorem~\ref{subseqreg}) 
-  can be established in
-  100 lines of code. The Continuation Lemma and the non-regularity of @{text "a\<^sup>n b\<^sup>n"} 
-  require 70 lines of code.
-  The algorithm for solving equational systems, which we
-  used in the first direction, is conceptually relatively simple. Still the
-  use of sets over which the algorithm operates means it is not as easy to
-  formalise as one might wish. However, it seems sets cannot be avoided since
-  the `input' of the algorithm consists of equivalence classes and we cannot
-  see how to reformulate the theory so that we can use lists or matrices. Lists would be
-  much easier to reason about, since we can define functions over them by
-  recursion. For sets we have to use set-comprehensions, which is slightly
-  unwieldy. Matrices would allow us to use the slick formalisation by 
-  \citeN{Nipkow11} of the Gauss-Jordan algorithm.
+  We started out by claiming that in a theorem prover it is easier to
+  reason about regular expressions than about automata. Here are some
+  numbers: Our formalisation of the Myhill-Nerode Theorem consists of
+  780 lines of Isabelle/Isar code for the first direction and 460 for
+  the second (the one based on tagging-functions), plus around 300
+  lines of standard material about regular languages. The
+  formalisation of derivatives and partial derivatives shown in
+  Section~\ref{derivatives} consists of 390 lines of code.  The
+  closure properties in Section~\ref{closure} (except
+  Theorem~\ref{subseqreg}) can be established in 100 lines of
+  code. The Continuation Lemma and the non-regularity of @{text "a\<^sup>n b\<^sup>n"}
+  require 70 lines of code. The algorithm for solving equational
+  systems, which we used in the first direction, is conceptually
+  relatively simple. Still, the use of sets over which the algorithm
+  operates means it is not as easy to formalise as one might
+  wish. However, it seems sets cannot be avoided since the `input' of
+  the algorithm consists of equivalence classes and we cannot see how
+  to reformulate the theory so that we can use lists or
+  matrices. Lists would be much easier to reason about, since we can
+  define functions over them by recursion. For sets we have to use
+  set-comprehensions, which is slightly unwieldy. Matrices would allow
+  us to use the slick formalisation by \citeN{Nipkow11} of the
+  Gauss-Jordan algorithm.
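+
+  To give a feel for this contrast, here is a small illustrative
+  sketch in Isabelle source syntax; it is not part of our
+  formalisation, and the names @{text "Unions_list"} and
+  @{text "Unions_set"} are hypothetical. The union of a \emph{list}
+  of languages can be defined by recursion, whereas for a \emph{set}
+  of languages one has to resort to a comprehension.
+
+  \begin{quote}
+  \begin{verbatim}
+  (* illustrative sketch only: union of a list of languages,
+     defined by recursion; an induction principle comes for free *)
+  fun Unions_list :: "'a list set list \<Rightarrow> 'a list set" where
+    "Unions_list [] = {}"
+  | "Unions_list (A # As) = A \<union> Unions_list As"
+
+  (* the same operation over a set of languages has to be
+     written as a set-comprehension *)
+  definition Unions_set :: "'a list set set \<Rightarrow> 'a list set" where
+    "Unions_set As = {w. \<exists>A \<in> As. w \<in> A}"
+  \end{verbatim}
+  \end{quote}
+
+  Properties of @{text "Unions_list"} can be proved by structural
+  induction on the list, while properties of @{text "Unions_set"} have
+  to be established via separate lemmas about the comprehension; this
+  is the slight unwieldiness alluded to above.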
 
-  While our formalisation might appear large, it should be seen
-  in the context of the work done by \citeN{Constable00} who
-  formalised the Myhill-Nerode Theorem in Nuprl using automata. They write
-  that their four-member team would need something on the magnitude of 18 months
-  for their formalisation of the first eleven chapters of the textbook by \citeN{HopcroftUllman69}, 
-  which includes the Myhill-Nerode theorem. It is hard to gauge the size of a
-  formalisation in Nurpl, but from what is shown in the Nuprl Math Library
-  about their development it seems \emph{substantially} larger than ours. We attribute
-  this to our use of regular expressions, which meant we did not need to `fight' 
-  the theorem prover. Recently, \citeN{LammichTuerk12} formalised Hopcroft's 
-  algorithm in Isabelle/HOL (in 7000 lines of code) using an automata
-  library of 27000 lines of code.  
-  Also, \citeN{Filliatre97} reports that his formalisation in 
-  Coq of automata theory and Kleene's theorem is ``rather big''. 
-  \citeN{Almeidaetal10}  reported about another 
-  formalisation of regular languages in Coq. Their 
-  main result is the
+  While our formalisation might appear large, it should be seen in the
+  context of the work done by \citeN{Constable00} who formalised the
+  Myhill-Nerode Theorem in Nuprl using automata. They chose to formalise
+  this theorem because it gives them state minimisation of automata
+  as a corollary. It is hard to gauge the size of a formalisation in Nuprl,
+  but from what is shown in the Nuprl Math Library about this
+  development it seems \emph{substantially} larger than ours. We
+  attribute this to our use of regular expressions, which meant we did
+  not need to `fight' the theorem prover. 
+  %
+  Recently,
+  \citeN{LammichTuerk12} formalised Hopcroft's algorithm in
+  Isabelle/HOL (in 7000 lines of code) using an automata library of
+  27000 lines of code.  Also, \citeN{Filliatre97} reports that his
+  formalisation in Coq of automata theory and Kleene's theorem is
+  ``rather big''.  \citeN{Almeidaetal10} report on another
+  formalisation of regular languages in Coq. Their main result is the
   correctness of Mirkin's construction of an automaton from a regular
-  expression using partial derivatives. This took approximately 10600 lines
-  of code.  \citeN{Braibant12} formalised a large part of regular language
-  theory and Kleene algebras in Coq. While he is mainly interested
-  in implementing decision procedures for Kleene algebras, his library
-  includes a proof of the Myhill-Nerode theorem. He reckons that our 
-  ``development is more concise'' than his one based on matrices \cite[Page 67]{Braibant12}.
-  He writes that there is no conceptual problems with formally reasoning
-  about matrices for automata, but notes ``intrinsic difficult[ies]'' when working 
-  with matrices in Coq, which is the sort of `fighting' one would encounter also
-  in other theorem provers.
-
+  expression using partial derivatives. This took approximately 10600
+  lines of code.  \citeN{Braibant12} formalised a large part of
+  regular language theory and Kleene algebras in Coq. While he is
+  mainly interested in implementing decision procedures for Kleene
+  algebras, his library includes a proof of the Myhill-Nerode
+  theorem. He reckons that our ``development is more concise'' than
+  his, which is based on matrices \cite[Page 67]{Braibant12}.  He writes
+  that there are no conceptual problems with formally reasoning about
+  matrices for automata, but notes ``intrinsic difficult[ies]'' when
+  working with matrices in Coq, which is the sort of `fighting' one
+  would also encounter in other theorem provers.
   In terms of time, the estimate for our formalisation is that we
   needed approximately 3 months and this included the time to find our proof
   arguments. Unlike \citeN{Constable00}, who were able to follow the Myhill-Nerode 
-  proof by \citeN{HopcroftUllman69}, we had to find our own arguments.  So for us the
-  formalisation was not the bottleneck.  The code of
+  proof by \citeN{HopcroftUllman69}, we had to find our own arguments.  
+  So for us the formalisation was not the bottleneck. The conclusion we draw 
+  from all these comparisons is that if one is interested in formalising
+  results from regular language theory, rather than results from automata
+  theory, then regular expressions are easier to work with formally.
+  The code of
   our formalisation \cite{myhillnerodeafp11} can be found in the Archive of Formal Proofs at 
   \mbox{\url{http://afp.sourceforge.net/entries/Myhill-Nerode.shtml}}.\smallskip