Journal/Paper.thy
changeset 392 87d3306acca8
parent 388 0da31edd95b9
equal deleted inserted replaced
391:5c283ecefda6 392:87d3306acca8
   494   \noindent 
   494   \noindent 
   495   {\bf Contributions:} There is an extensive literature on regular languages.
   495   {\bf Contributions:} There is an extensive literature on regular languages.
   496   To our best knowledge, our proof of the Myhill-Nerode Theorem is the first
   496   To our best knowledge, our proof of the Myhill-Nerode Theorem is the first
   497   that is based on regular expressions, only. The part of this theorem stating
   497   that is based on regular expressions, only. The part of this theorem stating
   498   that finitely many partitions imply regularity of the language is proved by
   498   that finitely many partitions imply regularity of the language is proved by
   499   an argument about solving equational systems.  This argument appears to be
   499   an argument about solving equational systems.  This argument requires a
   500   folklore. For the other part, we give two proofs: one direct proof using
   500   `reversed' version of Arden's Lemma. For the other part, we give two proofs: one direct proof using
   501   certain tagging-functions, and another indirect proof using Antimirov's
   501   certain tagging-functions, and another indirect proof using Antimirov's
   502   partial derivatives (\citeyear{Antimirov95}). Again to our best knowledge, the
   502   partial derivatives (\citeyear{Antimirov95}). Again to our best knowledge, the
   503   tagging-functions have not been used before for establishing the Myhill-Nerode
   503   tagging-functions have not been used before for establishing the Myhill-Nerode
   504   Theorem. Derivatives of regular expressions have been used recently quite
   504   Theorem. Derivatives of regular expressions have been used recently quite
   505   widely in the literature; partial derivatives, in contrast, attract much
   505   widely in the literature; partial derivatives, in contrast, attract much
  2500   regular---by establishing that it has infinitely many equivalence classes
  2500   regular---by establishing that it has infinitely many equivalence classes
  2501   generated by the Myhill-Nerode Relation (this is usually the purpose of the
  2501   generated by the Myhill-Nerode Relation (this is usually the purpose of the
  2502   Pumping Lemma).  We can also use it to establish the standard
  2502   Pumping Lemma).  We can also use it to establish the standard
  2503   textbook results about closure properties of regular languages. The case of 
  2503   textbook results about closure properties of regular languages. The case of 
  2504   closure under complement follows easily from the Myhill-Nerode Theorem.
  2504   closure under complement follows easily from the Myhill-Nerode Theorem.
  2505   So our answer to Gasarch is ``{\it yes we can''}!  
  2505   So our answer to Gasarch is `{\it yes we can'}!  
  2506 
  2506 
  2507   %Our insistence on regular expressions for proving the Myhill-Nerode Theorem
  2507   %Our insistence on regular expressions for proving the Myhill-Nerode Theorem
  2508   %arose from the problem of defining formally the regularity of a language.
  2508   %arose from the problem of defining formally the regularity of a language.
  2509   %In order to guarantee consistency,
  2509   %In order to guarantee consistency,
  2510   %formalisations in HOL can only extend the logic with definitions that introduce a new concept in
  2510   %formalisations in HOL can only extend the logic with definitions that introduce a new concept in
  2549   using automata, poses no problem as shown by \citeN{OwensReppyTuron09}.  For an
  2549   using automata, poses no problem as shown by \citeN{OwensReppyTuron09}.  For an
  2550   implementation of a simple regular expression matcher, whose correctness has
  2550   implementation of a simple regular expression matcher, whose correctness has
  2551   been formally established, we refer the reader to 
  2551   been formally established, we refer the reader to 
  2552   \citeN{OwensSlind08}. In our opinion, their formalisation is considerably
  2552   \citeN{OwensSlind08}. In our opinion, their formalisation is considerably
  2553   slicker than for example the approach to regular expression matching taken
  2553   slicker than for example the approach to regular expression matching taken
  2554   by \citeN{Harper99} and by \citeN{Yi06}.
  2554   by \citeN{Harper99} and by \citeN{Yi06}. 
  2555 
  2555 
       
  2556  
  2556   Our proof of the first direction is very much inspired by \emph{Brzozowski's
  2557   Our proof of the first direction is very much inspired by \emph{Brzozowski's
  2557   algebraic method} \citeyear{Brzozowski64} used to convert a finite automaton to a regular expression. 
  2558   algebraic method} \citeyear{Brzozowski64} used to convert a finite automaton to a regular expression. 
  2558   The close connection can be seen by considering the
  2559   The close connection can be seen by considering the
  2559   equivalence classes as the states of the minimal automaton for the regular
  2560   equivalence classes as the states of the minimal automaton for the regular
  2560   language.  However there are some subtle differences. Because our equivalence 
  2561   language.  However there are some subtle differences. Because our equivalence 
  2564   characterised as the strings starting from that state leading to the
  2565   characterised as the strings starting from that state leading to the
  2565   terminal states.  The first choice has consequences about how the initial
  2566   terminal states.  The first choice has consequences about how the initial
  2566   equational system is set up. We have the $\lambda$-term on our `initial
  2567   equational system is set up. We have the $\lambda$-term on our `initial
  2567   state', while Brzozowski has it on the terminal states. This means we also
  2568   state', while Brzozowski has it on the terminal states. This means we also
  2568   need to reverse the direction of Arden's Lemma. We have not found anything
  2569   need to reverse the direction of Arden's Lemma. We have not found anything
  2569   in the `pencil-and-paper-reasoning' literature about our way of proving the 
  2570   in the literature about our way of proving the 
  2570   first direction of the Myhill-Nerode Theorem, but it appears to be folklore.
  2571   first direction of the Myhill-Nerode Theorem.
  2571 
  2572 
  2572   We presented two proofs for the second direction of the
  2573   We presented two proofs for the second direction of the
  2573   Myhill-Nerode Theorem. One direct proof using tagging-functions and
  2574   Myhill-Nerode Theorem. One direct proof using tagging-functions and
  2574   another using partial derivatives. This part of our work is where
  2575   another using partial derivatives. This part of our work is where
  2575   our method using regular expressions shines, because we can perform
  2576   our method using regular expressions shines, because we can perform
  2675   then regular expressions are easier to work with formally. 
  2676   then regular expressions are easier to work with formally. 
  2676   The code of
  2677   The code of
  2677   our formalisation \cite{myhillnerodeafp11} can be found in the Archive of Formal Proofs at 
  2678   our formalisation \cite{myhillnerodeafp11} can be found in the Archive of Formal Proofs at 
  2678   \mbox{\url{http://afp.sourceforge.net/entries/Myhill-Nerode.shtml}}.
  2679   \mbox{\url{http://afp.sourceforge.net/entries/Myhill-Nerode.shtml}}.
  2679   
  2680   
       
  2681    In this paper we leave out a discussion
       
  2682   about the computational content of our proofs. While the described algorithms
       
  2683   (for instance for solving our equational systems) are exectuable as is, we 
       
  2684   have cut some `computational corners' in our formalisation. For example, we did 
       
  2685   not specify which particular equation should be chosen in each step, rather 
       
  2686   proved that any choice will do. Similarly with our $\bigplus$-operation, we 
       
  2687   did not specify the order in which the regular expression should be composed, rather
       
  2688   proved that any composition with @{term PLUS} will do.
       
  2689  
       
  2690 
  2680   Our future work will focus on formalising the regular expression matchers
  2691   Our future work will focus on formalising the regular expression matchers
  2681   developed by \citeN{Sulzmann12} which generate variable assignments for
  2692   developed by \citeN{Sulzmann12} which generate variable assignments for
  2682   regular expression sub-matching. For this they use both, derivatives and
  2693   regular expression sub-matching. For this they use both, derivatives and
  2683   partial derivatives of regular expressions.\smallskip
  2694   partial derivatives of regular expressions.\smallskip
  2684 
  2695