494 \noindent |
494 \noindent |
495 {\bf Contributions:} There is an extensive literature on regular languages. |
495 {\bf Contributions:} There is an extensive literature on regular languages. |
496 To our best knowledge, our proof of the Myhill-Nerode Theorem is the first |
496 To our best knowledge, our proof of the Myhill-Nerode Theorem is the first |
497 that is based on regular expressions, only. The part of this theorem stating |
497 that is based on regular expressions, only. The part of this theorem stating |
498 that finitely many partitions imply regularity of the language is proved by |
498 that finitely many partitions imply regularity of the language is proved by |
499 an argument about solving equational systems. This argument appears to be |
499 an argument about solving equational systems. This argument requires a |
500 folklore. For the other part, we give two proofs: one direct proof using |
500 `reversed' version of Arden's Lemma. For the other part, we give two proofs: one direct proof using |
501 certain tagging-functions, and another indirect proof using Antimirov's |
501 certain tagging-functions, and another indirect proof using Antimirov's |
502 partial derivatives (\citeyear{Antimirov95}). Again to our best knowledge, the |
502 partial derivatives (\citeyear{Antimirov95}). Again to our best knowledge, the |
503 tagging-functions have not been used before for establishing the Myhill-Nerode |
503 tagging-functions have not been used before for establishing the Myhill-Nerode |
504 Theorem. Derivatives of regular expressions have been used recently quite |
504 Theorem. Derivatives of regular expressions have been used recently quite |
505 widely in the literature; partial derivatives, in contrast, attract much |
505 widely in the literature; partial derivatives, in contrast, attract much |
2500 regular---by establishing that it has infinitely many equivalence classes |
2500 regular---by establishing that it has infinitely many equivalence classes |
2501 generated by the Myhill-Nerode Relation (this is usually the purpose of the |
2501 generated by the Myhill-Nerode Relation (this is usually the purpose of the |
2502 Pumping Lemma). We can also use it to establish the standard |
2502 Pumping Lemma). We can also use it to establish the standard |
2503 textbook results about closure properties of regular languages. The case of |
2503 textbook results about closure properties of regular languages. The case of |
2504 closure under complement follows easily from the Myhill-Nerode Theorem. |
2504 closure under complement follows easily from the Myhill-Nerode Theorem. |
2505 So our answer to Gasarch is ``{\it yes we can''}! |
2505 So our answer to Gasarch is `{\it yes we can'}! |
2506 |
2506 |
2507 %Our insistence on regular expressions for proving the Myhill-Nerode Theorem |
2507 %Our insistence on regular expressions for proving the Myhill-Nerode Theorem |
2508 %arose from the problem of defining formally the regularity of a language. |
2508 %arose from the problem of defining formally the regularity of a language. |
2509 %In order to guarantee consistency, |
2509 %In order to guarantee consistency, |
2510 %formalisations in HOL can only extend the logic with definitions that introduce a new concept in |
2510 %formalisations in HOL can only extend the logic with definitions that introduce a new concept in |
2549 using automata, poses no problem as shown by \citeN{OwensReppyTuron09}. For an |
2549 using automata, poses no problem as shown by \citeN{OwensReppyTuron09}. For an |
2550 implementation of a simple regular expression matcher, whose correctness has |
2550 implementation of a simple regular expression matcher, whose correctness has |
2551 been formally established, we refer the reader to |
2551 been formally established, we refer the reader to |
2552 \citeN{OwensSlind08}. In our opinion, their formalisation is considerably |
2552 \citeN{OwensSlind08}. In our opinion, their formalisation is considerably |
2553 slicker than for example the approach to regular expression matching taken |
2553 slicker than for example the approach to regular expression matching taken |
2554 by \citeN{Harper99} and by \citeN{Yi06}. |
2554 by \citeN{Harper99} and by \citeN{Yi06}. |
2555 |
2555 |
|
2556 |
2556 Our proof of the first direction is very much inspired by \emph{Brzozowski's |
2557 Our proof of the first direction is very much inspired by \emph{Brzozowski's |
2557 algebraic method} \citeyear{Brzozowski64} used to convert a finite automaton to a regular expression. |
2558 algebraic method} \citeyear{Brzozowski64} used to convert a finite automaton to a regular expression. |
2558 The close connection can be seen by considering the |
2559 The close connection can be seen by considering the |
2559 equivalence classes as the states of the minimal automaton for the regular |
2560 equivalence classes as the states of the minimal automaton for the regular |
2560 language. However there are some subtle differences. Because our equivalence |
2561 language. However there are some subtle differences. Because our equivalence |
2564 characterised as the strings starting from that state leading to the |
2565 characterised as the strings starting from that state leading to the |
2565 terminal states. The first choice has consequences about how the initial |
2566 terminal states. The first choice has consequences about how the initial |
2566 equational system is set up. We have the $\lambda$-term on our `initial |
2567 equational system is set up. We have the $\lambda$-term on our `initial |
2567 state', while Brzozowski has it on the terminal states. This means we also |
2568 state', while Brzozowski has it on the terminal states. This means we also |
2568 need to reverse the direction of Arden's Lemma. We have not found anything |
2569 need to reverse the direction of Arden's Lemma. We have not found anything |
2569 in the `pencil-and-paper-reasoning' literature about our way of proving the |
2570 in the literature about our way of proving the |
2570 first direction of the Myhill-Nerode Theorem, but it appears to be folklore. |
2571 first direction of the Myhill-Nerode Theorem. |
2571 |
2572 |
2572 We presented two proofs for the second direction of the |
2573 We presented two proofs for the second direction of the |
2573 Myhill-Nerode Theorem. One direct proof using tagging-functions and |
2574 Myhill-Nerode Theorem. One direct proof using tagging-functions and |
2574 another using partial derivatives. This part of our work is where |
2575 another using partial derivatives. This part of our work is where |
2575 our method using regular expressions shines, because we can perform |
2576 our method using regular expressions shines, because we can perform |
2675 then regular expressions are easier to work with formally. |
2676 then regular expressions are easier to work with formally. |
2676 The code of |
2677 The code of |
2677 our formalisation \cite{myhillnerodeafp11} can be found in the Archive of Formal Proofs at |
2678 our formalisation \cite{myhillnerodeafp11} can be found in the Archive of Formal Proofs at |
2678 \mbox{\url{http://afp.sourceforge.net/entries/Myhill-Nerode.shtml}}. |
2679 \mbox{\url{http://afp.sourceforge.net/entries/Myhill-Nerode.shtml}}. |
2679 |
2680 |
|
2681 In this paper we leave out a discussion |
|
2682 about the computational content of our proofs. While the described algorithms |
|
2683 (for instance for solving our equational systems) are exectuable as is, we |
|
2684 have cut some `computational corners' in our formalisation. For example, we did |
|
2685 not specify which particular equation should be chosen in each step, rather |
|
2686 proved that any choice will do. Similarly with our $\bigplus$-operation, we |
|
2687 did not specify the order in which the regular expression should be composed, rather |
|
2688 proved that any composition with @{term PLUS} will do. |
|
2689 |
|
2690 |
2680 Our future work will focus on formalising the regular expression matchers |
2691 Our future work will focus on formalising the regular expression matchers |
2681 developed by \citeN{Sulzmann12} which generate variable assignments for |
2692 developed by \citeN{Sulzmann12} which generate variable assignments for |
2682 regular expression sub-matching. For this they use both, derivatives and |
2693 regular expression sub-matching. For this they use both, derivatives and |
2683 partial derivatives of regular expressions.\smallskip |
2694 partial derivatives of regular expressions.\smallskip |
2684 |
2695 |