330 @{text "A\<^sub>1 \<uplus> A\<^sub>2 \<equiv> {(1, x) | x \<in> A\<^sub>1} \<union> {(2, y) | y \<in> A\<^sub>2}"} |
330 @{text "A\<^sub>1 \<uplus> A\<^sub>2 \<equiv> {(1, x) | x \<in> A\<^sub>1} \<union> {(2, y) | y \<in> A\<^sub>2}"} |
331 \end{equation} |
331 \end{equation} |
332 |
332 |
333 \noindent |
333 \noindent |
334 changes the type---the disjoint union is not a set, but a set of |
334 changes the type---the disjoint union is not a set, but a set of |
335 pairs. Using this definition for disjoint union means we do not have a |
335 pairs. Using this definition for disjoint union means we do not have |
336 single type for the states of automata. As a result we will not be able to |
336 a single type for the states of automata. As a result we will not be |
337 define a regular language as one for which there exists |
337 able to define a regular language as one for which there exists an |
338 an automaton that recognises all its strings (Definition~\ref{baddef}). This |
338 automaton that recognises all its strings |
339 is because we cannot make a definition in HOL that is only polymorphic in |
339 (Definition~\ref{baddef}). This is because we cannot make a |
340 the state type, but not in the predicate for regularity; and there is no |
340 definition in HOL that is only polymorphic in the state type, but |
341 type quantification available in HOL (unlike in Coq, for |
341 not in the predicate for regularity; and there is no type |
342 example).\footnote{Slind already pointed out this problem in an email to the |
342 quantification available in HOL.\footnote{Slind already pointed out |
343 HOL4 mailing list on 21st April 2005.} |
343 this problem in an email to the HOL4 mailing list on 21st April |
344 %$^,$\footnote{While in Coq one can avoid |
344 2005.} Coq, for example, has quantification over types and thus can |
345 %this particular problem, all other difficulties we point out below still apply.} |
345 state such a definition. This has been recently exploited in a |
|
346 slick formalisation of the Myhill-Nerode theorem in Coq by |
|
347 \citeN{XXX}. |
346 |
348 |
347 An alternative, which provides us with a single type for states of automata, |
349 An alternative, which provides us with a single type for states of automata, |
348 is to give every state node an identity, for example a natural number, and |
350 is to give every state node an identity, for example a natural number, and |
349 then be careful to rename these identities apart whenever connecting two |
351 then be careful to rename these identities apart whenever connecting two |
350 automata. This results in clunky proofs establishing that properties are |
352 automata. This results in clunky proofs establishing that properties are |
2558 |
2560 |
2559 We presented two proofs for the second direction of the |
2561 We presented two proofs for the second direction of the |
2560 Myhill-Nerode Theorem. One direct proof using tagging-functions and |
2562 Myhill-Nerode Theorem. One direct proof using tagging-functions and |
2561 another using partial derivatives. This part of our work is where |
2563 another using partial derivatives. This part of our work is where |
2562 our method using regular expressions shines, because we can perform |
2564 our method using regular expressions shines, because we can perform |
2563 an induction on the structure of refular expressions. However, it is |
2565 an induction on the structure of regular expressions. However, it is |
2564 also the direction where we had to spend most of the `conceptual' |
2566 also the direction where we had to spend most of the `conceptual' |
2565 time, as our first proof based on tagging-functions is new for |
2567 time, as our first proof based on tagging-functions is new for |
2566 establishing the Myhill-Nerode Theorem. All standard proofs of this |
2568 establishing the Myhill-Nerode Theorem. All standard proofs of this |
2567 direction proceed by arguments over automata. |
2569 direction proceed by arguments over automata. |
2568 |
2570 |
2660 from all these comparisons is that if one is interested in formalising |
2662 from all these comparisons is that if one is interested in formalising |
2661 results from regular language theory, not results from automata theory, |
2663 results from regular language theory, not results from automata theory, |
2662 then regular expressions are easier to work with formally. |
2664 then regular expressions are easier to work with formally. |
2663 The code of |
2665 The code of |
2664 our formalisation \cite{myhillnerodeafp11} can be found in the Archive of Formal Proofs at |
2666 our formalisation \cite{myhillnerodeafp11} can be found in the Archive of Formal Proofs at |
2665 \mbox{\url{http://afp.sourceforge.net/entries/Myhill-Nerode.shtml}}.\smallskip |
2667 \mbox{\url{http://afp.sourceforge.net/entries/Myhill-Nerode.shtml}}. |
2666 |
2668 |
|
2669 Our future work will focus on formalising the regular expression matchers |
|
2670 developed by \citeN{Sulzmann12} which generate variable assignments for |
|
2671 regular expression submatching.\smallskip |
|
2672 |
2667 \noindent |
2673 \noindent |
2668 {\bf Acknowledgements:} |
2674 {\bf Acknowledgements:} |
2669 We are grateful for the comments we received from Larry Paulson. Tobias |
2675 We are grateful for the comments we received from Larry Paulson. Tobias |
2670 Nipkow made us aware of the properties in Theorem~\ref{subseqreg} and Tjark |
2676 Nipkow made us aware of the properties in Theorem~\ref{subseqreg} and Tjark |
2671 Weber helped us with proving them. Christian Sternagel provided us with a |
2677 Weber helped us with proving them. Christian Sternagel provided us with a |