216 It seems natural to exercise theorem provers by |
228 It seems natural to exercise theorem provers by |
217 formalising the theorems and by verifying formally the algorithms. |
229 formalising the theorems and by verifying formally the algorithms. |
218 |
230 |
219 A popular choice for a theorem prover would be one based on Higher-Order |
231 A popular choice for a theorem prover would be one based on Higher-Order |
220 Logic (HOL), for example HOL4, HOLlight or Isabelle/HOL. For the development |
232 Logic (HOL), for example HOL4, HOLlight or Isabelle/HOL. For the development |
221 presented in this paper we will use the Isabelle/HOL. HOL is a predicate calculus |
233 presented in this paper we will use the Isabelle/HOL system. HOL is a predicate calculus |
222 that allows quantification over predicate variables. Its type system is |
234 that allows quantification over predicate variables. Its type system is |
223 based on the Simple Theory of Types by \citeN{Church40}. Although many |
235 based on the Simple Theory of Types by \citeN{Church40}. Although many |
224 mathematical concepts can be conveniently expressed in HOL, there are some |
236 mathematical concepts can be conveniently expressed in HOL, there are some |
225 limitations that hurt when attempting a simple-minded formalisation |
237 limitations that hurt when attempting a simple-minded formalisation |
226 of regular languages in it. |
238 of regular languages in it. |
229 regular languages, taken for example by \citeN{HopcroftUllman69} |
241 regular languages, taken for example by \citeN{HopcroftUllman69} |
230 and by \citeN{Kozen97}, is to introduce finite deterministic automata and then |
242 and by \citeN{Kozen97}, is to introduce finite deterministic automata and then |
231 define most notions in terms of them. For example, a regular language is |
243 define most notions in terms of them. For example, a regular language is |
232 normally defined as: |
244 normally defined as: |
233 |
245 |
234 \begin{dfntn}\label{baddef} |
246 \begin{definition}\label{baddef} |
235 A language @{text A} is \emph{regular}, provided there is a |
247 A language @{text A} is \emph{regular}, provided there is a |
236 finite deterministic automaton that recognises all strings of @{text "A"}. |
248 finite deterministic automaton that recognises all strings of @{text "A"}. |
237 \end{dfntn} |
249 \end{definition} |
238 |
250 |
239 \noindent |
251 \noindent |
240 This approach has many benefits. Among them is the fact that it is easy to |
252 This approach has many benefits. Among them is the fact that it is easy to |
241 convince oneself that regular languages are closed under complementation: |
253 convince oneself that regular languages are closed under complementation: |
242 one just has to exchange the accepting and non-accepting states in the |
254 one just has to exchange the accepting and non-accepting states in the |
386 Isabelle/HOL. There they use matrices for representing automata. |
398 Isabelle/HOL. There they use matrices for representing automata. |
387 Functions are used by \citeN{Nipkow98}, who establishes |
399 Functions are used by \citeN{Nipkow98}, who establishes |
388 the link between regular expressions and automata in the context of |
400 the link between regular expressions and automata in the context of |
389 lexing. \citeN{BerghoferReiter09} use functions as well for formalising automata |
401 lexing. \citeN{BerghoferReiter09} use functions as well for formalising automata |
390 working over bit strings in the context of Presburger arithmetic. |
402 working over bit strings in the context of Presburger arithmetic. |
391 A Larger formalisation of automata theory, including the Myhill-Nerode theorem, |
403 A larger formalisation of automata theory, including the Myhill-Nerode theorem, |
392 was carried out in Nuprl by \citeN{Constable00} which also uses functions. |
404 was carried out in Nuprl by \citeN{Constable00} which also uses functions. |
393 Other large formailsations of automata theory in Coq are by |
405 Other large formalisations of automata theory in Coq are by |
394 \citeN{Filliatre97} who essentially uses graphs and by \citeN{Almeidaetal10} |
406 \citeN{Filliatre97} who essentially uses graphs and by \citeN{Almeidaetal10} |
395 and \citeN{Braibant12}, who both use matrices. Many of these works, |
407 and \citeN{Braibant12}, who both use matrices. Many of these works, |
396 like \citeN{Nipkow98} or \citeN{Braibant12}, mention intrinsic problems with |
408 like \citeN{Nipkow98} or \citeN{Braibant12}, mention intrinsic problems with |
397 their representation of |
409 their representation of |
398 automata which made them `fight' their respective theorem prover. |
410 automata which made them `fight' their respective theorem prover. |
421 literature, but take a different approach to regular languages than is |
433 literature, but take a different approach to regular languages than is |
422 usually taken. Instead of defining a regular language as one where there |
434 usually taken. Instead of defining a regular language as one where there |
423 exists an automaton that recognises all its strings, we define a |
435 exists an automaton that recognises all its strings, we define a |
424 regular language as: |
436 regular language as: |
425 |
437 |
426 \begin{dfntn}\label{regular} |
438 \begin{definition}\label{regular} |
427 A language @{text A} is \emph{regular}, provided there is a regular expression |
439 A language @{text A} is \emph{regular}, provided there is a regular expression |
428 that matches all strings of @{text "A"}. |
440 that matches all strings of @{text "A"}. |
429 \end{dfntn} |
441 \end{definition} |
430 |
442 |
431 \noindent |
443 \noindent |
432 And then `forget' automata completely. |
444 We then want to `forget' automata completely and see how far we come |
433 The reason is that regular expressions %, unlike graphs, matrices and |
445 with formalising results from regular language theory. The reason |
434 %functions, |
446 is that regular expressions, unlike graphs, matrices and |
435 can be defined as an inductive datatype and a reasoning |
447 functions, can be defined as an inductive datatype and a reasoning |
436 infrastructure for them (like induction and recursion) comes for free in |
448 infrastructure for them (like induction and recursion) comes for |
437 HOL. %Moreover, no side-conditions will be needed for regular expressions, |
449 free in HOL. But this question whether formal language theory can be |
|
450 done without automata crops up also in non-theorem prover |
|
451 contexts. For example, Gasarch asked in the Computational Complexity |
|
452 blog by \citeN{GasarchBlog} whether the complementation of |
|
453 regular-expression languages can be proved without using |
|
454 automata. He concluded |
|
455 |
|
456 \begin{quote} |
|
457 ``{\it \ldots it can't be done}'' |
|
458 \end{quote} |
|
459 |
|
460 \noindent |
|
461 and even pondered |
|
462 |
|
463 \begin{quote} |
|
464 ``{\it \ldots [b]ut is there a rigorous way to even state that?}'' |
|
465 \end{quote} |
|
466 |
|
467 %Moreover, no side-conditions will be needed for regular expressions, |
438 %like we need for automata. |
468 %like we need for automata. |
439 This convenience of regular expressions has |
469 \noindent |
|
470 We will give an answer to these questions in this paper. |
|
471 |
|
472 The convenience of regular expressions has |
440 recently been exploited in HOL4 with a formalisation of regular expression |
473 recently been exploited in HOL4 with a formalisation of regular expression |
441 matching based on derivatives by \citeN{OwensSlind08}, and with an equivalence |
474 matching based on derivatives by \citeN{OwensSlind08}, and with an equivalence |
442 checker for regular expressions in Isabelle/HOL by \citeN{KraussNipkow11} |
475 checker for regular expressions in Isabelle/HOL by \citeN{KraussNipkow11} |
443 and in Matida by \citeN{Asperti12} and in Coq by \citeN{CoquandSiles12}. The |
476 and in Matita by \citeN{Asperti12} and in Coq by \citeN{CoquandSiles12}. The |
444 main purpose of this paper is to show that a central result about regular |
477 main purpose of this paper is to show that a central result about regular |
445 languages---the Myhill-Nerode Theorem---can be recreated by only using |
478 languages---the Myhill-Nerode Theorem---can be recreated by only using |
446 regular expressions. This theorem gives necessary and sufficient conditions |
479 regular expressions. This theorem gives necessary and sufficient conditions |
447 for when a language is regular. As a corollary of this theorem we can easily |
480 for when a language is regular. As a corollary of this theorem we can easily |
448 establish the usual closure properties, including complementation, for |
481 establish the usual closure properties, including complementation, for |
456 that is based on regular expressions, only. The part of this theorem stating |
489 that is based on regular expressions, only. The part of this theorem stating |
457 that finitely many partitions imply regularity of the language is proved by |
490 that finitely many partitions imply regularity of the language is proved by |
458 an argument about solving equational systems. This argument appears to be |
491 an argument about solving equational systems. This argument appears to be |
459 folklore. For the other part, we give two proofs: one direct proof using |
492 folklore. For the other part, we give two proofs: one direct proof using |
460 certain tagging-functions, and another indirect proof using Antimirov's |
493 certain tagging-functions, and another indirect proof using Antimirov's |
461 partial derivatives \citeyear{Antimirov95}. Again to our best knowledge, the |
494 partial derivatives (\citeyear{Antimirov95}). Again to our best knowledge, the |
462 tagging-functions have not been used before for establishing the Myhill-Nerode |
495 tagging-functions have not been used before for establishing the Myhill-Nerode |
463 Theorem. Derivatives of regular expressions have been used recently quite |
496 Theorem. Derivatives of regular expressions have been used recently quite |
464 widely in the literature; partial derivatives, in contrast, attract much |
497 widely in the literature; partial derivatives, in contrast, attract much |
465 less attention. However, partial derivatives are more suitable in the |
498 less attention. However, partial derivatives are more suitable in the |
466 context of the Myhill-Nerode Theorem, since it is easier to establish |
499 context of the Myhill-Nerode Theorem, since it is easier to |
467 formally their finiteness result. We are not aware of any proof that uses |
500 formally establish their finiteness result. We are not aware of any proof that uses |
468 either of them for proving the Myhill-Nerode Theorem. |
501 either of them for proving the Myhill-Nerode Theorem. |
469 *} |
502 *} |
470 |
503 |
471 section {* Preliminaries *} |
504 section {* Preliminaries *} |
472 |
505 |
497 where @{text "@"} is the list-append operation. The Kleene-star of a language @{text A} |
530 where @{text "@"} is the list-append operation. The Kleene-star of a language @{text A} |
498 is defined as the union over all powers, namely @{thm star_def[where A="A", THEN eq_reflection]}. |
531 is defined as the union over all powers, namely @{thm star_def[where A="A", THEN eq_reflection]}. |
499 In the paper |
532 In the paper |
500 we will make use of the following properties of these constructions. |
533 we will make use of the following properties of these constructions. |
501 |
534 |
502 \begin{prpstn}\label{langprops}\mbox{}\\ |
535 \begin{proposition}\label{langprops}\mbox{}\\ |
503 \begin{tabular}{@ {}lp{10cm}} |
536 \begin{tabular}{@ {}lp{10cm}} |
504 (i) & @{thm star_unfold_left} \\ |
537 (i) & @{thm star_unfold_left} \\ |
505 (ii) & @{thm[mode=IfThen] pow_length}\\ |
538 (ii) & @{thm[mode=IfThen] pow_length}\\ |
506 (iii) & @{thm conc_Union_left} \\ |
539 (iii) & @{thm conc_Union_left} \\ |
507 (iv) & If @{thm (prem 1) star_decom} and @{thm (prem 2) star_decom} then |
540 (iv) & If @{thm (prem 1) star_decom} and @{thm (prem 2) star_decom} then |
508 there exists an @{text "x\<^isub>p"} and @{text "x\<^isub>s"} with @{text "x = x\<^isub>p @ x\<^isub>s"} |
541 there exists an @{text "x\<^isub>p"} and @{text "x\<^isub>s"} with @{text "x = x\<^isub>p @ x\<^isub>s"} |
509 and @{term "x\<^isub>p \<noteq> []"} such that @{term "x\<^isub>p \<in> A"} and @{term "x\<^isub>s \<in> A\<star>"}. |
542 and \mbox{@{term "x\<^isub>p \<noteq> []"}} such that @{term "x\<^isub>p \<in> A"} and @{term "x\<^isub>s \<in> A\<star>"}. |
510 \end{tabular} |
543 \end{tabular} |
511 \end{prpstn} |
544 \end{proposition} |
512 |
545 |
513 \noindent |
546 \noindent |
514 In @{text "(ii)"} we use the notation @{term "length s"} for the length of a |
547 In @{text "(ii)"} we use the notation @{term "length s"} for the length of a |
515 string; this property states that if \mbox{@{term "[] \<notin> A"}} then the lengths of |
548 string; this property states that if \mbox{@{term "[] \<notin> A"}} then the lengths of |
516 the strings in @{term "A \<up> (Suc n)"} must be longer than @{text n}. |
549 the strings in @{term "A \<up> (Suc n)"} must be longer than @{text n}. |
532 involving equivalence classes of languages. For this we will use Arden's Lemma |
565 involving equivalence classes of languages. For this we will use Arden's Lemma |
533 (see for example \cite[Page 100]{Sakarovitch09}), |
566 (see for example \cite[Page 100]{Sakarovitch09}), |
534 which solves equations of the form @{term "X = A \<cdot> X \<union> B"} provided |
567 which solves equations of the form @{term "X = A \<cdot> X \<union> B"} provided |
535 @{term "[] \<notin> A"}. However we will need the following `reversed' |
568 @{term "[] \<notin> A"}. However we will need the following `reversed' |
536 version of Arden's Lemma (`reversed' in the sense of changing the order of @{term "A \<cdot> X"} to |
569 version of Arden's Lemma (`reversed' in the sense of changing the order of @{term "A \<cdot> X"} to |
537 \mbox{@{term "X \<cdot> A"}}).\footnote{The details of the proof for the reversed Arden's Lemma |
570 \mbox{@{term "X \<cdot> A"}}). |
538 are given in the Appendix.} |
571 %\footnote{The details of the proof for the reversed Arden's Lemma |
539 |
572 %are given in the Appendix.} |
540 \begin{lmm}[Reversed Arden's Lemma]\label{arden}\mbox{}\\ |
573 |
|
574 \begin{lemma}[Reversed Arden's Lemma]\label{arden}\mbox{}\\ |
541 If @{thm (prem 1) reversed_Arden} then |
575 If @{thm (prem 1) reversed_Arden} then |
542 @{thm (lhs) reversed_Arden} if and only if |
576 @{thm (lhs) reversed_Arden} if and only if |
543 @{thm (rhs) reversed_Arden}. |
577 @{thm (rhs) reversed_Arden}. |
544 \end{lmm} |
578 \end{lemma} |
545 |
579 |
546 \noindent |
580 \noindent |
|
581 The proof of this reversed version of Arden's lemma follows the proof of the |
|
582 original version. |
547 Regular expressions are defined as the inductive datatype |
583 Regular expressions are defined as the inductive datatype |
548 |
584 |
549 \begin{center} |
585 \begin{center} |
550 \begin{tabular}{rcl} |
586 \begin{tabular}{rcl} |
551 @{text r} & @{text "::="} & @{term ZERO}\\ |
587 @{text r} & @{text "::="} & @{term ZERO}\\ |
574 \end{tabular} |
610 \end{tabular} |
575 \end{center} |
611 \end{center} |
576 |
612 |
577 Given a finite set of regular expressions @{text rs}, we will make use of the operation of generating |
613 Given a finite set of regular expressions @{text rs}, we will make use of the operation of generating |
578 a regular expression that matches the union of all languages of @{text rs}. |
614 a regular expression that matches the union of all languages of @{text rs}. |
579 This definion is not trivial in a theorem prover, because @{text rs} (being a set) is unordered, |
615 This definition is not trivial in a theorem prover, because @{text rs} (being a set) is unordered, |
580 but the regular expression needs an order. Since |
616 but the regular expression needs an order. Since |
581 we only need to know the |
617 we only need to know the |
582 existence |
618 existence |
583 of such a regular expression, we can use Isabelle/HOL's @{const "fold_graph"} and Hilbert's |
619 of such a regular expression, we can use Isabelle/HOL's @{const "fold_graph"} and Hilbert's |
584 choice operator, written @{text "SOME"} in Isabelle/HOL, for defining @{term "\<Uplus>rs"}. |
620 choice operator, written @{text "SOME"} in Isabelle/HOL, for defining @{term "\<Uplus>rs"}. |
605 section {* The Myhill-Nerode Theorem, First Part *} |
641 section {* The Myhill-Nerode Theorem, First Part *} |
606 |
642 |
607 text {* |
643 text {* |
608 \noindent |
644 \noindent |
609 The key definition in the Myhill-Nerode Theorem is the |
645 The key definition in the Myhill-Nerode Theorem is the |
610 \emph{Myhill-Nerode Relation}, which states that w.r.t.~a language two |
646 \emph{Myhill-Nerode Relation}, which states that two |
611 strings are related, provided there is no distinguishing extension in this |
647 strings are related w.r.t.~a language, provided there is no distinguishing extension in this |
612 language. This can be defined as a ternary relation. |
648 language. This can be defined as a ternary relation. |
613 |
649 |
614 \begin{dfntn}[Myhill-Nerode Relation]\label{myhillneroderel} |
650 \begin{definition}[Myhill-Nerode Relation]\label{myhillneroderel} |
615 Given a language @{text A}, two strings @{text x} and |
651 Given a language @{text A}, two strings @{text x} and |
616 @{text y} are Myhill-Nerode related provided |
652 @{text y} are Myhill-Nerode related provided |
617 \begin{center} |
653 \begin{center} |
618 @{thm str_eq_def'} |
654 @{thm str_eq_def'} |
619 \end{center} |
655 \end{center} |
620 \end{dfntn} |
656 \end{definition} |
621 |
657 |
622 \noindent |
658 \noindent |
623 It is easy to see that @{term "\<approx>A"} is an equivalence relation, which |
659 It is easy to see that @{term "\<approx>A"} is an equivalence relation, which |
624 partitions the set of all strings, @{text "UNIV"}, into a set of disjoint |
660 partitions the set of all strings, @{text "UNIV"}, into a set of disjoint |
625 equivalence classes. To illustrate this quotient construction, let us give a simple |
661 equivalence classes. To illustrate this quotient construction, let us give a simple |
638 |
674 |
639 One direction of the Myhill-Nerode Theorem establishes |
675 One direction of the Myhill-Nerode Theorem establishes |
640 that if there are finitely many equivalence classes, like in the example above, then |
676 that if there are finitely many equivalence classes, like in the example above, then |
641 the language is regular. In our setting we therefore have to show: |
677 the language is regular. In our setting we therefore have to show: |
642 |
678 |
643 \begin{thrm}\label{myhillnerodeone} |
679 \begin{theorem}\label{myhillnerodeone} |
644 @{thm[mode=IfThen] Myhill_Nerode1} |
680 @{thm[mode=IfThen] Myhill_Nerode1} |
645 \end{thrm} |
681 \end{theorem} |
646 |
682 |
647 \noindent |
683 \noindent |
648 To prove this theorem, we first define the set @{term "finals A"} as those equivalence |
684 To prove this theorem, we first define the set @{term "finals A"} as those equivalence |
649 classes from @{term "UNIV // \<approx>A"} that contain strings of @{text A}, namely |
685 classes from \mbox{@{term "UNIV // \<approx>A"}} that contain strings of @{text A}, namely |
650 % |
686 % |
651 \begin{equation} |
687 \begin{equation} |
652 @{thm finals_def} |
688 @{thm finals_def} |
653 \end{equation} |
689 \end{equation} |
654 |
690 |
681 \end{equation} |
717 \end{equation} |
682 |
718 |
683 \noindent |
719 \noindent |
684 which means that if we append the character @{text c} to the end of all |
720 which means that if we append the character @{text c} to the end of all |
685 strings in the equivalence class @{text Y}, we obtain a subset of |
721 strings in the equivalence class @{text Y}, we obtain a subset of |
686 @{text X}. Note that we do not define an automaton here, we merely relate two sets |
722 @{text X}. |
687 (with the help of a character). In our concrete example we have |
723 %Note that we do not define formally an automaton here, |
|
724 %we merely relate two sets |
|
725 %(with the help of a character). |
|
726 In our concrete example we have |
688 @{term "X\<^isub>1 \<Turnstile>c\<Rightarrow> X\<^isub>2"}, @{term "X\<^isub>1 \<Turnstile>d\<^isub>i\<Rightarrow> X\<^isub>3"} with @{text "d\<^isub>i"} being any |
727 @{term "X\<^isub>1 \<Turnstile>c\<Rightarrow> X\<^isub>2"}, @{term "X\<^isub>1 \<Turnstile>d\<^isub>i\<Rightarrow> X\<^isub>3"} with @{text "d\<^isub>i"} being any |
689 other character than @{text c}, and @{term "X\<^isub>3 \<Turnstile>c\<^isub>j\<Rightarrow> X\<^isub>3"} for any |
728 other character than @{text c}, and @{term "X\<^isub>3 \<Turnstile>c\<^isub>j\<Rightarrow> X\<^isub>3"} for any |
690 character @{text "c\<^isub>j"}. |
729 character @{text "c\<^isub>j"}. |
691 |
730 |
692 Next we construct an \emph{initial equational system} that |
731 Next we construct an \emph{initial equational system} that |
704 @{text "X\<^isub>n"} & @{text "="} & @{text "(Y\<^isub>n\<^isub>1, ATOM c\<^isub>n\<^isub>1) + \<dots> + (Y\<^isub>n\<^isub>q, ATOM c\<^isub>n\<^isub>q)"}\\ |
743 @{text "X\<^isub>n"} & @{text "="} & @{text "(Y\<^isub>n\<^isub>1, ATOM c\<^isub>n\<^isub>1) + \<dots> + (Y\<^isub>n\<^isub>q, ATOM c\<^isub>n\<^isub>q)"}\\ |
705 \end{tabular} |
744 \end{tabular} |
706 \end{center} |
745 \end{center} |
707 |
746 |
708 \noindent |
747 \noindent |
709 where the terms @{text "(Y\<^isub>i\<^isub>j, ATOM c\<^isub>i\<^isub>j)"} are pairs consiting of an equivalence class and |
748 where the terms @{text "(Y\<^isub>i\<^isub>j, ATOM c\<^isub>i\<^isub>j)"} are pairs consisting of an equivalence class and |
710 a regular expression. In the initial equational system, they |
749 a regular expression. In the initial equational system, they |
711 stand for all transitions @{term "Y\<^isub>i\<^isub>j \<Turnstile>c\<^isub>i\<^isub>j\<Rightarrow> |
750 stand for all transitions @{term "Y\<^isub>i\<^isub>j \<Turnstile>c\<^isub>i\<^isub>j\<Rightarrow> |
712 X\<^isub>i"}. |
751 X\<^isub>i"}. |
713 %The intuition behind the equational system is that every |
752 %The intuition behind the equational system is that every |
714 %equation @{text "X\<^isub>i = rhs\<^isub>i"} in this system |
753 %equation @{text "X\<^isub>i = rhs\<^isub>i"} in this system |
794 \noindent |
833 \noindent |
795 Because we use sets of terms |
834 Because we use sets of terms |
796 for representing the right-hand sides of equations, we can |
835 for representing the right-hand sides of equations, we can |
797 prove \eqref{inv1} and \eqref{inv2} more concisely as |
836 prove \eqref{inv1} and \eqref{inv2} more concisely as |
798 % |
837 % |
799 \begin{lmm}\label{inv} |
838 \begin{lemma}\label{inv} |
800 If @{thm (prem 1) test} then @{text "X = \<Union> \<calL> ` rhs"}. |
839 If @{thm (prem 1) test} then @{text "X = \<Union> \<calL> ` rhs"}. |
801 \end{lmm} |
840 \end{lemma} |
802 |
841 |
803 \noindent |
842 \noindent |
804 Our proof of Theorem~\ref{myhillnerodeone} will proceed by transforming the |
843 Our proof of Theorem~\ref{myhillnerodeone} will proceed by transforming the |
805 initial equational system into one in \emph{solved form} maintaining the invariant |
844 initial equational system into one in \emph{solved form} maintaining the invariant |
806 in Lemma~\ref{inv}. From the solved form we will be able to read |
845 in Lemma~\ref{inv}. From the solved form we will be able to read |
874 \end{center} |
913 \end{center} |
875 |
914 |
876 \noindent |
915 \noindent |
877 This allows us to prove a version of Arden's Lemma on the level of equations. |
916 This allows us to prove a version of Arden's Lemma on the level of equations. |
878 |
917 |
879 \begin{lmm}\label{ardenable} |
918 \begin{lemma}\label{ardenable} |
880 Given an equation @{text "X = rhs"}. |
919 Given an equation @{text "X = rhs"}. |
881 If @{text "X = \<Union>\<calL> ` rhs"}, |
920 If @{text "X = \<Union>\<calL> ` rhs"}, |
882 @{thm (prem 2) Arden_preserves_soundness}, and |
921 @{thm (prem 2) Arden_preserves_soundness}, and |
883 @{thm (prem 3) Arden_preserves_soundness}, then |
922 @{thm (prem 3) Arden_preserves_soundness}, then |
884 @{text "X = \<Union>\<calL> ` (Arden X rhs)"}. |
923 @{text "X = \<Union>\<calL> ` (Arden X rhs)"}. |
885 \end{lmm} |
924 \end{lemma} |
886 |
925 |
887 \noindent |
926 \noindent |
888 Our @{text ardenable} condition is slightly stronger than needed for applying Arden's Lemma, |
927 Our @{text ardenable} condition is slightly stronger than needed for applying Arden's Lemma, |
889 but we can still ensure that it holds throughout our algorithm of transforming equations |
928 but we can still ensure that it holds throughout our algorithm of transforming equations |
890 into solved form. |
929 into solved form. |
1040 \end{proof} |
1079 \end{proof} |
1041 |
1080 |
1042 \noindent |
1081 \noindent |
1043 Next we show that @{text Iter} preserves the invariant. |
1082 Next we show that @{text Iter} preserves the invariant. |
1044 |
1083 |
1045 \begin{lmm}\label{iterone} If |
1084 \begin{lemma}\label{iterone} If |
1046 @{thm (prem 1) iteration_step_invariant[where xrhs="rhs"]}, |
1085 @{thm (prem 1) iteration_step_invariant[where xrhs="rhs"]}, |
1047 @{thm (prem 2) iteration_step_invariant[where xrhs="rhs"]} and |
1086 @{thm (prem 2) iteration_step_invariant[where xrhs="rhs"]} and |
1048 @{thm (prem 3) iteration_step_invariant[where xrhs="rhs"]}, then |
1087 @{thm (prem 3) iteration_step_invariant[where xrhs="rhs"]}, then |
1049 @{thm (concl) iteration_step_invariant[where xrhs="rhs"]}. |
1088 @{thm (concl) iteration_step_invariant[where xrhs="rhs"]}. |
1050 \end{lmm} |
1089 \end{lemma} |
1051 |
1090 |
1052 \begin{proof} |
1091 \begin{proof} |
1053 The argument boils down to choosing an equation @{text "Y = yrhs"} to be eliminated |
1092 The argument boils down to choosing an equation @{text "Y = yrhs"} to be eliminated |
1054 and to show that @{term "Subst_all (ES - {(Y, yrhs)}) Y (Arden Y yrhs)"} |
1093 and to show that @{term "Subst_all (ES - {(Y, yrhs)}) Y (Arden Y yrhs)"} |
1055 preserves the invariant. |
1094 preserves the invariant. |
1077 \end{proof} |
1116 \end{proof} |
1078 |
1117 |
1079 \noindent |
1118 \noindent |
1080 We also need the fact that @{text Iter} decreases the termination measure. |
1119 We also need the fact that @{text Iter} decreases the termination measure. |
1081 |
1120 |
1082 \begin{lmm}\label{itertwo} If |
1121 \begin{lemma}\label{itertwo} If |
1083 @{thm (prem 1) iteration_step_measure[simplified (no_asm), where xrhs="rhs"]}, |
1122 @{thm (prem 1) iteration_step_measure[simplified (no_asm), where xrhs="rhs"]}, |
1084 @{thm (prem 2) iteration_step_measure[simplified (no_asm), where xrhs="rhs"]} and |
1123 @{thm (prem 2) iteration_step_measure[simplified (no_asm), where xrhs="rhs"]} and |
1085 @{thm (prem 3) iteration_step_measure[simplified (no_asm), where xrhs="rhs"]}, then\\ |
1124 @{thm (prem 3) iteration_step_measure[simplified (no_asm), where xrhs="rhs"]}, then\\ |
1086 \mbox{@{thm (concl) iteration_step_measure[simplified (no_asm), where xrhs="rhs"]}}. |
1125 \mbox{@{thm (concl) iteration_step_measure[simplified (no_asm), where xrhs="rhs"]}}. |
1087 \end{lmm} |
1126 \end{lemma} |
1088 |
1127 |
1089 \begin{proof} |
1128 \begin{proof} |
1090 By assumption we know that @{text "ES"} is finite and has more than one element. |
1129 By assumption we know that @{text "ES"} is finite and has more than one element. |
1091 Therefore there must be an element @{term "(Y, yrhs) \<in> ES"} with |
1130 Therefore there must be an element @{term "(Y, yrhs) \<in> ES"} with |
1092 @{term "(Y, yrhs) \<noteq> (X, rhs)"}. Using the distinctness property we can infer |
1131 @{term "(Y, yrhs) \<noteq> (X, rhs)"}. Using the distinctness property we can infer |
1097 |
1136 |
1098 \noindent |
1137 \noindent |
1099 This brings us to our property we want to establish for @{text Solve}. |
1138 This brings us to our property we want to establish for @{text Solve}. |
1100 |
1139 |
1101 |
1140 |
1102 \begin{lmm} |
1141 \begin{lemma} |
1103 If @{thm (prem 1) Solve} and @{thm (prem 2) Solve} then there exists |
1142 If @{thm (prem 1) Solve} and @{thm (prem 2) Solve} then there exists |
1104 a @{text rhs} such that @{term "Solve X (Init (UNIV // \<approx>A)) = {(X, rhs)}"} |
1143 a @{text rhs} such that @{term "Solve X (Init (UNIV // \<approx>A)) = {(X, rhs)}"} |
1105 and @{term "invariant {(X, rhs)}"}. |
1144 and @{term "invariant {(X, rhs)}"}. |
1106 \end{lmm} |
1145 \end{lemma} |
1107 |
1146 |
1108 \begin{proof} |
1147 \begin{proof} |
1109 In order to prove this lemma using \eqref{whileprinciple}, we have to use a slightly |
1148 In order to prove this lemma using \eqref{whileprinciple}, we have to use a slightly |
1110 stronger invariant since Lemma~\ref{iterone} and \ref{itertwo} have the precondition |
1149 stronger invariant since Lemma~\ref{iterone} and \ref{itertwo} have the precondition |
1111 that @{term "(X, rhs) \<in> ES"} for some @{text rhs}. This precondition is needed |
1150 that @{term "(X, rhs) \<in> ES"} for some @{text rhs}. This precondition is needed |
1117 Premise 2 is given by Lemma~\ref{iterone} and the fact that @{const Iter} might |
1156 Premise 2 is given by Lemma~\ref{iterone} and the fact that @{const Iter} might |
1118 modify the @{text rhs} in the equation @{term "X = rhs"}, but does not remove it. |
1157 modify the @{text rhs} in the equation @{term "X = rhs"}, but does not remove it. |
1119 Premise 3 of~\eqref{whileprinciple} is by Lemma~\ref{itertwo}. Now in premise 4 |
1158 Premise 3 of~\eqref{whileprinciple} is by Lemma~\ref{itertwo}. Now in premise 4 |
1120 we like to show that there exists a @{text rhs} such that @{term "ES = {(X, rhs)}"} |
1159 we like to show that there exists a @{text rhs} such that @{term "ES = {(X, rhs)}"} |
1121 and that @{text "invariant {(X, rhs)}"} holds, provided the condition @{text "Cond"} |
1160 and that @{text "invariant {(X, rhs)}"} holds, provided the condition @{text "Cond"} |
1122 does not holds. By the stronger invariant we know there exists such a @{text "rhs"} |
1161 does not hold. By the stronger invariant we know there exists such a @{text "rhs"} |
1123 with @{term "(X, rhs) \<in> ES"}. Because @{text Cond} is not true, we know the cardinality |
1162 with @{term "(X, rhs) \<in> ES"}. Because @{text Cond} is not true, we know the cardinality |
1124 of @{text ES} is @{text 1}. This means @{text "ES"} must actually be the set @{text "{(X, rhs)}"}, |
1163 of @{text ES} is @{text 1}. This means @{text "ES"} must actually be the set @{text "{(X, rhs)}"}, |
1125 for which the invariant holds. This allows us to conclude that |
1164 for which the invariant holds. This allows us to conclude that |
1126 @{term "Solve X (Init (UNIV // \<approx>A)) = {(X, rhs)}"} and @{term "invariant {(X, rhs)}"} hold, |
1165 @{term "Solve X (Init (UNIV // \<approx>A)) = {(X, rhs)}"} and @{term "invariant {(X, rhs)}"} hold, |
1127 as needed. |
1166 as needed. |
1129 |
1168 |
1130 \noindent |
1169 \noindent |
1131 With this lemma in place we can show that for every equivalence class in @{term "UNIV // \<approx>A"} |
1170 With this lemma in place we can show that for every equivalence class in @{term "UNIV // \<approx>A"} |
1132 there exists a regular expression. |
1171 there exists a regular expression. |
1133 |
1172 |
1134 \begin{lmm}\label{every_eqcl_has_reg} |
1173 \begin{lemma}\label{every_eqcl_has_reg} |
1135 @{thm[mode=IfThen] every_eqcl_has_reg} |
1174 @{thm[mode=IfThen] every_eqcl_has_reg} |
1136 \end{lmm} |
1175 \end{lemma} |
1137 |
1176 |
1138 \begin{proof} |
1177 \begin{proof} |
1139 By the preceding lemma, we know that there exists a @{text "rhs"} such |
1178 By the preceding lemma, we know that there exists a @{text "rhs"} such |
1140 that @{term "Solve X (Init (UNIV // \<approx>A))"} returns the equation @{text "X = rhs"}, |
1179 that @{term "Solve X (Init (UNIV // \<approx>A))"} returns the equation @{text "X = rhs"}, |
1141 and that the invariant holds for this equation. That means we |
1180 and that the invariant holds for this equation. That means we |
1248 The relation @{term "\<approx>(lang r)"} partitions the set of all strings, @{term UNIV}, into some |
1287 The relation @{term "\<approx>(lang r)"} partitions the set of all strings, @{term UNIV}, into some |
1249 equivalence classes. To show that there are only finitely many of them, it |
1288 equivalence classes. To show that there are only finitely many of them, it |
1250 suffices to show in each induction step that another relation, say @{text |
1289 suffices to show in each induction step that another relation, say @{text |
1251 R}, has finitely many equivalence classes and refines @{term "\<approx>(lang r)"}. |
1290 R}, has finitely many equivalence classes and refines @{term "\<approx>(lang r)"}. |
1252 |
1291 |
1253 \begin{dfntn} |
1292 \begin{definition} |
1254 A relation @{text "R\<^isub>1"} \emph{refines} @{text "R\<^isub>2"} |
1293 A relation @{text "R\<^isub>1"} \emph{refines} @{text "R\<^isub>2"} |
1255 provided @{text "R\<^isub>1 \<subseteq> R\<^isub>2"}. |
1294 provided @{text "R\<^isub>1 \<subseteq> R\<^isub>2"}. |
1256 \end{dfntn} |
1295 \end{definition} |
1257 |
1296 |
1258 \noindent |
1297 \noindent |
1259 For constructing @{text R}, we will rely on some \emph{tagging-functions} |
1298 For constructing @{text R}, we will rely on some \emph{tagging-functions} |
1260 defined over strings. Given the inductive hypothesis, it will be easy to |
1299 defined over strings. Given the inductive hypothesis, it will be easy to |
1261 prove that the \emph{range} of these tagging-functions is finite. The range |
1300 prove that the \emph{range} of these tagging-functions is finite. The range |
1273 @{term "\<approx>(lang r)"}, which implies that @{term "UNIV // \<approx>(lang r)"} must |
1312 @{term "\<approx>(lang r)"}, which implies that @{term "UNIV // \<approx>(lang r)"} must |
1274 also be finite. We formally define the notion of a \emph{tagging-relation} |
1313 also be finite. We formally define the notion of a \emph{tagging-relation} |
1275 as follows. |
1314 as follows. |
1276 |
1315 |
1277 |
1316 |
1278 \begin{dfntn}[Tagging-Relation] Given a tagging-function @{text tag}, then two strings @{text x} |
1317 \begin{definition}[Tagging-Relation] Given a tagging-function @{text tag}, then two strings @{text x} |
1279 and @{text y} are \emph{tag-related} provided |
1318 and @{text y} are \emph{tag-related} provided |
1280 \begin{center} |
1319 \begin{center} |
1281 @{text "x \<^raw:$\threesim$>\<^bsub>tag\<^esub> y \<equiv> tag x = tag y"}\;. |
1320 @{text "x \<^raw:$\threesim$>\<^bsub>tag\<^esub> y \<equiv> tag x = tag y"}\;. |
1282 \end{center} |
1321 \end{center} |
1283 \end{dfntn} |
1322 \end{definition} |
1284 |
1323 |
1285 |
1324 |
1286 In order to establish finiteness of a set @{text A}, we shall use the following powerful |
1325 In order to establish finiteness of a set @{text A}, we shall use the following powerful |
1287 principle from Isabelle/HOL's library. |
1326 principle from Isabelle/HOL's library. |
1288 % |
1327 % |
1293 \noindent |
1332 \noindent |
1294 It states that if an image of a set under an injective function @{text f} (injective over this set) |
1333 It states that if an image of a set under an injective function @{text f} (injective over this set) |
1295 is finite, then the set @{text A} itself must be finite. We can use it to establish the following |
1334 is finite, then the set @{text A} itself must be finite. We can use it to establish the following |
1296 two lemmas. |
1335 two lemmas. |
1297 |
1336 |
1298 \begin{lmm}\label{finone} |
1337 \begin{lemma}\label{finone} |
1299 @{thm[mode=IfThen] finite_eq_tag_rel} |
1338 @{thm[mode=IfThen] finite_eq_tag_rel} |
1300 \end{lmm} |
1339 \end{lemma} |
1301 |
1340 |
1302 \begin{proof} |
1341 \begin{proof} |
1303 We set in \eqref{finiteimageD}, @{text f} to be @{text "X \<mapsto> tag ` X"}. We have |
1342 We set in \eqref{finiteimageD}, @{text f} to be @{text "X \<mapsto> tag ` X"}. We have |
1304 @{text "range f"} to be a subset of @{term "Pow (range tag)"}, which we know must be |
1343 @{text "range f"} to be a subset of @{term "Pow (range tag)"}, which we know must be |
1305 finite by assumption. Now @{term "f ` (UNIV // =tag=)"} is a subset of @{text "range f"}, |
1344 finite by assumption. Now @{term "f ` (UNIV // =tag=)"} is a subset of @{text "range f"}, |
1310 turn means that the equivalence classes @{text X} |
1349 turn means that the equivalence classes @{text X} |
1311 and @{text Y} must be equal. Therefore \eqref{finiteimageD} allows us to conclude |
1350 and @{text Y} must be equal. Therefore \eqref{finiteimageD} allows us to conclude |
1312 with @{thm (concl) finite_eq_tag_rel}. |
1351 with @{thm (concl) finite_eq_tag_rel}. |
1313 \end{proof} |
1352 \end{proof} |
1314 |
1353 |
1315 \begin{lmm}\label{fintwo} |
1354 \begin{lemma}\label{fintwo} |
1316 Given two equivalence relations @{text "R\<^isub>1"} and @{text "R\<^isub>2"}, whereby |
1355 Given two equivalence relations @{text "R\<^isub>1"} and @{text "R\<^isub>2"}, whereby |
1317 @{text "R\<^isub>1"} refines @{text "R\<^isub>2"}. |
1356 @{text "R\<^isub>1"} refines @{text "R\<^isub>2"}. |
1318 If @{thm (prem 1) refined_partition_finite[where ?R1.0="R\<^isub>1" and ?R2.0="R\<^isub>2"]} |
1357 If @{thm (prem 1) refined_partition_finite[where ?R1.0="R\<^isub>1" and ?R2.0="R\<^isub>2"]} |
1319 then @{thm (concl) refined_partition_finite[where ?R1.0="R\<^isub>1" and ?R2.0="R\<^isub>2"]}. |
1358 then @{thm (concl) refined_partition_finite[where ?R1.0="R\<^isub>1" and ?R2.0="R\<^isub>2"]}. |
1320 \end{lmm} |
1359 \end{lemma} |
1321 |
1360 |
1322 \begin{proof} |
1361 \begin{proof} |
1323 We prove this lemma again using \eqref{finiteimageD}. This time we set @{text f} to |
1362 We prove this lemma again using \eqref{finiteimageD}. This time we set @{text f} to |
1324 be @{text "X \<mapsto>"}~@{term "{R\<^isub>1 `` {x} | x. x \<in> X}"}. It is easy to see that |
1363 be @{text "X \<mapsto>"}~@{term "{R\<^isub>1 `` {x} | x. x \<in> X}"}. It is easy to see that |
1325 @{term "finite (f ` (UNIV // R\<^isub>2))"} because it is a subset of @{term "Pow (UNIV // R\<^isub>1)"}, |
1364 @{term "finite (f ` (UNIV // R\<^isub>2))"} because it is a subset of @{term "Pow (UNIV // R\<^isub>1)"}, |
1366 |
1405 |
1367 \noindent |
1406 \noindent |
1368 The @{const TIMES}-case is slightly more complicated. We first prove the |
1407 The @{const TIMES}-case is slightly more complicated. We first prove the |
1369 following lemma, which will aid the proof about refinement. |
1408 following lemma, which will aid the proof about refinement. |
1370 |
1409 |
1371 \begin{lmm}\label{refinement} |
1410 \begin{lemma}\label{refinement} |
1372 The relation @{text "\<^raw:$\threesim$>\<^bsub>tag\<^esub>"} refines @{term "\<approx>A"}, provided for |
1411 The relation @{text "\<^raw:$\threesim$>\<^bsub>tag\<^esub>"} refines @{term "\<approx>A"}, provided for |
1373 all strings @{text x}, @{text y} and @{text z} we have that \mbox{@{text "x \<^raw:$\threesim$>\<^bsub>tag\<^esub> y"}} |
1412 all strings @{text x}, @{text y} and @{text z} we have that \mbox{@{text "x \<^raw:$\threesim$>\<^bsub>tag\<^esub> y"}} |
1374 and @{term "x @ z \<in> A"} imply @{text "y @ z \<in> A"}. |
1413 and @{term "x @ z \<in> A"} imply @{text "y @ z \<in> A"}. |
1375 \end{lmm} |
1414 \end{lemma} |
1376 |
1415 |
1377 |
1416 |
1378 \noindent |
1417 \noindent |
1379 We therefore can analyse how the strings @{text "x @ z"} are in the language |
1418 We therefore can analyse how the strings @{text "x @ z"} are in the language |
1380 @{text A} and then construct an appropriate tagging-function to infer that |
1419 @{text A} and then construct an appropriate tagging-function to infer that |
1450 node[midway, below=0.5em]{@{term "x\<^isub>p \<in> A"}}; |
1491 node[midway, below=0.5em]{@{term "x\<^isub>p \<in> A"}}; |
1451 \end{tikzpicture}} |
1492 \end{tikzpicture}} |
1452 \end{tabular} |
1493 \end{tabular} |
1453 \end{center} |
1494 \end{center} |
1454 % |
1495 % |
|
1496 |
1455 \noindent |
1497 \noindent |
1456 Either @{text x} and a prefix of @{text "z"} is in @{text A} and the rest in @{text B} |
1498 Either @{text x} and a prefix of @{text "z"} is in @{text A} and the rest in @{text B} |
1457 (first picture) or there is a prefix of @{text x} in @{text A} and the rest is in @{text B} |
1499 (first picture) or there is a prefix of @{text x} in @{text A} and the rest is in @{text B} |
1458 (second picture). In both cases we have to show that @{term "y @ z \<in> A \<cdot> B"}. The first case |
1500 (second picture). In both cases we have to show that @{term "y @ z \<in> A \<cdot> B"}. The first case |
1459 we will only go through if we know that @{term "x \<approx>A y"} holds @{text "(*)"}. Because then |
1501 we will only go through if we know that @{term "x \<approx>A y"} holds @{text "("}@{text "*"}@{text ")"}. |
|
1502 Because then |
1460 we can infer from @{term "x @ z\<^isub>p \<in> A"} that @{term "y @ z\<^isub>p \<in> A"} holds for all @{text "z\<^isub>p"}. |
1503 we can infer from @{term "x @ z\<^isub>p \<in> A"} that @{term "y @ z\<^isub>p \<in> A"} holds for all @{text "z\<^isub>p"}. |
1461 In the second case we only know that @{text "x\<^isub>p"} and @{text "x\<^isub>s"} is one possible partition |
1504 In the second case we only know that @{text "x\<^isub>p"} and @{text "x\<^isub>s"} is one possible partition |
1462 of the string @{text x}. We have to know that both @{text "x\<^isub>p"} and the |
1505 of the string @{text x}. We have to know that both @{text "x\<^isub>p"} and the |
1463 corresponding partition @{text "y\<^isub>p"} are in @{text "A"}, and that @{text "x\<^isub>s"} is `@{text B}-related' |
1506 corresponding partition @{text "y\<^isub>p"} are in @{text "A"}, and that @{text "x\<^isub>s"} is `@{text B}-related' |
1464 to @{text "y\<^isub>s"} @{text "(**)"}. From the latter fact we can infer that @{text "y\<^isub>s @ z \<in> B"}. |
1507 to @{text "y\<^isub>s"} @{text "("}@{text "**"}@{text ")"}. From the latter fact we can infer that @{text "y\<^isub>s @ z \<in> B"}. |
1465 This will solve the second case. |
1508 This will solve the second case. |
1466 Taking the two requirements, @{text "(*)"} and @{text "(**)"}, together we define the |
1509 Taking the two requirements, @{text "("}@{text "*"}@{text ")"} and @{text "(**)"}, together we define the |
1467 tagging-function in the @{const Times}-case as: |
1510 tagging-function in the @{const Times}-case as: |
1468 |
1511 |
1469 \begin{center} |
1512 \begin{center} |
1470 @{thm (lhs) tag_Times_def[where ?A="A" and ?B="B"]}~@{text "\<equiv>"}~ |
1513 @{thm (lhs) tag_Times_def[where ?A="A" and ?B="B"]}~@{text "\<equiv>"}~ |
1471 @{text "(\<lbrakk>x\<rbrakk>\<^bsub>\<approx>A\<^esub>, {\<lbrakk>x\<^isub>s\<rbrakk>\<^bsub>\<approx>B\<^esub> | x\<^isub>p \<in> A \<and> (x\<^isub>p, x\<^isub>s) \<in> Partitions x})"} |
1514 @{text "(\<lbrakk>x\<rbrakk>\<^bsub>\<approx>A\<^esub>, {\<lbrakk>x\<^isub>s\<rbrakk>\<^bsub>\<approx>B\<^esub> | x\<^isub>p \<in> A \<and> (x\<^isub>p, x\<^isub>s) \<in> Partitions x})"} |
1473 |
1516 |
1474 \noindent |
1517 \noindent |
1475 Note that we have to make the assumption for all suffixes @{text "x\<^isub>s"}, since we do |
1518 Note that we have to make the assumption for all suffixes @{text "x\<^isub>s"}, since we do |
1476 not know anything about how the string @{term x} is partitioned. |
1519 not know anything about how the string @{term x} is partitioned. |
1477 With this definition in place, let us prove the @{const "Times"}-case. |
1520 With this definition in place, let us prove the @{const "Times"}-case. |
1478 |
|
1479 |
1521 |
1480 \begin{proof}[@{const TIMES}-Case] |
1522 \begin{proof}[@{const TIMES}-Case] |
1481 If @{term "finite (UNIV // \<approx>A)"} and @{term "finite (UNIV // \<approx>B)"} |
1523 If @{term "finite (UNIV // \<approx>A)"} and @{term "finite (UNIV // \<approx>B)"} |
1482 then @{term "finite ((UNIV // \<approx>A) \<times> (Pow (UNIV // \<approx>B)))"} holds. The range of |
1524 then @{term "finite ((UNIV // \<approx>A) \<times> (Pow (UNIV // \<approx>B)))"} holds. The range of |
1483 @{term "tag_Times A B"} is a subset of this product set, and therefore finite. |
1525 @{term "tag_Times A B"} is a subset of this product set, and therefore finite. |
1882 modify Antimirov's proof by performing an induction on @{text s} where we |
1924 modify Antimirov's proof by performing an induction on @{text s} where we |
1883 generalise over all @{text r}. That means in the @{text "cons"}-case the |
1925 generalise over all @{text r}. That means in the @{text "cons"}-case the |
1884 induction hypothesis is |
1926 induction hypothesis is |
1885 |
1927 |
1886 \begin{center} |
1928 \begin{center} |
1887 @{text "(IH)"}\hspace{3mm}@{term "\<forall>r. Derivs s (lang r) = \<Union> lang ` (pderivs s r)"} |
1929 @{text "(IH)"}\hspace{3mm}@{term "\<forall>r. Derivs s (lang r) = \<Union> (lang ` (pderivs s r))"} |
1888 \end{center} |
1930 \end{center} |
1889 |
1931 |
1890 \noindent |
1932 \noindent |
1891 With this we can establish |
1933 With this we can establish |
1892 |
1934 |
1893 \begin{center} |
1935 \begin{center} |
1894 \begin{tabular}{r@ {\hspace{1.5mm}}c@ {\hspace{1.5mm}}ll} |
1936 \begin{tabular}{r@ {\hspace{1.5mm}}c@ {\hspace{1.5mm}}ll} |
1895 @{term "Derivs (c # s) (lang r)"} |
1937 @{term "Derivs (c # s) (lang r)"} |
1896 & @{text "="} & @{term "Derivs s (Deriv c (lang r))"} & by def.\\ |
1938 & @{text "="} & @{term "Derivs s (Deriv c (lang r))"} & by def.\\ |
1897 & @{text "="} & @{term "Derivs s (\<Union> lang ` (pderiv c r))"} & by @{text "("}\ref{Derspders}@{text ".i)"}\\ |
1939 & @{text "="} & @{term "Derivs s (\<Union> (lang ` (pderiv c r)))"} & by @{text "("}\ref{Derspders}@{text ".i)"}\\ |
1898 & @{text "="} & @{term "\<Union> (Derivs s) ` (lang ` (pderiv c r))"} & by def.~of @{text "Ders"}\\ |
1940 & @{text "="} & @{term "\<Union> ((Derivs s) ` (lang ` (pderiv c r)))"} & by def.~of @{text "Ders"}\\ |
1899 & @{text "="} & @{term "\<Union> lang ` (\<Union> pderivs s ` (pderiv c r))"} & by IH\\ |
1941 & @{text "="} & @{term "\<Union> (lang ` (\<Union> (pderivs s ` (pderiv c r))))"} & by IH\\ |
1900 & @{text "="} & @{term "\<Union> lang ` (pderivs (c # s) r)"} & by def.\\ |
1942 & @{text "="} & @{term "\<Union> (lang ` (pderivs (c # s) r))"} & by def.\\ |
1901 \end{tabular} |
1943 \end{tabular} |
1902 \end{center} |
1944 \end{center} |
1903 |
1945 |
1904 \noindent |
1946 \noindent |
1905 Note that in order to apply the induction hypothesis in the fourth equation, we |
1947 Note that in order to apply the induction hypothesis in the fourth equation, we |
2099 that @{term "pderivs_lang B r"} is finite for every language @{text B} and |
2141 that @{term "pderivs_lang B r"} is finite for every language @{text B} and |
2100 regular expression @{text r} (recall Theorem~\ref{antimirov}). By definition |
2142 regular expression @{text r} (recall Theorem~\ref{antimirov}). By definition |
2101 and \eqref{Derspders} we have |
2143 and \eqref{Derspders} we have |
2102 % |
2144 % |
2103 \begin{equation}\label{eqq} |
2145 \begin{equation}\label{eqq} |
2104 @{term "Deriv_lang B (lang r) = (\<Union> lang ` (pderivs_lang B r))"} |
2146 @{term "Deriv_lang B (lang r) = (\<Union> (lang ` (pderivs_lang B r)))"} |
2105 \end{equation} |
2147 \end{equation} |
2106 |
2148 |
2107 \noindent |
2149 \noindent |
2108 Since there are only finitely many regular expressions in @{term "pderivs_lang |
2150 Since there are only finitely many regular expressions in @{term "pderivs_lang |
2109 B r"}, we know by \eqref{uplus} that there exists a regular expression so that |
2151 B r"}, we know by \eqref{uplus} that there exists a regular expression so that |
2180 |
2222 |
2181 \noindent |
2223 \noindent |
2182 whereby the last equation follows from the fact that @{term "A\<star>"} contains the |
2224 whereby the last equation follows from the fact that @{term "A\<star>"} contains the |
2183 empty string. With these properties at our disposal we can establish the lemma |
2225 empty string. With these properties at our disposal we can establish the lemma |
2184 |
2226 |
2185 \begin{lmm} |
2227 \begin{lemma} |
2186 If @{text A} is regular, then also @{term "SUPSEQ A"}. |
2228 If @{text A} is regular, then also @{term "SUPSEQ A"}. |
2187 \end{lmm} |
2229 \end{lemma} |
2188 |
2230 |
2189 \begin{proof} |
2231 \begin{proof} |
2190 Since our alphabet is finite, we have a regular expression, written @{text ALL}, that |
2232 Since our alphabet is finite, we have a regular expression, written @{text ALL}, that |
2191 matches every string. Using this regular expression we can inductively define |
2233 matches every string. Using this regular expression we can inductively define |
2192 the operation @{text "r\<up>"} |
2234 the operation @{text "r\<up>"} |
2273 |
2315 |
2274 Finally we like to show that the Myhill-Nerode Theorem is also convenient for establishing |
2316 Finally we like to show that the Myhill-Nerode Theorem is also convenient for establishing |
2275 the non-regularity of languages. For this we use the following version of the Continuation |
2317 the non-regularity of languages. For this we use the following version of the Continuation |
2276 Lemma (see for example~\cite{Rosenberg06}). |
2318 Lemma (see for example~\cite{Rosenberg06}). |
2277 |
2319 |
2278 \begin{lmm}[Continuation Lemma] |
2320 \begin{lemma}[Continuation Lemma] |
2279 If a language @{text A} is regular and a set of strings @{text B} is infinite, |
2321 If a language @{text A} is regular and a set of strings @{text B} is infinite, |
2280 then there exist two distinct strings @{text x} and @{text y} in @{text B} |
2322 then there exist two distinct strings @{text x} and @{text y} in @{text B} |
2281 such that @{term "x \<approx>A y"}. |
2323 such that @{term "x \<approx>A y"}. |
2282 \end{lmm} |
2324 \end{lemma} |
2283 |
2325 |
2284 \noindent |
2326 \noindent |
2285 This lemma can be easily deduced from the Myhill-Nerode Theorem and the Pigeonhole |
2327 This lemma can be easily deduced from the Myhill-Nerode Theorem and the Pigeonhole |
2286 Principle: Since @{text A} is regular, there can be only finitely many |
2328 Principle: Since @{text A} is regular, there can be only finitely many |
2287 equivalence classes. Hence an infinite set must contain |
2329 equivalence classes. Hence an infinite set must contain |
2291 Using this lemma, it is straightforward to establish that the language |
2333 Using this lemma, it is straightforward to establish that the language |
2292 \mbox{@{text "A \<equiv> \<Union>\<^isub>n a\<^sup>n @ b\<^sup>n"}} is not regular (@{text "a\<^sup>n"} stands |
2334 \mbox{@{text "A \<equiv> \<Union>\<^isub>n a\<^sup>n @ b\<^sup>n"}} is not regular (@{text "a\<^sup>n"} stands |
2293 for the strings consisting of @{text n} times the character a; similarly for |
2335 for the strings consisting of @{text n} times the character a; similarly for |
2294 @{text "b\<^isup>n"}). For this consider the infinite set @{text "B \<equiv> \<Union>\<^isub>n a\<^sup>n"}. |
2336 @{text "b\<^isup>n"}). For this consider the infinite set @{text "B \<equiv> \<Union>\<^isub>n a\<^sup>n"}. |
2295 |
2337 |
2296 \begin{lmm} |
2338 \begin{lemma} |
2297 No two distinct strings in set @{text "B"} are Myhill-Nerode related by language @{text A}. |
2339 No two distinct strings in set @{text "B"} are Myhill-Nerode related by language @{text A}. |
2298 \end{lmm} |
2340 \end{lemma} |
2299 |
2341 |
2300 \begin{proof} |
2342 \begin{proof} |
2301 After unfolding the definition of @{text "B"}, we need to establish that given @{term "i \<noteq> j"}, |
2343 After unfolding the definition of @{text "B"}, we need to establish that given @{term "i \<noteq> j"}, |
2302 the strings @{text "a\<^sup>i"} and @{text "a\<^sup>j"} are not Myhill-Nerode related by @{text "A"}. |
2344 the strings @{text "a\<^sup>i"} and @{text "a\<^sup>j"} are not Myhill-Nerode related by @{text "A"}. |
2303 That means we have to show that @{text "\<forall>z. a\<^sup>i @ z \<in> A = a\<^sup>j @ z \<in> A"} leads to |
2345 That means we have to show that @{text "\<forall>z. a\<^sup>i @ z \<in> A = a\<^sup>j @ z \<in> A"} leads to |
2450 Theorem~\ref{antimirov}). However, since partial derivatives use sets of |
2492 Theorem~\ref{antimirov}). However, since partial derivatives use sets of |
2451 regular expressions, one needs to carefully analyse whether the resulting |
2493 regular expressions, one needs to carefully analyse whether the resulting |
2452 algorithm is still executable. Given the infrastructure for |
2494 algorithm is still executable. Given the infrastructure for |
2453 executable sets introduced by \citeN{Haftmann09} in Isabelle/HOL, it should. |
2495 executable sets introduced by \citeN{Haftmann09} in Isabelle/HOL, it should. |
2454 |
2496 |
2455 We started out by claiming that in a theorem prover it is eaiser to |
2497 We started out by claiming that in a theorem prover it is easier to |
2456 reason about regular expressions than about automta. Here are some |
2498 reason about regular expressions than about automata. Here are some |
2457 numbers: Our formalisation of the Myhill-Nerode Theorem consists of |
2499 numbers: Our formalisation of the Myhill-Nerode Theorem consists of |
2458 780 lines of Isabelle/Isar code for the first direction and 460 for |
2500 780 lines of Isabelle/Isar code for the first direction and 460 for |
2459 the second (the one based on tagging-functions), plus around 300 |
2501 the second (the one based on tagging-functions), plus around 300 |
2460 lines of standard material about regular languages. The |
2502 lines of standard material about regular languages. The |
2461 formalisation of derivatives and partial derivatives shown in |
2503 formalisation of derivatives and partial derivatives shown in |
2534 \noindent |
2576 \noindent |
2535 {\bf Acknowledgements:} |
2577 {\bf Acknowledgements:} |
2536 We are grateful for the comments we received from Larry Paulson. Tobias |
2578 We are grateful for the comments we received from Larry Paulson. Tobias |
2537 Nipkow made us aware of the properties in Theorem~\ref{subseqreg} and Tjark |
2579 Nipkow made us aware of the properties in Theorem~\ref{subseqreg} and Tjark |
2538 Weber helped us with proving them. Christian Sternagel provided us with a |
2580 Weber helped us with proving them. Christian Sternagel provided us with a |
2539 version of Higman's Lemma that applies to arbtrary, but finite alphabets. |
2581 version of Higman's Lemma that applies to arbitrary, but finite alphabets. |
2540 |
2582 |
2541 \bibliographystyle{acmtrans} |
2583 \bibliographystyle{acmtrans} |
2542 \bibliography{root} |
2584 \bibliography{root} |
2543 |
2585 |
2544 \newpage |
2586 %\newpage |
2545 \begin{appendix} |
2587 %\begin{appendix} |
2546 \section{Appendix$^\star$} |
2588 %\section{Appendix$^\star$} |
2547 |
2589 |
2548 \renewcommand{\thefootnote}{\mbox{$\star$}} |
2590 %\renewcommand{\thefootnote}{\mbox{$\star$}} |
2549 \footnotetext{If the reviewers deem more suitable, the authors are |
2591 %\footnotetext{If the reviewers deem more suitable, the authors are |
2550 prepared to drop material or move it to an electronic appendix.} |
2592 %prepared to drop material or move it to an electronic appendix.} |
2551 |
2593 |
2552 \begin{proof}[of Lemma~\ref{arden}] |
2594 %\begin{proof}[of Lemma~\ref{arden}] |
2553 For the right-to-left direction we assume @{thm (rhs) reversed_Arden} and show |
2595 %For the right-to-left direction we assume @{thm (rhs) reversed_Arden} and show |
2554 that @{thm (lhs) reversed_Arden} holds. From Property~\ref{langprops}@{text "(i)"} |
2596 %that @{thm (lhs) reversed_Arden} holds. From Property~\ref{langprops}@{text "(i)"} |
2555 we have @{term "A\<star> = A \<cdot> A\<star> \<union> {[]}"}, |
2597 %we have @{term "A\<star> = A \<cdot> A\<star> \<union> {[]}"}, |
2556 which is equal to @{term "A\<star> = A\<star> \<cdot> A \<union> {[]}"}. Adding @{text B} to both |
2598 %which is equal to @{term "A\<star> = A\<star> \<cdot> A \<union> {[]}"}. Adding @{text B} to both |
2557 sides gives @{term "B \<cdot> A\<star> = B \<cdot> (A\<star> \<cdot> A \<union> {[]})"}, whose right-hand side |
2599 %sides gives @{term "B \<cdot> A\<star> = B \<cdot> (A\<star> \<cdot> A \<union> {[]})"}, whose right-hand side |
2558 is equal to @{term "(B \<cdot> A\<star>) \<cdot> A \<union> B"}. Applying the assumed equation |
2600 %is equal to @{term "(B \<cdot> A\<star>) \<cdot> A \<union> B"}. Applying the assumed equation |
2559 completes this direction. |
2601 %completes this direction. |
2560 |
2602 |
2561 For the other direction we assume @{thm (lhs) reversed_Arden}. By a simple induction |
2603 %For the other direction we assume @{thm (lhs) reversed_Arden}. By a simple induction |
2562 on @{text n}, we can establish the property |
2604 %on @{text n}, we can establish the property |
2563 |
2605 |
2564 \begin{center} |
2606 %\begin{center} |
2565 @{text "(*)"}\hspace{5mm} @{thm (concl) reversed_arden_helper} |
2607 %@{text "("}@{text "*"}@{text ")"}\hspace{5mm} @{thm (concl) reversed_arden_helper} |
2566 \end{center} |
2608 %\end{center} |
2567 |
2609 |
2568 \noindent |
2610 %\noindent |
2569 Using this property we can show that @{term "B \<cdot> (A \<up> n) \<subseteq> X"} holds for |
2611 %Using this property we can show that @{term "B \<cdot> (A \<up> n) \<subseteq> X"} holds for |
2570 all @{text n}. From this we can infer @{term "B \<cdot> A\<star> \<subseteq> X"} using the definition |
2612 %all @{text n}. From this we can infer @{term "B \<cdot> A\<star> \<subseteq> X"} using the definition |
2571 of @{text "\<star>"}. |
2613 %of @{text "\<star>"}. |
2572 For the inclusion in the other direction we assume a string @{text s} |
2614 %For the inclusion in the other direction we assume a string @{text s} |
2573 with length @{text k} is an element in @{text X}. Since @{thm (prem 1) reversed_Arden} |
2615 %with length @{text k} is an element in @{text X}. Since @{thm (prem 1) reversed_Arden} |
2574 we know by Property~\ref{langprops}@{text "(ii)"} that |
2616 %we know by Property~\ref{langprops}@{text "(ii)"} that |
2575 @{term "s \<notin> X \<cdot> (A \<up> Suc k)"} since its length is only @{text k} |
2617 %@{term "s \<notin> X \<cdot> (A \<up> Suc k)"} since its length is only @{text k} |
2576 (the strings in @{term "X \<cdot> (A \<up> Suc k)"} are all longer). |
2618 %(the strings in @{term "X \<cdot> (A \<up> Suc k)"} are all longer). |
2577 From @{text "(*)"} it follows then that |
2619 %From @{text "("}@{text "*"}@{text ")"} it follows then that |
2578 @{term s} must be an element in @{term "(\<Union>m\<le>k. B \<cdot> (A \<up> m))"}. This in turn |
2620 %@{term s} must be an element in @{term "(\<Union>m\<le>k. B \<cdot> (A \<up> m))"}. This in turn |
2579 implies that @{term s} is in @{term "(\<Union>n. B \<cdot> (A \<up> n))"}. Using Property~\ref{langprops}@{text "(iii)"} |
2621 %implies that @{term s} is in @{term "(\<Union>n. B \<cdot> (A \<up> n))"}. Using Property~\ref{langprops}@{text "(iii)"} |
2580 this is equal to @{term "B \<cdot> A\<star>"}, as we needed to show. |
2622 %this is equal to @{term "B \<cdot> A\<star>"}, as we needed to show. |
2581 \end{proof} |
2623 %\end{proof} |
2582 \end{appendix} |
2624 % \end{appendix} |
2583 *} |
2625 *} |
2584 |
2626 |
2585 |
2627 |
2586 (*<*) |
2628 (*<*) |
2587 end |
2629 end |