# HG changeset patch # User Christian Urban # Date 1373474850 -3600 # Node ID 99161cd17c0fcb51c3f945720fd3f08dc9153f0f # Parent c028b07a4fa81df74cdd9389037c7c3b880c2f81 added modified version adn answer to the reviewers diff -r c028b07a4fa8 -r 99161cd17c0f Journal/JAR-response --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/Journal/JAR-response Wed Jul 10 17:47:30 2013 +0100 @@ -0,0 +1,474 @@ +We thank the reviewers for their time and insightful comments. +We have incorporated most comments and if not, then answered them +below. We also spell-checked the paper. Thanks again for all the +comments. + +Reviewer #1 +=========== + +> Indeed, one of the premise of the paper is that representing automata +> in a theorem prover requires to fight against the theorem prover or +> requires painful amount of details. Therefore, the authors define the +> notion of regular languages as a language generated by a regular +> expression. (This is not an unusual choice from a formalization point +> of view, provided complemented by a proof of Kleene's theorem that +> states such regular languages are exactly the one that are recognized +> by finite automata.) + +We concur with the reviewer that our definition of regular languages +as the ones generated by regular expressions is usual, but the paper is +unusual in that we do not need to provide a version of Kleene's theorem. +All results are obtained without formally needing a definition for automata. +If we would need Kleene's theorem as part of our arguments, then we would +need such a definition. + +To make our intention with the paper more clear, we moved the quote +by Gasarch from the Conclusion to the Introduction. In this quote he asks +whether the complementation of regular-expression languages can be proved +without using automata. In this way we sharpen our motivation by being +theorem prover related, but also of general interest. See quotations and +changes right after Definition 2, p. 3. + +> Second, the premise of this article is that formalizing automata +> theory require some fight with the theorem prover, and that several +> other (longer) formalizations from the related work substantiate this +> claim. I am under the impression that this comparison is misleading +> since some of the related work deals not only with regular language +> \emph{theory}, but also regular language \emph{algorithms}: e.g., +> automata constructions, conversion from NFA to DFA, minimization, +> decision procedures for language equivalence. These algorithms may be +> proved correct, complete, or terminating; and may be more or less +> efficient. Then, these properties dictate some choices of +> representation of the objects of the formalization (and hence have +> consequences on its intricacies). + +Yes, we agree with the reviewer that *if* one is in the business of +proving the correctness of, for example, Hopcroft's algorithm or the +NFA-to-DFA conversion, which are based on "concrete" automata, then yes +one clearly has to use the usual notion of the automata somewhere, even +might have to start from it. + +However, we disagree with the interpretation that our comparison is +misleading. Our claim that usual `paper' definitions do not translate +smoothly into formalised definitions is not a claim just by us and +independent of any numbers: We referenced in the paper to a concrete +problem pointed out already in Slind's email cited on the bottom of +p. 2. In the context of HOL4, he made the observation + + | A student who is attempting to prove the pumping lemma for regular + | languages came across the following problem. Suppose a DFA is + | represented by a record <| init ; trans ; accept |> where + | + | init : 'state + | trans : 'state -> 'symbol -> 'state + | accept : 'state -> bool + | + | Execution of a DFA on a string: + | + | (Execute machine [] s = s) + | (Execute machine (h::t) s = Execute machine t (machine.trans s h)) + | + | Language recognized by a machine D: + | + | Lang(D) = {x | D.accept (Execute D x D.init)} + | + | These are all accepted by HOL with no problem. In then trying to define + | the set of regular languages, namely those accepted by some DFA, the + | following is natural to try + | + | Regular (L:'symbol list -> bool) = ?D. L = Lang(D) + | + | But this fails, since D has two type variables ('state and 'symbol) but + | Regular is a predicate on 'symbol lists. So the principle of definition + | rejects. + +We also pointed to comments by Tobias Nipkow which clearly show that +he had to fight the theorem prover. Many formalisations we cited for +comparison contain statements about having to fight their respective +theorem prover. + +The conclusion we want the reader to draw from our comparisons is +that if one is just interested in formalising results from regular language +theory, not results from automata theory, then regular expressions are +easier to work with formally. If you look at the numbers (especially the +ones for Isabelle/HOL), then we hope you agree with us that the startup +costs are much higher with automata in comparison with regular +expressions. We carefully commented on the examples we have cited in +the Conclusion. Nowhere did we claim that they are one-to-one comparisons with +our development. We did, for example, say that Braibant agreed with us that +*if* he had done just Myhill-Nerode using his library, he estimates our +development is more concise (see his comment in the Conclusion, p. 25). + +> In this article, the authors choose +> to do the converse: avoid the notion of automata altogether, and see +> how far it goes. I find this choice a bit peculiar. + +At first sight, our choice/approach might seem peculiar, since +we are not primarily verifying an algorithm. However, we are not +the only ones who pondered about this approach (we mentioned Gasarch). +We showed that this is indeed a choice that can lead you quite far +with regular language theory and lead to a relatively smooth +formalisation. We did not make the claim that regular expressions are +a substitute for automata in all situations. See our paragraph +in the Conclusion starting with "While regular expressions are +convenient, they have some limitations." + +> Third, I wonder to what extent some of the proofs presented in this +> article are a rephrasing of standard proofs (that go through +> automata), inlining the notion of automata, e.g., the construction +> depicted in Section 3. Also, the notions of derivatives and partial +> derivatives underly some automata constructions: for instance, from my +> point of view, Section 5 is actually building a DFA (with transition +> relation $\delta$ and initial state ${r}$)from a regular expression +> without saying so; and it suffices to take as a tag function the +> expression $\lambda x.\delta({r},x)$ (which is a right invariant +> equivalence relation between strings, and of finite index since there +> finitely many states in the automaton). The authors take care to note +> the proximity of their arguments with existing ones when due (that +> often rely on automata), but they seem very close to the existing one +> in any case, with enough inlining. (Yet, I found the proof based on +> tagging functions quite elegant: it avoids the explicit construction +> of an automata--because automata are difficult to formalize in +> HOL--while being close to the intuitive automata constructions, even +> if they are not mentionned.) + +We agree "that underlying our arguments are automata proofs", but +we argued without constructing automata explicitly. This is the very point +and contribution of the paper. If we construct automata explicitly, the +formalised proof of Myhill-Nerode theorem will become much larger as +can be seen in existing work. By abandoning automata and relying only on +regular expressions, we get a much simpler formalised proof of MN theorem. + +> To sum up, I think there is a significant value to come up with fresh +> arguments that makes it easier to formalize existing pen-and-paper +> proofs and textbooks. Yet, the arguments presented in this paper are +> not really new (they look like inlinings, even if the authors chose +> not to see it). And I am not convinced by the idea of side stepping an +> elegant tool, the notion of automata, because it is hard to +> formalize. I would rather improve on the underlying tools, or develop +> new ways to formalize this notion. + +We disagree in that we should not take the opportunity of using +regular expressions when they lead to simpler formalisations +in regular language theory. We have already found that many structural +inductions that only regular expressions afford are good learning +tools for students in order to come to grips with formal reasoning. + + +> - Lemma 21 already appears in Brzozowski (1964) who attributes it to +> Arden. I think there is little value in distinguishing the two +> versions of Arden's lemma, since their proofs and statement are parallel. + +We agree that Brzozowski (1964) gives the statement of Arden's lemma, but +no proof, for which he refers to Arden (1960). Unfortunately, we do not have +access to this book and guess that this applies to many contemporary readers. +Our reference to Sakarovitch is motivated by the fact that he gives +explicitly a proof (XXX CHECK). + +> - I am puzzled by the one-character transition relation between +> equivalence classes (p. 6). The $X_i$ correspond to the states of +> the minimal automaton that recognizes $L$ (and you even agree with +> that in the Related Work section!). Therefore, I disagree with your +> sentence stating that this construction does not build an +> automaton. + +We deleted this comment since it also confused the second reviewer. +What we wanted to express is that we have not defined a transition +relation between states in the usual automata sense. + +> Then, it would follow that the initial system would be better +> presented as a DFA (that is, associating to each equivalence class, +> a mapping from atoms to equivalence classes with the atoms being +> disjoint), even if you do not want to introduce the notion of +> automaton. I fail to see the advantages of the current presentation +> of the initial system. Then, I think that it would be easier to +> understand your presentation if you used a more involved language +> (like (ab + a)) as a running example. This would produce the +> equivalence classes +> +> X1 = {[]} +> X2 = {[a]} +> X3 = {[ab]} +> X4 = {univ - {[], [a], [ab]}} +> +> and the equational system +> +> X1 = LAMBDA(ONE) +> X2 = (X1,a) +> X3 = (X2,b) +> X4 = (X1,b) + (X2,a) + (X3,a) + (X3,b) +> + +We have modified the example as you suggest. But note that assuming +we only have characters a and b in an alphabet, the equation for +X4 must be + + X4 = (X1,b) + (X2,a) + (X3,a) + (X3,b) + (X4,a) + (X4,b) + + +> (It could be worth saying that it is a left-linear context free +> grammar in any case). Having built this example, I wonder why you +> chose to set up the equationnal system the way you did. In the RW +> section, you argue that the differences with Brzozowski's are subtle +> and impact the presentation of , but I fail to see what would go +> wrong if you chose this presentation too. + +The difference is from where we start: In Brzozowski's algebraic +method, the start is an automaton from which this method calculates +a regular expression. Therefore Brzozowski annotates the final +states with lambdas and `propagates' them to the single initial +state. From there he can read off a (single) regular expression recognising +the same language as the automaton. We, on the other hand, start with a +set of equivalence classes, for all of which we want to calculate a +regular expression. In this way we have to annotate the lambda to the +initial state (containing only []) and `propagate' it to all other +equivalence classes. Once there, we can read off the regular expressions +for each equivalence class. There seems to be no convenient way to set +up the equational system the original Brzozowski-way. However, it does +mean that we have to use the reversed version of Arden's lemma in order to +solve our equational systems. + +> - p. 11 "There exists a regular expression" this is not clear, unless +> one has a look at your formal proof: it should be made clear that +> the definition "regular A" is defined as exists r, A = lang r. + +We clarified this sentence according to your suggestion. + +> - Top of p. 12: it is stated that this algorithm provides a method to +> compute a regexp for the complement of a language and that this +> construction is similar to the construction of the complement +> automaton. I agree with this statement, because I am under the +> impression (maybe wrongly) that the construction in this section +> boils down to "automata with states indexed by the equivalences +> classes" to "regular expression". What I am uncertain of is, under +> what conditions I could use your formalization to compute this +> regular expression. (Note that I am not speaking about complexity +> issues here.) + +We are not 100% sure whether we are correctly interpreting your +question: If you are after an informal "black board" calculation method +of this regular expression in front of students, then you can use our +algorithm. We have done this already. If you are asking whether +our formalisation is "constructive" enough in order to directly +extract an algorithm, then no we cannot do this. We use a theorem +prover based on classical logic. We, for example, did not make +explicit how a single regular expression can be constructed from +a set of regular expressions, still recognising the same language. +However, we do not like to open the "can-of-worms" about constructive +and unconstructive arguments and how to extract explicit algorithms. +In theory we can, in practice we cannot. Unfortunately, we do not have +a better answer for this at the moment and think this is besides +the main point of the paper. + +> - It seems that the purpose of the tagging functions here is to build +> right invariant equivalence relations that refines the Myhill-Nerode +> relation (Lemma 43). Maybe you should start by that. Also, any such +> right invariant relation naturally gives rise to an automaton, and +> you build such later on. Maybe you could phrase it out at the very +> beginning of this section. + +We added a long remark at the end of the section that spells +out the automata inspiration behind the tagging functions (see p. 17). + +> - p. 13 In the definition of the tagging function for the plus case, +> isn't there a typo (replacing the \cdot by a comma) ? + +No. Our formalisation uses this definition. + +> - p. 14 There is an "x" missing on the left-hand side of the definition of +> x tag A B. + +Yes. Corrected. + +> - The structure of section 4 is a bit difficult to follow, because the +> proof of Theorem 31 is interleaved with the statement of many +> intermediate lemmas. Could you separate the definition of the +> tagging functions from the proof? + +We have added a figure at the beginning of the proof and included +appropriate references. + +> I think it would also be +> interesting to point out that the type of the codomain the tag +> functions are not the same through the section. Again, I am under +> the impression that the tagging functions defined in this section +> define states of an automaton that is not built explicitly, with the +> result of the tag function being a particular state in a suitably +> built automaton. + +We added a relatively lengthy remark after the proof about +how the tagging functions can be seen as constructing particular +automata - see p. 17. + + +> Yet, it could be worth pointing out that this use +> of polymorphism for the tag functions does not have the same +> problems as the one encountered when one try to formalize +> automata. This is in a sense contradiction with your statement +> p. 24 that you sidestep the composition of automata (this automaton +> is not explicitly built). + +We have modified the statement on p. 24. + +> - p. 18: you mention the fact that it is painful to formalize +> Brzozowski's argument stating that there are only finitely many +> dissimilar derivatives. Is this because it requires to build part of +> the equationnal theory of regular expressions that you avoided so +> far? Could you comment on what Coquand and Siles did in the paper +> you cite? + +Coquand and Siles first spend 3 pages of their 16-pages paper +developing a theory of inductively finite sets in type theory. +This theory uses "discrete sets" where the equality needs to +be decidable. They then view the ACI-properties as rewrite rules +leading to a decidable equality over regular expressions. This +shows (although they left out the details in their paper) that the +set of derivatives is inductively finite. + +In our opinion, Coquand and Siles' development strongly justifies our +comment about formalisation of Brzozowski's argument being painful. +If you like to add this as a reference in the paper, we are happy +to do so. + +> - Section 6, p20, previous to last paragraph: you say that the +> construction of the complement regular expression involves several +> transformations: you could shave off the construction of the +> non-deterministic automaton, since it can be avoided using +> Antimirov's construction of a DFA (as you do). + +We feel that our sequence adequately represents procedure +that is used by informed outsiders and that is usually +taught to students. Antimirov's construction using partial +derivatives is unfortunately not widely known, which we hope +we can help to change with this paper. + +> - (27) are properties, not definitions, isn't it? + +Corrected. + +> - A question. On the one hand, Isabelle/HOL features a classical +> logic, with the law of the excluded middle and the Hilbert's epsilon +> operator. On the other hand, in Coq, it is often involved to prove +> decidability results, which may make it more difficult to formalize +> some of the arguments you are using. Is it possible in Isabelle/HOL +> to track what are the uses of classical logic that you use? + +In theory there is a way to track all classical uses in Isabelle, +but this can be practically unfeasible, since proof terms can be +gigantic (therefore the Isabelle/HOL follows the LCF approach +where only theorems need to be recorded for correctness). Having +said that there is one obvious place in our proof where we used +a "classical" shortcut by using + for generating a single regular +expression from a finite set of expressions. This would need better +"explanation" in a constructive proof. There are no other inherent +classical shortcuts in the proof. But as said before opening we +think opening this can-of-worms is beside the focus of the paper. + + +> In particular, I am worried about the last sentence on page 24: how do +> you make these sets computational? + +We wrote: + + However, since partial derivatives use sets of regular expressions, one + needs to carefully analyse whether the resulting algorithm is still + executable. Given the infrastructure for executable sets introduced by + Haftmann (2009) in Isabelle/HOL, it should. + +Does this allay your fears? + +> - I think a word of caution should be added to the comparison with +> related works in terms of formalization size: there is a huge gap in +> the scope of these formalizations efforts, especially when one +> defines algorithms. For instance, an usual corollary of the +> Myhill-Nerode theorem is an algorithm that minimize automata: could +> you provide, for instance, an algorithm that compute the number of +> equivalences classes of your Myhill-Nerode relation? + +We have attempted to be very careful with our comments about the +formalisation sizes from other works. Our intention is only to draw +attention to the somewhat surprising point, namely that formalisation +using automata (either out of choice or being forced) led to +formalisations where the people in question had to fight their +theorem prover. On the other hand, our formalisation could rely +on the fact that regular expressions are inductive datatypes which +are generally well-supported by theorem prover reasoning. We hope +to have "balanced the picture" by saying that regular expressions +are not good for everything (see paragraph in the Conclusion +"While regular expressions are convenient, they have some +limitations..."). + +We have not thought about an algorithm to compute the number of +equivalence classes. + +> - There has been several work that uses matrices in Coq, in the +> context of the ssreflect dialect. For instance, you may have a look +> at Dénès M, Mörtberg A, Siles V. 2012. A refinement-based approach +> to computational algebra in COQ. ITP 2012. It does not seem that +> there is a particular problem with the use of matrices. + +We relied completely on the comment by (expert formaliser in Coq) Braibant +from his thesis which explicitly states + + | Note that while they [Wu, Zhang, Urban] justify their work by the fact + | that formalising automata ``can be a real hassle in HOL-based provers'', + | and that neither Isabelle/HOL nor HOL4 nor HOL-light support matrices and + | graphs with libraries, we did not encounter such difficulties in Coq + | (omitting the **intrinsic difficulty** of working with **rectangular matrices** + | at some points). + +The emphasis added by us. As far as we know Braibant also uses ssreflect. + + + +Reviewer #2 +=========== + +> This is a very nice paper, at least in terms of its content, and it +> should certainly be published in some shape or form. It's based on an +> ITP conference paper from 2011, but seems to me to include sufficient +> new content to justify republication in a journal. To be specific, +> the new material is: the second of the two proofs showing the +> implication that if a language is regular it has a finite number of +> Myhill-Nerode equivalence classes (using partial derivatives), and all +> of the closure properties. Both new contributions are very appealing. + +We agree with your observations. + +> Really, I have only a slew of minor criticisms about the paper, and +> these almost all have to do with minor presentation issues. I started +> to list some of these in the "Minor things" section below, but I +> eventually decided that I should simply say that 1. the English is +> sometimes less than perfect; 2. a spell-checker should be used; 3. the +> typography is occasionally dubious; and that 4. it all needs to be +> re-read with more care. + +We have improved the paper. Please let us know if we +overlooked any more errors. + +> Finally, the conclusion and related work sections are reasonable. I +> think the conclusion could usefully discuss the executability of the +> algorithm in the first half of the M-N proof. + +Informally, the algorithm is clearly executable. But we have not +attempted this with our formalisation. + +> Also, the conclusion discusses the Brzozowski's algebraic method in +> this context, and argues that the equational constraints that appear +> in this paper are not quite the same as a minimal automaton. It's +> clear that, yes, they are a little different, but it still seems to +> me that the initial state of the constraint set is isomorphic to the +> automaton. Given that, I was very surprised to read (on p6) that +> "Note that we do not define an automaton here". + +Yes, we agree, behind our equational system lurks a minimal automaton. What +we wanted to clear up is that we have not defined a transition relation +between states in the usual automata sense. Despite best intentions, we +have deleted this remark, as it clearly misled the reviewers. + +> Appendix: this seems unnecessary. If it's interesting enough to be in +> the main text, it should be there. Otherwise refer the reader to the +> online formalisation. In this case, the appendix is so small that it +> seems particularly silly to have one. + +We deleted the appendix. \ No newline at end of file diff -r c028b07a4fa8 -r 99161cd17c0f Journal/Paper.thy --- a/Journal/Paper.thy Wed Jul 10 13:28:24 2013 +0100 +++ b/Journal/Paper.thy Wed Jul 10 17:47:30 2013 +0100 @@ -563,7 +563,7 @@ Central to our proof will be the solution of equational systems involving equivalence classes of languages. For this we will use Arden's Lemma - (see for example \cite[Page 100]{Sakarovitch09}), + (see for example \citet[Page 100]{Sakarovitch09}), which solves equations of the form @{term "X = A \ X \ B"} provided @{term "[] \ A"}. However we will need the following `reversed' version of Arden's Lemma (`reversed' in the sense of changing the order of @{term "A \ X"} to @@ -659,16 +659,18 @@ It is easy to see that @{term "\A"} is an equivalence relation, which partitions the set of all strings, @{text "UNIV"}, into a set of disjoint equivalence classes. To illustrate this quotient construction, let us give a simple - example: consider the regular language containing just - the string @{text "[c]"}. The relation @{term "\({[c]})"} partitions @{text UNIV} - into three equivalence classes @{text "X\<^isub>1"}, @{text "X\<^isub>2"} and @{text "X\<^isub>3"} + example: consider the regular language built up over the alphabet @{term "{a, b}"} and + containing just the string two strings @{text "[a]"} and @{text "[a, b]"}. The + relation @{term "\({[a], [a, b]})"} partitions @{text UNIV} + into four equivalence classes @{text "X\<^isub>1"}, @{text "X\<^isub>2"}, @{text "X\<^isub>3"} and @{text "X\<^isub>4"} as follows \begin{center} \begin{tabular}{l} @{text "X\<^isub>1 = {[]}"}\\ - @{text "X\<^isub>2 = {[c]}"}\\ - @{text "X\<^isub>3 = UNIV - {[], [c]}"} + @{text "X\<^isub>2 = {[a]}"}\\ + @{text "X\<^isub>3 = {[a, b]}"}\\ + @{text "X\<^isub>4 = UNIV - {[], [a], [a, b]}"} \end{tabular} \end{center} @@ -689,8 +691,8 @@ \end{equation} \noindent - In our running example, @{text "X\<^isub>2"} is the only - equivalence class in @{term "finals {[c]}"}. + In our running example, @{text "X\<^isub>2"} and @{text "X\<^isub>3"} are the only + equivalence classes in @{term "finals {[a], [a, b]}"}. It is straightforward to show that in general % \begin{equation}\label{finalprops} @@ -723,10 +725,16 @@ %Note that we do not define formally an automaton here, %we merely relate two sets %(with the help of a character). - In our concrete example we have - @{term "X\<^isub>1 \c\ X\<^isub>2"}, @{term "X\<^isub>1 \d\<^isub>i\ X\<^isub>3"} with @{text "d\<^isub>i"} being any - other character than @{text c}, and @{term "X\<^isub>3 \c\<^isub>j\ X\<^isub>3"} for any - character @{text "c\<^isub>j"}. + In our concrete example we have + + \begin{center} + \begin{tabular}{l} + @{term "X\<^isub>1 \a\ X\<^isub>2"},\; @{term "X\<^isub>1 \b\ X\<^isub>4"};\\ + @{term "X\<^isub>2 \b\ X\<^isub>3"},\; @{term "X\<^isub>2 \a\ X\<^isub>4"};\\ + @{term "X\<^isub>3 \a\ X\<^isub>4"},\; @{term "X\<^isub>3 \b\ X\<^isub>4"} and\\ + @{term "X\<^isub>4 \a\ X\<^isub>4"},\; @{term "X\<^isub>4 \b\ X\<^isub>4"}. + \end{tabular} + \end{center} Next we construct an \emph{initial equational system} that contains an equation for each equivalence class. We first give @@ -773,17 +781,14 @@ \begin{equation}\label{exmpcs} \mbox{\begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l} @{term "X\<^isub>1"} & @{text "="} & @{text "\(ONE)"}\\ - @{term "X\<^isub>2"} & @{text "="} & @{text "(X\<^isub>1, ATOM c)"}\\ - @{term "X\<^isub>3"} & @{text "="} & @{text "(X\<^isub>1, ATOM d\<^isub>1) + \ + (X\<^isub>1, ATOM d\<^isub>n)"}\\ - & & \mbox{}\hspace{10mm}@{text "+ (X\<^isub>3, ATOM c\<^isub>1) + \ + (X\<^isub>3, ATOM c\<^isub>m)"} + @{term "X\<^isub>2"} & @{text "="} & @{text "(X\<^isub>1, ATOM a)"}\\ + @{term "X\<^isub>3"} & @{text "="} & @{text "(X\<^isub>2, ATOM b)"}\\ + @{term "X\<^isub>4"} & @{text "="} & @{text "(X\<^isub>1, ATOM b) + (X\<^isub>2, ATOM a) + (X\<^isub>3, ATOM a)"}\\ + & & \mbox{}\hspace{0mm}@{text "+ (X\<^isub>3, ATOM b) + (X\<^isub>4, ATOM a) + (X\<^isub>4, ATOM b)"}\\ \end{tabular}} \end{equation} - + \noindent - where @{text "d\<^isub>1\d\<^isub>n"} is the sequence of all characters - but not containing @{text c}, and @{text "c\<^isub>1\c\<^isub>m"} is the sequence of all - characters. - Overloading the function @{text \} for the two kinds of terms in the equational system, we have @@ -883,26 +888,23 @@ then we calculate the combined regular expressions for all @{text r} coming from the deleted @{text "(X, r)"}, and take the @{const Star} of it; finally we append this regular expression to @{text rhs'}. If we apply this - operation to the right-hand side of @{text "X\<^isub>3"} in \eqref{exmpcs}, we obtain + operation to the right-hand side of @{text "X\<^isub>4"} in \eqref{exmpcs}, we obtain the equation: \begin{center} \begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l} - @{term "X\<^isub>3"} & @{text "="} & - @{text "(X\<^isub>1, TIMES (ATOM d\<^isub>1) (STAR \<^raw:\ensuremath{\bigplus}>{ATOM c\<^isub>1,\, ATOM c\<^isub>m})) + \ "}\\ - & & \mbox{}\hspace{13mm} - @{text "\ + (X\<^isub>1, TIMES (ATOM d\<^isub>n) (STAR \<^raw:\ensuremath{\bigplus}>{ATOM c\<^isub>1,\, ATOM c\<^isub>m}))"} + @{term "X\<^isub>4"} & @{text "="} & + @{text "(X\<^isub>1, TIMES (ATOM b) (STAR \<^raw:\ensuremath{\bigplus}>{ATOM a, ATOM b})) +"}\\ + & & @{text "(X\<^isub>2, TIMES (ATOM a) (STAR \<^raw:\ensuremath{\bigplus}>{ATOM a, ATOM b})) +"}\\ + & & @{text "(X\<^isub>3, TIMES (ATOM a) (STAR \<^raw:\ensuremath{\bigplus}>{ATOM a, ATOM b})) +"}\\ + & & @{text "(X\<^isub>3, TIMES (ATOM b) (STAR \<^raw:\ensuremath{\bigplus}>{ATOM a, ATOM b}))"} \end{tabular} \end{center} \noindent - That means we eliminated the recursive occurrence of @{text "X\<^isub>3"} on the - right-hand side. Note we used the abbreviation - @{text "\<^raw:\ensuremath{\bigplus}>{ATOM c\<^isub>1,\, ATOM c\<^isub>m}"} - to stand for a regular expression that matches with every character. In - our algorithm we are only interested in the existence of such a regular expression - and do not specify it any further. + That means we eliminated the recursive occurrence of @{text "X\<^isub>4"} on the + right-hand side. It can be easily seen that the @{text "Arden"}-operation mimics Arden's Lemma on the level of equations. To ensure the non-emptiness condition of @@ -951,7 +953,7 @@ % \begin{equation}\label{exmpresult} \mbox{\begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l} - @{term "X\<^isub>2"} & @{text "="} & @{text "\(TIMES ONE (ATOM c))"} + @{term "X\<^isub>2"} & @{text "="} & @{text "\(TIMES ONE (ATOM a))"} \end{tabular}} \end{equation} @@ -1194,7 +1196,15 @@ \begin{proof}[of Theorem~\ref{myhillnerodeone}] By Lemma~\ref{every_eqcl_has_reg} we know that there exists a regular expression for - every equivalence class in @{term "UNIV // \A"}. Since @{text "finals A"} is + every equivalence class in @{term "UNIV // \A"}: + + \[ + \mbox{if}\;@{term "X \ (UNIV // \A)"}\;\mbox{then there exists a}\; + @{text "r"}\;\mbox{such that}\;@{term "X = lang r"} + \]\smallskip + + \noindent + Since @{text "finals A"} is a subset of @{term "UNIV // \A"}, we also know that for every equivalence class in @{term "finals A"} there exists a regular expression. Moreover by assumption we know that @{term "finals A"} must be finite, and therefore there must be a finite @@ -1205,12 +1215,27 @@ \end{proof} \noindent + Solving the equational system of our running example gives as solution for the + two final equivalence classes: + + \begin{center} + \begin{tabular}{r@ {\hspace{1mm}}c@ {\hspace{1mm}}l} + @{text "X\<^isub>2"} & @{text "="} & @{text "TIMES ONE (ATOM a)"}\\ + @{text "X\<^isub>3"} & @{text "="} & @{text "TIMES (TIMES ONE (ATOM a)) (ATOM b)"} + \end{tabular} + \end{center} + + \noindent + Combining them with $\bigplus$ gives us a regular expression for the language @{text "{[a], [a, b]}"}. + Note that our algorithm for solving equational systems provides also a method for calculating a regular expression for the complement of a regular language: if we combine all regular - expressions corresponding to equivalence classes not in @{term "finals A"}, + expressions corresponding to equivalence classes not in @{term "finals A"} + (in the running example @{text "X\<^isub>1"} and @{text "X\<^isub>4"}), then we obtain a regular expression for the complement language @{term "- A"}. This is similar to the usual construction of a `complement automaton'. + *} section {* Myhill-Nerode, Second Part *} @@ -1296,10 +1321,13 @@ \noindent For constructing @{text R}, we will rely on some \emph{tagging-functions} - defined over strings. Given the inductive hypothesis, it will be easy to + defined over strings, see Fig.~\ref{tagfig}. They are parameterised by sets + of strings @{text A} and @{text B} standing for the induction hypotheses. + Given the inductive hypotheses, it will be easy to prove that the \emph{range} of these tagging-functions is finite. The range of a function @{text f} is defined as + \begin{center} @{text "range f \ f ` UNIV"} \end{center} @@ -1334,6 +1362,24 @@ is finite, then the set @{text A} itself must be finite. We can use it to establish the following two lemmas. + \begin{figure}[t] + \normalsize + \begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l} + @{thm (lhs) tag_Plus_def[where A="A" and B="B", THEN meta_eq_app]} & + @{text "\"} & + @{thm (rhs) tag_Plus_def[where A="A" and B="B", THEN meta_eq_app]} \\ + @{thm (lhs) tag_Times_def[where ?A="A" and ?B="B"]}\;@{text "x"} & @{text "\"} & + @{text "(\x\\<^bsub>\A\<^esub>, {\x\<^isub>s\\<^bsub>\B\<^esub> | x\<^isub>p \ A \ (x\<^isub>p, x\<^isub>s) \ Partitions x})"}\\ + @{thm (lhs) tag_Star_def[where ?A="A", THEN meta_eq_app]} & @{text "\"} & + @{text "{\x\<^isub>s\\<^bsub>\A\<^esub> | x\<^isub>p < x \ x\<^isub>p \ A\<^isup>\ \ (x\<^isub>p, x\<^isub>s) \ Partitions x}"} + \end{tabular} + \caption{Three tagging functions used the cases for @{term "PLUS"}, @{term "TIMES"} and @{term "STAR"} + regular expressions. The sets @{text A} and @{text B} are arbitrary languages instantiated + in the proof with the induction hypotheses. \emph{Partitions} is a function + that generates all possible partitions of a string.\label{tagfig}} + \end{figure} + + \begin{lemma}\label{finone} @{thm[mode=IfThen] finite_eq_tag_rel} \end{lemma} @@ -1379,7 +1425,7 @@ Chaining Lemma~\ref{finone} and \ref{fintwo} together, means in order to show that @{term "UNIV // \(lang r)"} is finite, we have to construct a tagging-function whose range can be shown to be finite and whose tagging-relation refines @{term "\(lang r)"}. - Let us attempt the @{const PLUS}-case first. We take as tagging-function + Let us attempt the @{const PLUS}-case first. We take as tagging-function from Fig.~\ref{tagfig} \begin{center} @{thm tag_Plus_def[where A="A" and B="B", THEN meta_eq_app]} @@ -1510,7 +1556,7 @@ tagging-function in the @{const Times}-case as: \begin{center} - @{thm (lhs) tag_Times_def[where ?A="A" and ?B="B"]}~@{text "\"}~ + @{thm (lhs) tag_Times_def[where ?A="A" and ?B="B"]}\;@{text "x"}~@{text "\"}~ @{text "(\x\\<^bsub>\A\<^esub>, {\x\<^isub>s\\<^bsub>\B\<^esub> | x\<^isub>p \ A \ (x\<^isub>p, x\<^isub>s) \ Partitions x})"} \end{center} @@ -1679,6 +1725,58 @@ A\"}, which means @{term "y @ z \ A\"}. The last step is to set @{text "A"} to @{term "lang r"} and thus complete the proof. \end{proof} + + \begin{rmk} + While our proof using tagging functions might seem like a rabbit pulled + out of a hat, the intuition behind can be rationalised taking the + view that the equivalence classes @{term "UNIV // \(lang r)"} stand for the + states of the minimal automaton for the language @{term "lang r"}. The theorem + amounts to showing that this minimal automaton has finitely many states. + However, by showing that our @{text "\<^raw:$\threesim$>\<^bsub>tag\<^esub>"} relation + refines @{term "\A"} we do not actually have to show that the minimal automata + has finitely many states, but only that we can show finiteness for an + automaton with at least as many states and which can recognise the same + language. By performing the induction over the regular expression @{text r}, + this means we have to construct inductively an automaton in + the cases for @{term PLUS}, @{term TIMES} and @{term STAR}. + + In the @{text PLUS}-case, we can assume finitely many equivalence classes of the form + @{text "\_\\<^bsub>\A\<^esub>"} and @{text "\_\\<^bsub>\B\<^esub>"}, each standing for an automaton recognising the + languages @{text A} and @{text B} respectively. + The @{text "+tag"} function constructs pairs of the form @{text "(\_\\<^bsub>\A\<^esub>, \_\\<^bsub>\B\<^esub>)"} + which can be seen as the states of an automaton recognising the language + \mbox{@{term "A \ B"}}. This is the usual product construction of automata: + Given a string @{text x}, the first automata will be in the state @{text "\x\\<^bsub>\A\<^esub>"} + after recognising @{text x} (similarly @{text "\x\\<^bsub>\B\<^esub>"} for the other automaton). The product + automaton will be in the state \mbox{@{text "(\x\\<^bsub>\A\<^esub>, \x\\<^bsub>\B\<^esub>)"}}, which is accepting + if at least one component is an accepting state. + + The @{text "TIMES"}-case is a bit more complicated---essentially we + need to sequentially compose the two automata with the states @{text "\_\\<^bsub>\A\<^esub>"}, + respectively @{text "\_\\<^bsub>\B\<^esub>"}. We achieve this sequential composition by + taking the first automaton @{text "\_\\<^bsub>\A\<^esub>"} and append on each of its accepting + state the automaton @{text "\_\\<^bsub>\B\<^esub>"}. Unfortunately, this does not lead directly + to a correct composition, since the automaton @{text "\_\\<^bsub>\A\<^esub>"} might have consumed + some of the input needed for the automaton @{text "\_\\<^bsub>\B\<^esub>"} to succeed. The + textbook construction for sequentially composing two automata circumvents this + problem by connecting the terminating states of the first automaton via + an epsilon-transition to the initial state of the second automaton (see for + example \citeN{Kozen97}). In the absence of any epsilon-transition analogue in our work, + we let the second automaton + start in all possible states that may have been reached if the first + automaton had not consumed the input meant for the second. We achieve this + by having a set of states as the second component generated by our + @{text "\tag"} function (see second clause in Fig.~\ref{tagfig}). + A state of this sequentially composed automaton is accepting, if the first + component is accepting and at least one state in the set is also accepting. + + The idea behind the @{text "STAR"}-case is similar to the @{text "TIMES"}-case. + We assume some automaton has consumed some strictly smaller part of the input; + we need to check that from the state we ended up in a terminal state in the + automaton @{text "\_\\<^bsub>\A\<^esub>"}. Since we do not know from which state this will + succeed, we need to run the automaton from all possible states we could have + ended up in. Therefore the @{text "\tag"} function generates a set of states. + \end{rmk} *} section {* Second Part proved using Partial Derivatives *} @@ -2211,12 +2309,12 @@ % \begin{equation}\label{supseqprops} \mbox{\begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l} - @{thm (lhs) SUPSEQ_simps(1)} & @{text "\"} & @{thm (rhs) SUPSEQ_simps(1)}\\ - @{thm (lhs) SUPSEQ_simps(2)} & @{text "\"} & @{thm (rhs) SUPSEQ_simps(2)}\\ - @{thm (lhs) SUPSEQ_atom} & @{text "\"} & @{thm (rhs) SUPSEQ_atom}\\ - @{thm (lhs) SUPSEQ_union} & @{text "\"} & @{thm (rhs) SUPSEQ_union}\\ - @{thm (lhs) SUPSEQ_conc} & @{text "\"} & @{thm (rhs) SUPSEQ_conc}\\ - @{thm (lhs) SUPSEQ_star} & @{text "\"} & @{thm (rhs) SUPSEQ_star} + @{thm (lhs) SUPSEQ_simps(1)} & @{text "="} & @{thm (rhs) SUPSEQ_simps(1)}\\ + @{thm (lhs) SUPSEQ_simps(2)} & @{text "="} & @{thm (rhs) SUPSEQ_simps(2)}\\ + @{thm (lhs) SUPSEQ_atom} & @{text "="} & @{thm (rhs) SUPSEQ_atom}\\ + @{thm (lhs) SUPSEQ_union} & @{text "="} & @{thm (rhs) SUPSEQ_union}\\ + @{thm (lhs) SUPSEQ_conc} & @{text "="} & @{thm (rhs) SUPSEQ_conc}\\ + @{thm (lhs) SUPSEQ_star} & @{text "="} & @{thm (rhs) SUPSEQ_star} \end{tabular}} \end{equation} @@ -2365,24 +2463,9 @@ exists a regular expression that matches all of its strings. Regular expressions can conveniently be defined as a datatype in theorem provers. For us it was therefore interesting to find out how far we can push - this point of view. But this question whether formal language theory can - be done without automata crops up also in non-theorem prover contexts. For - example, Gasarch asked in the Computational Complexity blog by \citeN{GasarchBlog} - whether the complementation of - regular-expression languages can be proved without using automata. He concluded - - \begin{quote} - {\it \ldots it can't be done} - \end{quote} - - \noindent - and even pondered - - \begin{quote} - {\it \ldots [b]ut is there a rigorous way to even state that?} - \end{quote} - - \noindent + this point of view. But this question whether regular language theory can + be done without automata crops up also in non-theorem prover contexts. Recall + Gasarch's speculation cited in the Introduction. We have established in Isabelle/HOL both directions of the Myhill-Nerode Theorem. % @@ -2398,7 +2481,7 @@ Pumping Lemma). We can also use it to establish the standard textbook results about closure properties of regular languages. The case of closure under complement follows easily from the Myhill-Nerode Theorem. - So our answer to Gasarch is ``{\it it can be done!''} + So our answer to Gasarch is ``{\it yes we can!''} %Our insistence on regular expressions for proving the Myhill-Nerode Theorem %arose from the problem of defining formally the regularity of a language. @@ -2465,15 +2548,15 @@ in the `pencil-and-paper-reasoning' literature about our way of proving the first direction of the Myhill-Nerode Theorem, but it appears to be folklore. - We presented two proofs for the second direction of the Myhill-Nerode - Theorem. One direct proof using tagging-functions and another using partial - derivatives. This part of our work is where our method using regular - expressions shines, because we can completely side-step the standard - argument (for example used by \citeN{Kozen97}) where automata need to be composed. However, it is - also the direction where we had to spend most of the `conceptual' time, as - our first proof based on tagging-functions is new for establishing the - Myhill-Nerode Theorem. All standard proofs of this direction proceed by - arguments over automata. + We presented two proofs for the second direction of the + Myhill-Nerode Theorem. One direct proof using tagging-functions and + another using partial derivatives. This part of our work is where + our method using regular expressions shines, because we can perform + an induction on the structure of refular expressions. However, it is + also the direction where we had to spend most of the `conceptual' + time, as our first proof based on tagging-functions is new for + establishing the Myhill-Nerode Theorem. All standard proofs of this + direction proceed by arguments over automata. The indirect proof for the second direction arose from our interest in Brzozowski's derivatives for regular expression matching. While \citeN{Brzozowski64} diff -r c028b07a4fa8 -r 99161cd17c0f Journal/document/root.tex --- a/Journal/document/root.tex Wed Jul 10 13:28:24 2013 +0100 +++ b/Journal/document/root.tex Wed Jul 10 17:47:30 2013 +0100 @@ -42,7 +42,7 @@ \newcommand{\bigplus}{\mbox{\Large\bf$+$}} -%\spnewtheorem{thrm}{Theorem}[section]{\bf}{\it} +\spnewtheorem*{rmk}{Remark}{\bf}{} %\spnewtheorem{lmm}{Lemma}[section]{\bf}{\it} %\spnewtheorem{dfntn}{Definition}[section]{\bf}{\it} %\spnewtheorem{prpstn}{Proposition}[section]{\bf}{\it} diff -r c028b07a4fa8 -r 99161cd17c0f journal.pdf Binary file journal.pdf has changed