--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/Journal/JAR-response Wed Jul 10 17:47:30 2013 +0100
@@ -0,0 +1,474 @@
+We thank the reviewers for their time and insightful comments.
+We have incorporated most comments and if not, then answered them
+below. We also spell-checked the paper. Thanks again for all the
+comments.
+
+Reviewer #1
+===========
+
+> Indeed, one of the premise of the paper is that representing automata
+> in a theorem prover requires to fight against the theorem prover or
+> requires painful amount of details. Therefore, the authors define the
+> notion of regular languages as a language generated by a regular
+> expression. (This is not an unusual choice from a formalization point
+> of view, provided complemented by a proof of Kleene's theorem that
+> states such regular languages are exactly the one that are recognized
+> by finite automata.)
+
+We concur with the reviewer that our definition of regular languages
+as the ones generated by regular expressions is usual, but the paper is
+unusual in that we do not need to provide a version of Kleene's theorem.
+All results are obtained without formally needing a definition for automata.
+If we would need Kleene's theorem as part of our arguments, then we would
+need such a definition.
+
+To make our intention with the paper more clear, we moved the quote
+by Gasarch from the Conclusion to the Introduction. In this quote he asks
+whether the complementation of regular-expression languages can be proved
+without using automata. In this way we sharpen our motivation by being
+theorem prover related, but also of general interest. See quotations and
+changes right after Definition 2, p. 3.
+
+> Second, the premise of this article is that formalizing automata
+> theory require some fight with the theorem prover, and that several
+> other (longer) formalizations from the related work substantiate this
+> claim. I am under the impression that this comparison is misleading
+> since some of the related work deals not only with regular language
+> \emph{theory}, but also regular language \emph{algorithms}: e.g.,
+> automata constructions, conversion from NFA to DFA, minimization,
+> decision procedures for language equivalence. These algorithms may be
+> proved correct, complete, or terminating; and may be more or less
+> efficient. Then, these properties dictate some choices of
+> representation of the objects of the formalization (and hence have
+> consequences on its intricacies).
+
+Yes, we agree with the reviewer that *if* one is in the business of
+proving the correctness of, for example, Hopcroft's algorithm or the
+NFA-to-DFA conversion, which are based on "concrete" automata, then yes
+one clearly has to use the usual notion of the automata somewhere, even
+might have to start from it.
+
+However, we disagree with the interpretation that our comparison is
+misleading. Our claim that usual `paper' definitions do not translate
+smoothly into formalised definitions is not a claim just by us and
+independent of any numbers: We referenced in the paper to a concrete
+problem pointed out already in Slind's email cited on the bottom of
+p. 2. In the context of HOL4, he made the observation
+
+ | A student who is attempting to prove the pumping lemma for regular
+ | languages came across the following problem. Suppose a DFA is
+ | represented by a record <| init ; trans ; accept |> where
+ |
+ | init : 'state
+ | trans : 'state -> 'symbol -> 'state
+ | accept : 'state -> bool
+ |
+ | Execution of a DFA on a string:
+ |
+ | (Execute machine [] s = s)
+ | (Execute machine (h::t) s = Execute machine t (machine.trans s h))
+ |
+ | Language recognized by a machine D:
+ |
+ | Lang(D) = {x | D.accept (Execute D x D.init)}
+ |
+ | These are all accepted by HOL with no problem. In then trying to define
+ | the set of regular languages, namely those accepted by some DFA, the
+ | following is natural to try
+ |
+ | Regular (L:'symbol list -> bool) = ?D. L = Lang(D)
+ |
+ | But this fails, since D has two type variables ('state and 'symbol) but
+ | Regular is a predicate on 'symbol lists. So the principle of definition
+ | rejects.
+
+We also pointed to comments by Tobias Nipkow which clearly show that
+he had to fight the theorem prover. Many formalisations we cited for
+comparison contain statements about having to fight their respective
+theorem prover.
+
+The conclusion we want the reader to draw from our comparisons is
+that if one is just interested in formalising results from regular language
+theory, not results from automata theory, then regular expressions are
+easier to work with formally. If you look at the numbers (especially the
+ones for Isabelle/HOL), then we hope you agree with us that the startup
+costs are much higher with automata in comparison with regular
+expressions. We carefully commented on the examples we have cited in
+the Conclusion. Nowhere did we claim that they are one-to-one comparisons with
+our development. We did, for example, say that Braibant agreed with us that
+*if* he had done just Myhill-Nerode using his library, he estimates our
+development is more concise (see his comment in the Conclusion, p. 25).
+
+> In this article, the authors choose
+> to do the converse: avoid the notion of automata altogether, and see
+> how far it goes. I find this choice a bit peculiar.
+
+At first sight, our choice/approach might seem peculiar, since
+we are not primarily verifying an algorithm. However, we are not
+the only ones who pondered about this approach (we mentioned Gasarch).
+We showed that this is indeed a choice that can lead you quite far
+with regular language theory and lead to a relatively smooth
+formalisation. We did not make the claim that regular expressions are
+a substitute for automata in all situations. See our paragraph
+in the Conclusion starting with "While regular expressions are
+convenient, they have some limitations."
+
+> Third, I wonder to what extent some of the proofs presented in this
+> article are a rephrasing of standard proofs (that go through
+> automata), inlining the notion of automata, e.g., the construction
+> depicted in Section 3. Also, the notions of derivatives and partial
+> derivatives underly some automata constructions: for instance, from my
+> point of view, Section 5 is actually building a DFA (with transition
+> relation $\delta$ and initial state ${r}$)from a regular expression
+> without saying so; and it suffices to take as a tag function the
+> expression $\lambda x.\delta({r},x)$ (which is a right invariant
+> equivalence relation between strings, and of finite index since there
+> finitely many states in the automaton). The authors take care to note
+> the proximity of their arguments with existing ones when due (that
+> often rely on automata), but they seem very close to the existing one
+> in any case, with enough inlining. (Yet, I found the proof based on
+> tagging functions quite elegant: it avoids the explicit construction
+> of an automata--because automata are difficult to formalize in
+> HOL--while being close to the intuitive automata constructions, even
+> if they are not mentionned.)
+
+We agree "that underlying our arguments are automata proofs", but
+we argued without constructing automata explicitly. This is the very point
+and contribution of the paper. If we construct automata explicitly, the
+formalised proof of Myhill-Nerode theorem will become much larger as
+can be seen in existing work. By abandoning automata and relying only on
+regular expressions, we get a much simpler formalised proof of MN theorem.
+
+> To sum up, I think there is a significant value to come up with fresh
+> arguments that makes it easier to formalize existing pen-and-paper
+> proofs and textbooks. Yet, the arguments presented in this paper are
+> not really new (they look like inlinings, even if the authors chose
+> not to see it). And I am not convinced by the idea of side stepping an
+> elegant tool, the notion of automata, because it is hard to
+> formalize. I would rather improve on the underlying tools, or develop
+> new ways to formalize this notion.
+
+We disagree in that we should not take the opportunity of using
+regular expressions when they lead to simpler formalisations
+in regular language theory. We have already found that many structural
+inductions that only regular expressions afford are good learning
+tools for students in order to come to grips with formal reasoning.
+
+
+> - Lemma 21 already appears in Brzozowski (1964) who attributes it to
+> Arden. I think there is little value in distinguishing the two
+> versions of Arden's lemma, since their proofs and statement are parallel.
+
+We agree that Brzozowski (1964) gives the statement of Arden's lemma, but
+no proof, for which he refers to Arden (1960). Unfortunately, we do not have
+access to this book and guess that this applies to many contemporary readers.
+Our reference to Sakarovitch is motivated by the fact that he gives
+explicitly a proof (XXX CHECK).
+
+> - I am puzzled by the one-character transition relation between
+> equivalence classes (p. 6). The $X_i$ correspond to the states of
+> the minimal automaton that recognizes $L$ (and you even agree with
+> that in the Related Work section!). Therefore, I disagree with your
+> sentence stating that this construction does not build an
+> automaton.
+
+We deleted this comment since it also confused the second reviewer.
+What we wanted to express is that we have not defined a transition
+relation between states in the usual automata sense.
+
+> Then, it would follow that the initial system would be better
+> presented as a DFA (that is, associating to each equivalence class,
+> a mapping from atoms to equivalence classes with the atoms being
+> disjoint), even if you do not want to introduce the notion of
+> automaton. I fail to see the advantages of the current presentation
+> of the initial system. Then, I think that it would be easier to
+> understand your presentation if you used a more involved language
+> (like (ab + a)) as a running example. This would produce the
+> equivalence classes
+>
+> X1 = {[]}
+> X2 = {[a]}
+> X3 = {[ab]}
+> X4 = {univ - {[], [a], [ab]}}
+>
+> and the equational system
+>
+> X1 = LAMBDA(ONE)
+> X2 = (X1,a)
+> X3 = (X2,b)
+> X4 = (X1,b) + (X2,a) + (X3,a) + (X3,b)
+>
+
+We have modified the example as you suggest. But note that assuming
+we only have characters a and b in an alphabet, the equation for
+X4 must be
+
+ X4 = (X1,b) + (X2,a) + (X3,a) + (X3,b) + (X4,a) + (X4,b)
+
+
+> (It could be worth saying that it is a left-linear context free
+> grammar in any case). Having built this example, I wonder why you
+> chose to set up the equationnal system the way you did. In the RW
+> section, you argue that the differences with Brzozowski's are subtle
+> and impact the presentation of , but I fail to see what would go
+> wrong if you chose this presentation too.
+
+The difference is from where we start: In Brzozowski's algebraic
+method, the start is an automaton from which this method calculates
+a regular expression. Therefore Brzozowski annotates the final
+states with lambdas and `propagates' them to the single initial
+state. From there he can read off a (single) regular expression recognising
+the same language as the automaton. We, on the other hand, start with a
+set of equivalence classes, for all of which we want to calculate a
+regular expression. In this way we have to annotate the lambda to the
+initial state (containing only []) and `propagate' it to all other
+equivalence classes. Once there, we can read off the regular expressions
+for each equivalence class. There seems to be no convenient way to set
+up the equational system the original Brzozowski-way. However, it does
+mean that we have to use the reversed version of Arden's lemma in order to
+solve our equational systems.
+
+> - p. 11 "There exists a regular expression" this is not clear, unless
+> one has a look at your formal proof: it should be made clear that
+> the definition "regular A" is defined as exists r, A = lang r.
+
+We clarified this sentence according to your suggestion.
+
+> - Top of p. 12: it is stated that this algorithm provides a method to
+> compute a regexp for the complement of a language and that this
+> construction is similar to the construction of the complement
+> automaton. I agree with this statement, because I am under the
+> impression (maybe wrongly) that the construction in this section
+> boils down to "automata with states indexed by the equivalences
+> classes" to "regular expression". What I am uncertain of is, under
+> what conditions I could use your formalization to compute this
+> regular expression. (Note that I am not speaking about complexity
+> issues here.)
+
+We are not 100% sure whether we are correctly interpreting your
+question: If you are after an informal "black board" calculation method
+of this regular expression in front of students, then you can use our
+algorithm. We have done this already. If you are asking whether
+our formalisation is "constructive" enough in order to directly
+extract an algorithm, then no we cannot do this. We use a theorem
+prover based on classical logic. We, for example, did not make
+explicit how a single regular expression can be constructed from
+a set of regular expressions, still recognising the same language.
+However, we do not like to open the "can-of-worms" about constructive
+and unconstructive arguments and how to extract explicit algorithms.
+In theory we can, in practice we cannot. Unfortunately, we do not have
+a better answer for this at the moment and think this is besides
+the main point of the paper.
+
+> - It seems that the purpose of the tagging functions here is to build
+> right invariant equivalence relations that refines the Myhill-Nerode
+> relation (Lemma 43). Maybe you should start by that. Also, any such
+> right invariant relation naturally gives rise to an automaton, and
+> you build such later on. Maybe you could phrase it out at the very
+> beginning of this section.
+
+We added a long remark at the end of the section that spells
+out the automata inspiration behind the tagging functions (see p. 17).
+
+> - p. 13 In the definition of the tagging function for the plus case,
+> isn't there a typo (replacing the \cdot by a comma) ?
+
+No. Our formalisation uses this definition.
+
+> - p. 14 There is an "x" missing on the left-hand side of the definition of
+> x tag A B.
+
+Yes. Corrected.
+
+> - The structure of section 4 is a bit difficult to follow, because the
+> proof of Theorem 31 is interleaved with the statement of many
+> intermediate lemmas. Could you separate the definition of the
+> tagging functions from the proof?
+
+We have added a figure at the beginning of the proof and included
+appropriate references.
+
+> I think it would also be
+> interesting to point out that the type of the codomain the tag
+> functions are not the same through the section. Again, I am under
+> the impression that the tagging functions defined in this section
+> define states of an automaton that is not built explicitly, with the
+> result of the tag function being a particular state in a suitably
+> built automaton.
+
+We added a relatively lengthy remark after the proof about
+how the tagging functions can be seen as constructing particular
+automata - see p. 17.
+
+
+> Yet, it could be worth pointing out that this use
+> of polymorphism for the tag functions does not have the same
+> problems as the one encountered when one try to formalize
+> automata. This is in a sense contradiction with your statement
+> p. 24 that you sidestep the composition of automata (this automaton
+> is not explicitly built).
+
+We have modified the statement on p. 24.
+
+> - p. 18: you mention the fact that it is painful to formalize
+> Brzozowski's argument stating that there are only finitely many
+> dissimilar derivatives. Is this because it requires to build part of
+> the equationnal theory of regular expressions that you avoided so
+> far? Could you comment on what Coquand and Siles did in the paper
+> you cite?
+
+Coquand and Siles first spend 3 pages of their 16-pages paper
+developing a theory of inductively finite sets in type theory.
+This theory uses "discrete sets" where the equality needs to
+be decidable. They then view the ACI-properties as rewrite rules
+leading to a decidable equality over regular expressions. This
+shows (although they left out the details in their paper) that the
+set of derivatives is inductively finite.
+
+In our opinion, Coquand and Siles' development strongly justifies our
+comment about formalisation of Brzozowski's argument being painful.
+If you like to add this as a reference in the paper, we are happy
+to do so.
+
+> - Section 6, p20, previous to last paragraph: you say that the
+> construction of the complement regular expression involves several
+> transformations: you could shave off the construction of the
+> non-deterministic automaton, since it can be avoided using
+> Antimirov's construction of a DFA (as you do).
+
+We feel that our sequence adequately represents procedure
+that is used by informed outsiders and that is usually
+taught to students. Antimirov's construction using partial
+derivatives is unfortunately not widely known, which we hope
+we can help to change with this paper.
+
+> - (27) are properties, not definitions, isn't it?
+
+Corrected.
+
+> - A question. On the one hand, Isabelle/HOL features a classical
+> logic, with the law of the excluded middle and the Hilbert's epsilon
+> operator. On the other hand, in Coq, it is often involved to prove
+> decidability results, which may make it more difficult to formalize
+> some of the arguments you are using. Is it possible in Isabelle/HOL
+> to track what are the uses of classical logic that you use?
+
+In theory there is a way to track all classical uses in Isabelle,
+but this can be practically unfeasible, since proof terms can be
+gigantic (therefore the Isabelle/HOL follows the LCF approach
+where only theorems need to be recorded for correctness). Having
+said that there is one obvious place in our proof where we used
+a "classical" shortcut by using + for generating a single regular
+expression from a finite set of expressions. This would need better
+"explanation" in a constructive proof. There are no other inherent
+classical shortcuts in the proof. But as said before opening we
+think opening this can-of-worms is beside the focus of the paper.
+
+
+> In particular, I am worried about the last sentence on page 24: how do
+> you make these sets computational?
+
+We wrote:
+
+ However, since partial derivatives use sets of regular expressions, one
+ needs to carefully analyse whether the resulting algorithm is still
+ executable. Given the infrastructure for executable sets introduced by
+ Haftmann (2009) in Isabelle/HOL, it should.
+
+Does this allay your fears?
+
+> - I think a word of caution should be added to the comparison with
+> related works in terms of formalization size: there is a huge gap in
+> the scope of these formalizations efforts, especially when one
+> defines algorithms. For instance, an usual corollary of the
+> Myhill-Nerode theorem is an algorithm that minimize automata: could
+> you provide, for instance, an algorithm that compute the number of
+> equivalences classes of your Myhill-Nerode relation?
+
+We have attempted to be very careful with our comments about the
+formalisation sizes from other works. Our intention is only to draw
+attention to the somewhat surprising point, namely that formalisation
+using automata (either out of choice or being forced) led to
+formalisations where the people in question had to fight their
+theorem prover. On the other hand, our formalisation could rely
+on the fact that regular expressions are inductive datatypes which
+are generally well-supported by theorem prover reasoning. We hope
+to have "balanced the picture" by saying that regular expressions
+are not good for everything (see paragraph in the Conclusion
+"While regular expressions are convenient, they have some
+limitations...").
+
+We have not thought about an algorithm to compute the number of
+equivalence classes.
+
+> - There has been several work that uses matrices in Coq, in the
+> context of the ssreflect dialect. For instance, you may have a look
+> at Dénès M, Mörtberg A, Siles V. 2012. A refinement-based approach
+> to computational algebra in COQ. ITP 2012. It does not seem that
+> there is a particular problem with the use of matrices.
+
+We relied completely on the comment by (expert formaliser in Coq) Braibant
+from his thesis which explicitly states
+
+ | Note that while they [Wu, Zhang, Urban] justify their work by the fact
+ | that formalising automata ``can be a real hassle in HOL-based provers'',
+ | and that neither Isabelle/HOL nor HOL4 nor HOL-light support matrices and
+ | graphs with libraries, we did not encounter such difficulties in Coq
+ | (omitting the **intrinsic difficulty** of working with **rectangular matrices**
+ | at some points).
+
+The emphasis added by us. As far as we know Braibant also uses ssreflect.
+
+
+
+Reviewer #2
+===========
+
+> This is a very nice paper, at least in terms of its content, and it
+> should certainly be published in some shape or form. It's based on an
+> ITP conference paper from 2011, but seems to me to include sufficient
+> new content to justify republication in a journal. To be specific,
+> the new material is: the second of the two proofs showing the
+> implication that if a language is regular it has a finite number of
+> Myhill-Nerode equivalence classes (using partial derivatives), and all
+> of the closure properties. Both new contributions are very appealing.
+
+We agree with your observations.
+
+> Really, I have only a slew of minor criticisms about the paper, and
+> these almost all have to do with minor presentation issues. I started
+> to list some of these in the "Minor things" section below, but I
+> eventually decided that I should simply say that 1. the English is
+> sometimes less than perfect; 2. a spell-checker should be used; 3. the
+> typography is occasionally dubious; and that 4. it all needs to be
+> re-read with more care.
+
+We have improved the paper. Please let us know if we
+overlooked any more errors.
+
+> Finally, the conclusion and related work sections are reasonable. I
+> think the conclusion could usefully discuss the executability of the
+> algorithm in the first half of the M-N proof.
+
+Informally, the algorithm is clearly executable. But we have not
+attempted this with our formalisation.
+
+> Also, the conclusion discusses the Brzozowski's algebraic method in
+> this context, and argues that the equational constraints that appear
+> in this paper are not quite the same as a minimal automaton. It's
+> clear that, yes, they are a little different, but it still seems to
+> me that the initial state of the constraint set is isomorphic to the
+> automaton. Given that, I was very surprised to read (on p6) that
+> "Note that we do not define an automaton here".
+
+Yes, we agree, behind our equational system lurks a minimal automaton. What
+we wanted to clear up is that we have not defined a transition relation
+between states in the usual automata sense. Despite best intentions, we
+have deleted this remark, as it clearly misled the reviewers.
+
+> Appendix: this seems unnecessary. If it's interesting enough to be in
+> the main text, it should be there. Otherwise refer the reader to the
+> online formalisation. In this case, the appendix is so small that it
+> seems particularly silly to have one.
+
+We deleted the appendix.
\ No newline at end of file