# HG changeset patch # User Christian Urban # Date 1379413318 -3600 # Node ID 0da31edd95b929b44da386953623047ccae789e1 # Parent 288637d9dcdee727e8434393ed92bc6010c6a9f5 new version diff -r 288637d9dcde -r 0da31edd95b9 Journal/JAR-response2 --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/Journal/JAR-response2 Tue Sep 17 11:21:58 2013 +0100 @@ -0,0 +1,288 @@ + +> Reviewer #1: This paper demonstrates that a) the Myhill-Nerode theorem can be +> proved without relying on the notion of automata and b) argues +> shallowly thorough the paper that not having to formalize the notion +> of automata is easier. +> +> While I do agree that point a) is a an interesting theoretical +> contribution that deserves publication, I disagree with the second +> argument. + +We are very grateful to the reviewer deciding the paper is a worthy +contribution. We are especially grateful since this opinion is despite +disagreement with parts of the paper. Thank you very much! + +> It should be noted that the handling of the notion of automata has +> been difficult historically for two reasons. On the one hand, giving +> an identifier to every state of an automata indeed yields proofs that +> may be characterized as "clunky" (even if the wording is a bit +> strong), but is a necessary condition for efficient implementation of +> automata algorithms. This yields sizable formalizations as is +> demonstrated by the Lammich and Tuerk's work or Braibant and Pous's +> work [1]. On the other hand, it seems to be the case that the +> "principle of definition" in HOL makes it impossible to use parametric +> definitions of automata (but there is no problem with such a +> definition in Coq). + +We agree with this summary. We cited the work by Doczal et al and wrote +that their quantification over types is not available to us. + + Systems such as Coq permit quantification over types and thus can + state such a definition. This has been recently exploited in an elegant + and short formalisation of the Myhill-Nerode theorem in Coq by Doczkal + et al. (2013), but as said this is not available to us working in + Isabelle/HOL. + +> But, the startup cost of a formalization depends on the scope of what +> one intends to formalize. Perhaps Braibant concedes that your +> development is more concise that his, yet Braibant's work dealt with +> the formalization of an _efficient_ decision procedure for Kleene +> algebras that required _two_ formalizations of automata (the matrices +> are only used in the proof of correctness of this decision procedure, +> while the computational part uses efficient representations for DAGs +> -- see for instance [1] p. 21 or Braibant's thesis p. 49). I am only +> adding emphasis here to demonstrate that this does not compare easily +> with the formalization of the proof of Myhill-Nerode by itself. + +We have attempted to not touch upon efficiency and algorithmic issues, since +they are an distraction for our topic at hand, that is reasoning about inductive +structures, of which regular expressions are one instance. If you use more +efficient datastructures, then it is expected that reasoning will be +more difficult. The point of the paper is to prove the Myhill-Nerode theorem +with only inductive structures, which are well-supported in all interactive +theorem provers we are aware off. + +> (At this point, let me comment on your answer: "As far as we know +> Braibant also uses ssreflect". It is not the case. One may wonder to +> what extent the use of ssreflect there could have alleviated the +> "intrinsic difficulties of working with rectangular matrices".) + +Noted. We will not make this claim and have not done so in the paper. + +> In the same fashion, Lammich and Tuerk's work uses efficient +> data-structures (.e.g, for collections, automatas), and some amount of +> data-refinement to link various representations together. + +Please note that the 14 kloc we cited for the Lammich/Tuerk work is +the automata library, not the formalisation built on top of it. +Also Lammich and Tuerk have told us that reasoning about graphs +with state names is undesirable. If they had a better representation +available (which they had _not_, since they formalise Hopcroft's +algorithm), they would have used this one and establish that it is +"equivalent" to the more efficient version with efficient data- +structures. In their case there is no alternative formalising a library +about graphs etc. We have shown that in case of the Myhill-Nerode theorem +there is an alternative. We have been very careful to make claims about +being the first ones who have done so. However, we believe we are, +but certainly our proofs are not present in the literature (this +might have something to do with the fact that the Myhill-Nerode +theorem requires the reversal of Brzozowski and Arden...see +comment further below). + +> My most stringent remark about the paper is that it should make clear +> from the beginning whether or not the algorithms are computational, +> and if they are executable, how efficient. Then, comparing with the +> two aforementioned work will make sense. + +As stated earlier, we do not like to make computational issues +explicit as they are a distraction in our opinion. The logic behind +Isabelle/HOL and all HOL-based theorem provers is not a computational logic, +in the sense that it is not about computable functions (unlike Coq). There +are "uncomputable" features in HOL such as the axiom of choice (which we +make use of in our formalisation). We still use the terminology "algorithm" +in the widely-used sense of algorithms that contain "do-not-care" +nondetermism. For example, unification algorithms are usually explained and +proved correct as transition system over sets of equations. They are not +executable in the Turing-machine sense, since they leave implicit +the order in which the equations should be analysed and which +rules should "fire" if there are more than one applicable. They +are also "imperfect" as the notion of (potentially infinite) sets +cannot be implemented, in general, as a computable function, but +they can be reasoned about in Isabelle/HOL and other theorem provers. + +Still we hope that the reviewer agrees with us and with the many uses +in the literature (unification is only one example which commits this +"sin"), it is nevertheless perfectly sensible to call such "things" +algorithms. Although they cannot be directly implemented, they can be +implemented once choices are made explicit and datastructures +fixed. Similarly, our formalisation uses a do-not-care approach +for building a regular expression out of a set of regular expressions +(see equation (2)). It also leaves the order of equations that should be +analysed under-specified. But if all these choices are made explicit, we +have an algorithm that is implementable in code. But as the explanation +already indicates, it is quite a digression from our main topic of +reasoning about inductive structures. + +If we further contemplate the efficiency of the algorithm we +(under)specified, then our best guess is that it is exponential in the +worst case, just like the "naive" unification algorithm is exponential +if sharing is not treated carefully. + +> In fact, I have been made aware of the following work [2] that do +> formalize automata and regular language theory without using explicit +> identifiers for state numbers in automata. The upside is that it +> yields much smoother proofs than previous works in Coq. The downside +> is that this formalization strategy does not yield efficient algorithm +> for automata constructions. I reckon that this is a technical report +> that should be taken with a grain of salt in the context of this +> review, but I think that it gives evidence that it is not always the +> case that the startup cost is higher with automata rather than with +> regular expressions: they formalize the Myhill-Nerode theorem, +> Kleene's theorem, the relationships between various representations of +> regular languages in around 1500 lines of ssreflect/Coq. (Again, let +> me emphasize that this is not a computational representation of +> automata, and thus, it does not easily compare in terms of size with +> Lammich and Tuerk's work or Braibant and Pous's work.) I think this +> work would make for a much more relevant comparison with your work. + +We cited this work and commented on it (see above). + +> I agree that comparing with related work is hard, and that the authors +> have already been careful in their comparison with the formalization +> sizes from other works. Yet, I am afraid an uninformed outsider would +> miss the distinction between related work that are executable and +> efficient and this formalization. + +In order to get meaningful work done, we have to depend on readers +carefully reading the paper and obtaining required background material. +I am afraid, there is no other way around. + +> - I am still confused about the explanation about the way the +> equational system is set up (your comment in the conclusion does not +> help me to pinpoint why you have to do that): I understand that +> Brzozowski annotates the final states with lambdas, and propagates +> them to the unique initial state. On the other hand you start with a +> set of equivalence classes, and you want to compute a regular +> expression for all classes that are in finals. In other words, I still +> fail to see why the following system does not suit your needs +> +> X1 = a,X2 + b,X4 +> X2 = b,X3 + a,X4 + lambda(ONE) +> X3 = a,X4 + b,X4 + lambda(ONE) +> X4 = a,X4 + b,X4 +> +> with L(r,X) = L(r) \cdot X +> +> and then propagate the equation to compute to compute a value for +> X1. (And, you could draw the automata underlying the transitions at +> the top of p. 7 to help the reader understand what is going on, and +> the relationship with the current equational system.) +> +> Maybe it is simply that you could have done it, but that it would have +> been less pleasant to work with in the context of a Myhill-Nerode +> relation? + +We used in the paper the much stronger formulation of being +_forced_ to set up the equational system in our way. And we even +now have made the earlier comment about this issue, which was +placed in a footnote, to be part of the main text (see paragraph after +equation (6)). + +While your equational system and computations generate a regular +expression, it does not take into account the assumption from +the first direction of the Myhill-Nerode theorem. Recall that the +assumption is that there are finitely many equivalence classes +by the Myhill-Nerode relation. Therefore we have to make our equations +to be in agreement with these equivalence classes, which the properties +in equation (7) and (8) make explicit. We further have to maintain +this agreement during the solution of the equational system. + +Your equational system is _not_ in agreement with the equivalence +classes: consider that + + X1 = {[]} + X2 = {[a]} + X3 = {[a, b]} + X4 = UNIV − {[], [a], [a, b]} + +hold, which is not true for the equations you have set up (under the +natural interpretation you have given). Another point is that it is always the +case that only one equivalence class in the Myhill-Nerode theorem contains +the empty string. Therefore it cannot be the case that there are more than +one (initial) equation containing the marker lambda(ONE), which translates +into the empty string under the natural interpretation. So we are +of the opinion that for the Myhill-Nerode theorem you are really forced +to "reverse" the standard approaches. Please let us know, if this +clarifies this issue for you. + +> - I really appreciate the rephrasing of section 4, especially the +> final remark. I think it clearly helps the reader to understand the +> proof arguments, and the relationship with automata. + +Thank you. + +> - I concur with the second reviewer to say that the executability of +> the first half of the M-N theorem should be discussed in the +> conclusion. Moreover, I am puzzled by your comments in the response to +> reviewers: to what extent does your formalization need to be +> _modified_ to make it executable? Do you need to change the code, or +> to add more things? (More precisely, you discuss the executability of +> the second half of the M-N theorem -- that constructs a partition of +> the set of strings -- in the conclusion, in the context of Haftmann +> executable sets; + +The first half is about the Myhill-Nerode relation partitioning the sets of +all strings into equivalence classes. Our remark applies to this "half". + +> you should do the same for the first-half---that +> constructs a regular expression). I am not sure I am happy with the +> things you wrote about executable sets, because it is not clear from +> my point of view if you need to modify the code, or add things. + +See comment above. We really not want to open up this can of worms. + +> - More generally, it should be made clear that when you say algorithm, +> the reader should not understand it as something that is computable as +> it is. In particular, I think that the following sentence is +> misleading: "our algorithm for solving equational systems actually +> calculates a regular expression for the complement language." You +> actually define such an expression, but it seems not possible, as it +> is, to open the box to see what is inside. + +It is possible as long as you make the "do-not-care" non-determinism +explicit and fix appropriate datastructures for computations. + +> - You answered: +> > In our opinion, Coquand and Siles' development strongly justifies our +> > comment about formalisation of Brzozowski's argument being painful. +> > If you like to have this reference added in the paper, we are happy +> > to do so. +> +> I think that it is worth adding a comment like this one either in +> section 5 (at the beginning of p. 20) or in the conclusion. + +We added to page 20: + + Therefore Coquand and Siles (2011) who follow through this idea + had to spend extra effort and first formalise the quite intricate + notion of inductively finite sets in order to formalise this property. + +> - The authors provide a detailed figure for the number of line of code +> of Almeida et al's implementation of Mirkin construction, but there +> is no such figure for some of other works that are cited. Maybe this +> number could be dropped, or similar figures could be given for other +> works, in order to give a more accurate picture. + +We are not 100% sure about this comment. We have not selected the +explicit numbers for the Lammich-Tuerk and Almeida et al works because of +"name-and-shaming" or because of a "beauty contest". Rather to illustrate +a trend. We asked about the numbers for the Lammich-Tuerk work. They +also confirmed with us that their automata-with-names approach is painful +to work with. Unfortunately they have not written this in their paper +such that we could have cited it. Filliatre makes explicitly such a comment +in his paper, which we have cited. Braibant equally wrote that he reckons +that our formalisation is more concise. There is no way we can give a number +in terms of lines-of-code for the formalisation in Nuprl. We believe +the number for the Almeida et al work is accurate (and not even the +biggest formalisation from the ones listed, but one which is about +one specific topic). + +> - I wonder if you have direction for future works, that would satisfy +> your predicate "results from regular language theory, not results +> from automata theory". It would be nice to give a few examples if +> you have some in mind. + +We added the work by Sulzmann and Lu who work on regular expression +sub-matching using derivatives of regular expressions. We would like +to formalise this work. Thank you very much for suggesting future work. + diff -r 288637d9dcde -r 0da31edd95b9 Journal/Paper.thy --- a/Journal/Paper.thy Sat Sep 14 14:08:19 2013 +0100 +++ b/Journal/Paper.thy Tue Sep 17 11:21:58 2013 +0100 @@ -344,7 +344,7 @@ 2005.} Systems such as Coq permit quantification over types and thus can state such a definition. This has been recently exploited in an elegant and short formalisation of the Myhill-Nerode theorem in Coq by - \citeN{Smolka13}, but as said this is not avaible to us working in Isabelle/HOL. + \citeN{Smolka13}, but as said this is not available to us working in Isabelle/HOL. An alternative, which provides us with a single type for states of automata, is to give every state node an identity, for example a natural number, and @@ -1741,8 +1741,8 @@ \end{proof} \begin{rmk} - While our proof using tagging functions might seem like a rabbit pulled - out of a hat, the intuition behind can be somewhat rationalised by taking the + While our proof using tagging functions might seem like a `rabbit pulled + out of a hat', the intuition behind can be somewhat rationalised by taking the view that the equivalence classes @{term "UNIV // \(lang r)"} stand for the states of the minimal automaton for the language @{term "lang r"}. The theorem amounts to showing that this minimal automaton has finitely many states. @@ -1968,7 +1968,7 @@ but the one modulo @{text "ACI"}. In principle, this can be done formally, but it is very painful in a theorem prover---since there is no direct characterisation of the set of dissimilar derivatives. Therefore - \citeN{CoquandSiles12} who follow through this idea had to `stand on their heads' and + \citeN{CoquandSiles12} who follow through this idea had to spend extra effort and first formalise the quite intricate notion of \emph{inductively finite sets} in order to formalise this property. @@ -2315,7 +2315,7 @@ \noindent Our proof follows the one given by \citeN[Pages 92--95]{Shallit08}, except that we use Higman's Lemma, which is already proved in the Isabelle/HOL library by - Sternagel. + \citeN{Sternagel13}. Higman's Lemma allows us to infer that every language @{text A} of antichains, satisfying % \begin{equation}\label{higman} diff -r 288637d9dcde -r 0da31edd95b9 Journal/document/root.bib --- a/Journal/document/root.bib Sat Sep 14 14:08:19 2013 +0100 +++ b/Journal/document/root.bib Tue Sep 17 11:21:58 2013 +0100 @@ -445,3 +445,10 @@ year = {2013} } +@Unpublished{Sternagel13, + author = {C.~Sternagel}, + title = {{C}ertified {K}ruskal’s {T}ree {T}heorem}, + note = {Accepted for publication in {\it Proc.~of the 3rd International Conference on + Certified Programs and Proofs}}, + year = {2013} +} \ No newline at end of file diff -r 288637d9dcde -r 0da31edd95b9 journal.pdf Binary file journal.pdf has changed