|
1 We thank the reviewers for their time and insightful comments. |
|
2 We have incorporated most comments and if not, then answered them |
|
3 below. We also spell-checked the paper. Thanks again for all the |
|
4 comments. |
|
5 |
|
6 Reviewer #1 |
|
7 =========== |
|
8 |
|
9 > Indeed, one of the premise of the paper is that representing automata |
|
10 > in a theorem prover requires to fight against the theorem prover or |
|
11 > requires painful amount of details. Therefore, the authors define the |
|
12 > notion of regular languages as a language generated by a regular |
|
13 > expression. (This is not an unusual choice from a formalization point |
|
14 > of view, provided complemented by a proof of Kleene's theorem that |
|
15 > states such regular languages are exactly the one that are recognized |
|
16 > by finite automata.) |
|
17 |
|
18 We concur with the reviewer that our definition of regular languages |
|
19 as the ones generated by regular expressions is usual, but the paper is |
|
20 unusual in that we do not need to provide a version of Kleene's theorem. |
|
21 All results are obtained without formally needing a definition for automata. |
|
22 If we would need Kleene's theorem as part of our arguments, then we would |
|
23 need such a definition. |
|
24 |
|
25 To make our intention with the paper more clear, we moved the quote |
|
26 by Gasarch from the Conclusion to the Introduction. In this quote he asks |
|
27 whether the complementation of regular-expression languages can be proved |
|
28 without using automata. In this way we sharpen our motivation by being |
|
29 theorem prover related, but also of general interest. See quotations and |
|
30 changes right after Definition 2, p. 3. |
|
31 |
|
32 > Second, the premise of this article is that formalizing automata |
|
33 > theory require some fight with the theorem prover, and that several |
|
34 > other (longer) formalizations from the related work substantiate this |
|
35 > claim. I am under the impression that this comparison is misleading |
|
36 > since some of the related work deals not only with regular language |
|
37 > \emph{theory}, but also regular language \emph{algorithms}: e.g., |
|
38 > automata constructions, conversion from NFA to DFA, minimization, |
|
39 > decision procedures for language equivalence. These algorithms may be |
|
40 > proved correct, complete, or terminating; and may be more or less |
|
41 > efficient. Then, these properties dictate some choices of |
|
42 > representation of the objects of the formalization (and hence have |
|
43 > consequences on its intricacies). |
|
44 |
|
45 Yes, we agree with the reviewer that *if* one is in the business of |
|
46 proving the correctness of, for example, Hopcroft's algorithm or the |
|
47 NFA-to-DFA conversion, which are based on "concrete" automata, then yes |
|
48 one clearly has to use the usual notion of the automata somewhere, even |
|
49 might have to start from it. |
|
50 |
|
51 However, we disagree with the interpretation that our comparison is |
|
52 misleading. Our claim that usual `paper' definitions do not translate |
|
53 smoothly into formalised definitions is not a claim just by us and |
|
54 independent of any numbers: We referenced in the paper to a concrete |
|
55 problem pointed out already in Slind's email cited on the bottom of |
|
56 p. 2. In the context of HOL4, he made the observation |
|
57 |
|
58 | A student who is attempting to prove the pumping lemma for regular |
|
59 | languages came across the following problem. Suppose a DFA is |
|
60 | represented by a record <| init ; trans ; accept |> where |
|
61 | |
|
62 | init : 'state |
|
63 | trans : 'state -> 'symbol -> 'state |
|
64 | accept : 'state -> bool |
|
65 | |
|
66 | Execution of a DFA on a string: |
|
67 | |
|
68 | (Execute machine [] s = s) |
|
69 | (Execute machine (h::t) s = Execute machine t (machine.trans s h)) |
|
70 | |
|
71 | Language recognized by a machine D: |
|
72 | |
|
73 | Lang(D) = {x | D.accept (Execute D x D.init)} |
|
74 | |
|
75 | These are all accepted by HOL with no problem. In then trying to define |
|
76 | the set of regular languages, namely those accepted by some DFA, the |
|
77 | following is natural to try |
|
78 | |
|
79 | Regular (L:'symbol list -> bool) = ?D. L = Lang(D) |
|
80 | |
|
81 | But this fails, since D has two type variables ('state and 'symbol) but |
|
82 | Regular is a predicate on 'symbol lists. So the principle of definition |
|
83 | rejects. |
|
84 |
|
85 We also pointed to comments by Tobias Nipkow which clearly show that |
|
86 he had to fight the theorem prover. Many formalisations we cited for |
|
87 comparison contain statements about having to fight their respective |
|
88 theorem prover. |
|
89 |
|
90 The conclusion we want the reader to draw from our comparisons is |
|
91 that if one is just interested in formalising results from regular language |
|
92 theory, not results from automata theory, then regular expressions are |
|
93 easier to work with formally. If you look at the numbers (especially the |
|
94 ones for Isabelle/HOL), then we hope you agree with us that the startup |
|
95 costs are much higher with automata in comparison with regular |
|
96 expressions. We carefully commented on the examples we have cited in |
|
97 the Conclusion. Nowhere did we claim that they are one-to-one comparisons with |
|
98 our development. We did, for example, say that Braibant agreed with us that |
|
99 *if* he had done just Myhill-Nerode using his library, he estimates our |
|
100 development is more concise (see his comment in the Conclusion, p. 25). |
|
101 |
|
102 > In this article, the authors choose |
|
103 > to do the converse: avoid the notion of automata altogether, and see |
|
104 > how far it goes. I find this choice a bit peculiar. |
|
105 |
|
106 At first sight, our choice/approach might seem peculiar, since |
|
107 we are not primarily verifying an algorithm. However, we are not |
|
108 the only ones who pondered about this approach (we mentioned Gasarch). |
|
109 We showed that this is indeed a choice that can lead you quite far |
|
110 with regular language theory and lead to a relatively smooth |
|
111 formalisation. We did not make the claim that regular expressions are |
|
112 a substitute for automata in all situations. See our paragraph |
|
113 in the Conclusion starting with "While regular expressions are |
|
114 convenient, they have some limitations." |
|
115 |
|
116 > Third, I wonder to what extent some of the proofs presented in this |
|
117 > article are a rephrasing of standard proofs (that go through |
|
118 > automata), inlining the notion of automata, e.g., the construction |
|
119 > depicted in Section 3. Also, the notions of derivatives and partial |
|
120 > derivatives underly some automata constructions: for instance, from my |
|
121 > point of view, Section 5 is actually building a DFA (with transition |
|
122 > relation $\delta$ and initial state ${r}$)from a regular expression |
|
123 > without saying so; and it suffices to take as a tag function the |
|
124 > expression $\lambda x.\delta({r},x)$ (which is a right invariant |
|
125 > equivalence relation between strings, and of finite index since there |
|
126 > finitely many states in the automaton). The authors take care to note |
|
127 > the proximity of their arguments with existing ones when due (that |
|
128 > often rely on automata), but they seem very close to the existing one |
|
129 > in any case, with enough inlining. (Yet, I found the proof based on |
|
130 > tagging functions quite elegant: it avoids the explicit construction |
|
131 > of an automata--because automata are difficult to formalize in |
|
132 > HOL--while being close to the intuitive automata constructions, even |
|
133 > if they are not mentionned.) |
|
134 |
|
135 We agree "that underlying our arguments are automata proofs", but |
|
136 we argued without constructing automata explicitly. This is the very point |
|
137 and contribution of the paper. If we construct automata explicitly, the |
|
138 formalised proof of Myhill-Nerode theorem will become much larger as |
|
139 can be seen in existing work. By abandoning automata and relying only on |
|
140 regular expressions, we get a much simpler formalised proof of MN theorem. |
|
141 |
|
142 > To sum up, I think there is a significant value to come up with fresh |
|
143 > arguments that makes it easier to formalize existing pen-and-paper |
|
144 > proofs and textbooks. Yet, the arguments presented in this paper are |
|
145 > not really new (they look like inlinings, even if the authors chose |
|
146 > not to see it). And I am not convinced by the idea of side stepping an |
|
147 > elegant tool, the notion of automata, because it is hard to |
|
148 > formalize. I would rather improve on the underlying tools, or develop |
|
149 > new ways to formalize this notion. |
|
150 |
|
151 We disagree in that we should not take the opportunity of using |
|
152 regular expressions when they lead to simpler formalisations |
|
153 in regular language theory. We have already found that many structural |
|
154 inductions that only regular expressions afford are good learning |
|
155 tools for students in order to come to grips with formal reasoning. |
|
156 |
|
157 |
|
158 > - Lemma 21 already appears in Brzozowski (1964) who attributes it to |
|
159 > Arden. I think there is little value in distinguishing the two |
|
160 > versions of Arden's lemma, since their proofs and statement are parallel. |
|
161 |
|
162 We agree that Brzozowski (1964) gives the statement of Arden's lemma, but |
|
163 no proof, for which he refers to Arden (1960). Unfortunately, we do not have |
|
164 access to this book and guess that this applies to many contemporary readers. |
|
165 Our reference to Sakarovitch is motivated by the fact that he gives |
|
166 explicitly a proof (XXX CHECK). |
|
167 |
|
168 > - I am puzzled by the one-character transition relation between |
|
169 > equivalence classes (p. 6). The $X_i$ correspond to the states of |
|
170 > the minimal automaton that recognizes $L$ (and you even agree with |
|
171 > that in the Related Work section!). Therefore, I disagree with your |
|
172 > sentence stating that this construction does not build an |
|
173 > automaton. |
|
174 |
|
175 We deleted this comment since it also confused the second reviewer. |
|
176 What we wanted to express is that we have not defined a transition |
|
177 relation between states in the usual automata sense. |
|
178 |
|
179 > Then, it would follow that the initial system would be better |
|
180 > presented as a DFA (that is, associating to each equivalence class, |
|
181 > a mapping from atoms to equivalence classes with the atoms being |
|
182 > disjoint), even if you do not want to introduce the notion of |
|
183 > automaton. I fail to see the advantages of the current presentation |
|
184 > of the initial system. Then, I think that it would be easier to |
|
185 > understand your presentation if you used a more involved language |
|
186 > (like (ab + a)) as a running example. This would produce the |
|
187 > equivalence classes |
|
188 > |
|
189 > X1 = {[]} |
|
190 > X2 = {[a]} |
|
191 > X3 = {[ab]} |
|
192 > X4 = {univ - {[], [a], [ab]}} |
|
193 > |
|
194 > and the equational system |
|
195 > |
|
196 > X1 = LAMBDA(ONE) |
|
197 > X2 = (X1,a) |
|
198 > X3 = (X2,b) |
|
199 > X4 = (X1,b) + (X2,a) + (X3,a) + (X3,b) |
|
200 > |
|
201 |
|
202 We have modified the example as you suggest. But note that assuming |
|
203 we only have characters a and b in an alphabet, the equation for |
|
204 X4 must be |
|
205 |
|
206 X4 = (X1,b) + (X2,a) + (X3,a) + (X3,b) + (X4,a) + (X4,b) |
|
207 |
|
208 |
|
209 > (It could be worth saying that it is a left-linear context free |
|
210 > grammar in any case). Having built this example, I wonder why you |
|
211 > chose to set up the equationnal system the way you did. In the RW |
|
212 > section, you argue that the differences with Brzozowski's are subtle |
|
213 > and impact the presentation of , but I fail to see what would go |
|
214 > wrong if you chose this presentation too. |
|
215 |
|
216 The difference is from where we start: In Brzozowski's algebraic |
|
217 method, the start is an automaton from which this method calculates |
|
218 a regular expression. Therefore Brzozowski annotates the final |
|
219 states with lambdas and `propagates' them to the single initial |
|
220 state. From there he can read off a (single) regular expression recognising |
|
221 the same language as the automaton. We, on the other hand, start with a |
|
222 set of equivalence classes, for all of which we want to calculate a |
|
223 regular expression. In this way we have to annotate the lambda to the |
|
224 initial state (containing only []) and `propagate' it to all other |
|
225 equivalence classes. Once there, we can read off the regular expressions |
|
226 for each equivalence class. There seems to be no convenient way to set |
|
227 up the equational system the original Brzozowski-way. However, it does |
|
228 mean that we have to use the reversed version of Arden's lemma in order to |
|
229 solve our equational systems. |
|
230 |
|
231 > - p. 11 "There exists a regular expression" this is not clear, unless |
|
232 > one has a look at your formal proof: it should be made clear that |
|
233 > the definition "regular A" is defined as exists r, A = lang r. |
|
234 |
|
235 We clarified this sentence according to your suggestion. |
|
236 |
|
237 > - Top of p. 12: it is stated that this algorithm provides a method to |
|
238 > compute a regexp for the complement of a language and that this |
|
239 > construction is similar to the construction of the complement |
|
240 > automaton. I agree with this statement, because I am under the |
|
241 > impression (maybe wrongly) that the construction in this section |
|
242 > boils down to "automata with states indexed by the equivalences |
|
243 > classes" to "regular expression". What I am uncertain of is, under |
|
244 > what conditions I could use your formalization to compute this |
|
245 > regular expression. (Note that I am not speaking about complexity |
|
246 > issues here.) |
|
247 |
|
248 We are not 100% sure whether we are correctly interpreting your |
|
249 question: If you are after an informal "black board" calculation method |
|
250 of this regular expression in front of students, then you can use our |
|
251 algorithm. We have done this already. If you are asking whether |
|
252 our formalisation is "constructive" enough in order to directly |
|
253 extract an algorithm, then no we cannot do this. We use a theorem |
|
254 prover based on classical logic. We, for example, did not make |
|
255 explicit how a single regular expression can be constructed from |
|
256 a set of regular expressions, still recognising the same language. |
|
257 However, we do not like to open the "can-of-worms" about constructive |
|
258 and unconstructive arguments and how to extract explicit algorithms. |
|
259 In theory we can, in practice we cannot. Unfortunately, we do not have |
|
260 a better answer for this at the moment and think this is besides |
|
261 the main point of the paper. |
|
262 |
|
263 > - It seems that the purpose of the tagging functions here is to build |
|
264 > right invariant equivalence relations that refines the Myhill-Nerode |
|
265 > relation (Lemma 43). Maybe you should start by that. Also, any such |
|
266 > right invariant relation naturally gives rise to an automaton, and |
|
267 > you build such later on. Maybe you could phrase it out at the very |
|
268 > beginning of this section. |
|
269 |
|
270 We added a long remark at the end of the section that spells |
|
271 out the automata inspiration behind the tagging functions (see p. 17). |
|
272 |
|
273 > - p. 13 In the definition of the tagging function for the plus case, |
|
274 > isn't there a typo (replacing the \cdot by a comma) ? |
|
275 |
|
276 No. Our formalisation uses this definition. |
|
277 |
|
278 > - p. 14 There is an "x" missing on the left-hand side of the definition of |
|
279 > x tag A B. |
|
280 |
|
281 Yes. Corrected. |
|
282 |
|
283 > - The structure of section 4 is a bit difficult to follow, because the |
|
284 > proof of Theorem 31 is interleaved with the statement of many |
|
285 > intermediate lemmas. Could you separate the definition of the |
|
286 > tagging functions from the proof? |
|
287 |
|
288 We have added a figure at the beginning of the proof and included |
|
289 appropriate references. |
|
290 |
|
291 > I think it would also be |
|
292 > interesting to point out that the type of the codomain the tag |
|
293 > functions are not the same through the section. Again, I am under |
|
294 > the impression that the tagging functions defined in this section |
|
295 > define states of an automaton that is not built explicitly, with the |
|
296 > result of the tag function being a particular state in a suitably |
|
297 > built automaton. |
|
298 |
|
299 We added a relatively lengthy remark after the proof about |
|
300 how the tagging functions can be seen as constructing particular |
|
301 automata - see p. 17. |
|
302 |
|
303 |
|
304 > Yet, it could be worth pointing out that this use |
|
305 > of polymorphism for the tag functions does not have the same |
|
306 > problems as the one encountered when one try to formalize |
|
307 > automata. This is in a sense contradiction with your statement |
|
308 > p. 24 that you sidestep the composition of automata (this automaton |
|
309 > is not explicitly built). |
|
310 |
|
311 We have modified the statement on p. 24. |
|
312 |
|
313 > - p. 18: you mention the fact that it is painful to formalize |
|
314 > Brzozowski's argument stating that there are only finitely many |
|
315 > dissimilar derivatives. Is this because it requires to build part of |
|
316 > the equationnal theory of regular expressions that you avoided so |
|
317 > far? Could you comment on what Coquand and Siles did in the paper |
|
318 > you cite? |
|
319 |
|
320 Coquand and Siles first spend 3 pages of their 16-pages paper |
|
321 developing a theory of inductively finite sets in type theory. |
|
322 This theory uses "discrete sets" where the equality needs to |
|
323 be decidable. They then view the ACI-properties as rewrite rules |
|
324 leading to a decidable equality over regular expressions. This |
|
325 shows (although they left out the details in their paper) that the |
|
326 set of derivatives is inductively finite. |
|
327 |
|
328 In our opinion, Coquand and Siles' development strongly justifies our |
|
329 comment about formalisation of Brzozowski's argument being painful. |
|
330 If you like to add this as a reference in the paper, we are happy |
|
331 to do so. |
|
332 |
|
333 > - Section 6, p20, previous to last paragraph: you say that the |
|
334 > construction of the complement regular expression involves several |
|
335 > transformations: you could shave off the construction of the |
|
336 > non-deterministic automaton, since it can be avoided using |
|
337 > Antimirov's construction of a DFA (as you do). |
|
338 |
|
339 We feel that our sequence adequately represents procedure |
|
340 that is used by informed outsiders and that is usually |
|
341 taught to students. Antimirov's construction using partial |
|
342 derivatives is unfortunately not widely known, which we hope |
|
343 we can help to change with this paper. |
|
344 |
|
345 > - (27) are properties, not definitions, isn't it? |
|
346 |
|
347 Corrected. |
|
348 |
|
349 > - A question. On the one hand, Isabelle/HOL features a classical |
|
350 > logic, with the law of the excluded middle and the Hilbert's epsilon |
|
351 > operator. On the other hand, in Coq, it is often involved to prove |
|
352 > decidability results, which may make it more difficult to formalize |
|
353 > some of the arguments you are using. Is it possible in Isabelle/HOL |
|
354 > to track what are the uses of classical logic that you use? |
|
355 |
|
356 In theory there is a way to track all classical uses in Isabelle, |
|
357 but this can be practically unfeasible, since proof terms can be |
|
358 gigantic (therefore the Isabelle/HOL follows the LCF approach |
|
359 where only theorems need to be recorded for correctness). Having |
|
360 said that there is one obvious place in our proof where we used |
|
361 a "classical" shortcut by using + for generating a single regular |
|
362 expression from a finite set of expressions. This would need better |
|
363 "explanation" in a constructive proof. There are no other inherent |
|
364 classical shortcuts in the proof. But as said before opening we |
|
365 think opening this can-of-worms is beside the focus of the paper. |
|
366 |
|
367 |
|
368 > In particular, I am worried about the last sentence on page 24: how do |
|
369 > you make these sets computational? |
|
370 |
|
371 We wrote: |
|
372 |
|
373 However, since partial derivatives use sets of regular expressions, one |
|
374 needs to carefully analyse whether the resulting algorithm is still |
|
375 executable. Given the infrastructure for executable sets introduced by |
|
376 Haftmann (2009) in Isabelle/HOL, it should. |
|
377 |
|
378 Does this allay your fears? |
|
379 |
|
380 > - I think a word of caution should be added to the comparison with |
|
381 > related works in terms of formalization size: there is a huge gap in |
|
382 > the scope of these formalizations efforts, especially when one |
|
383 > defines algorithms. For instance, an usual corollary of the |
|
384 > Myhill-Nerode theorem is an algorithm that minimize automata: could |
|
385 > you provide, for instance, an algorithm that compute the number of |
|
386 > equivalences classes of your Myhill-Nerode relation? |
|
387 |
|
388 We have attempted to be very careful with our comments about the |
|
389 formalisation sizes from other works. Our intention is only to draw |
|
390 attention to the somewhat surprising point, namely that formalisation |
|
391 using automata (either out of choice or being forced) led to |
|
392 formalisations where the people in question had to fight their |
|
393 theorem prover. On the other hand, our formalisation could rely |
|
394 on the fact that regular expressions are inductive datatypes which |
|
395 are generally well-supported by theorem prover reasoning. We hope |
|
396 to have "balanced the picture" by saying that regular expressions |
|
397 are not good for everything (see paragraph in the Conclusion |
|
398 "While regular expressions are convenient, they have some |
|
399 limitations..."). |
|
400 |
|
401 We have not thought about an algorithm to compute the number of |
|
402 equivalence classes. |
|
403 |
|
404 > - There has been several work that uses matrices in Coq, in the |
|
405 > context of the ssreflect dialect. For instance, you may have a look |
|
406 > at Dénès M, Mörtberg A, Siles V. 2012. A refinement-based approach |
|
407 > to computational algebra in COQ. ITP 2012. It does not seem that |
|
408 > there is a particular problem with the use of matrices. |
|
409 |
|
410 We relied completely on the comment by (expert formaliser in Coq) Braibant |
|
411 from his thesis which explicitly states |
|
412 |
|
413 | Note that while they [Wu, Zhang, Urban] justify their work by the fact |
|
414 | that formalising automata ``can be a real hassle in HOL-based provers'', |
|
415 | and that neither Isabelle/HOL nor HOL4 nor HOL-light support matrices and |
|
416 | graphs with libraries, we did not encounter such difficulties in Coq |
|
417 | (omitting the **intrinsic difficulty** of working with **rectangular matrices** |
|
418 | at some points). |
|
419 |
|
420 The emphasis added by us. As far as we know Braibant also uses ssreflect. |
|
421 |
|
422 |
|
423 |
|
424 Reviewer #2 |
|
425 =========== |
|
426 |
|
427 > This is a very nice paper, at least in terms of its content, and it |
|
428 > should certainly be published in some shape or form. It's based on an |
|
429 > ITP conference paper from 2011, but seems to me to include sufficient |
|
430 > new content to justify republication in a journal. To be specific, |
|
431 > the new material is: the second of the two proofs showing the |
|
432 > implication that if a language is regular it has a finite number of |
|
433 > Myhill-Nerode equivalence classes (using partial derivatives), and all |
|
434 > of the closure properties. Both new contributions are very appealing. |
|
435 |
|
436 We agree with your observations. |
|
437 |
|
438 > Really, I have only a slew of minor criticisms about the paper, and |
|
439 > these almost all have to do with minor presentation issues. I started |
|
440 > to list some of these in the "Minor things" section below, but I |
|
441 > eventually decided that I should simply say that 1. the English is |
|
442 > sometimes less than perfect; 2. a spell-checker should be used; 3. the |
|
443 > typography is occasionally dubious; and that 4. it all needs to be |
|
444 > re-read with more care. |
|
445 |
|
446 We have improved the paper. Please let us know if we |
|
447 overlooked any more errors. |
|
448 |
|
449 > Finally, the conclusion and related work sections are reasonable. I |
|
450 > think the conclusion could usefully discuss the executability of the |
|
451 > algorithm in the first half of the M-N proof. |
|
452 |
|
453 Informally, the algorithm is clearly executable. But we have not |
|
454 attempted this with our formalisation. |
|
455 |
|
456 > Also, the conclusion discusses the Brzozowski's algebraic method in |
|
457 > this context, and argues that the equational constraints that appear |
|
458 > in this paper are not quite the same as a minimal automaton. It's |
|
459 > clear that, yes, they are a little different, but it still seems to |
|
460 > me that the initial state of the constraint set is isomorphic to the |
|
461 > automaton. Given that, I was very surprised to read (on p6) that |
|
462 > "Note that we do not define an automaton here". |
|
463 |
|
464 Yes, we agree, behind our equational system lurks a minimal automaton. What |
|
465 we wanted to clear up is that we have not defined a transition relation |
|
466 between states in the usual automata sense. Despite best intentions, we |
|
467 have deleted this remark, as it clearly misled the reviewers. |
|
468 |
|
469 > Appendix: this seems unnecessary. If it's interesting enough to be in |
|
470 > the main text, it should be there. Otherwise refer the reader to the |
|
471 > online formalisation. In this case, the appendix is so small that it |
|
472 > seems particularly silly to have one. |
|
473 |
|
474 We deleted the appendix. |