Journal/JAR-response
author Christian Urban <christian dot urban at kcl dot ac dot uk>
Sun, 22 Dec 2013 07:37:26 +0000
changeset 393 058f29ab515c
parent 384 60bcf13adb77
permissions -rw-r--r--
added

We thank the reviewers for their time and insightful comments.
We have incorporated most comments and if not, then answered them 
below. We also spell-checked the paper. Thanks again for all the 
comments.

Reviewer #1
===========

> Indeed, one of the premise of the paper is that representing automata
> in a theorem prover requires to fight against the theorem prover or
> requires painful amount of details. Therefore, the authors define the
> notion of regular languages as a language generated by a regular
> expression. (This is not an unusual choice from a formalization point
> of view, provided complemented by a proof of Kleene's theorem that
> states such regular languages are exactly the one that are recognized
> by finite automata.)

We concur with the reviewer that our definition of regular languages 
as the ones generated by regular expressions is usual, but the paper is 
unusual in that we do not need to provide a version of Kleene's theorem. 
All results are obtained without formally needing a definition for automata. 
If we would need Kleene's theorem as part of our arguments, then we would
need such a definition. 

To make our intention with the paper clearer, we moved the quote 
by Gasarch from the Conclusion to the Introduction. In this quote he asks 
whether the complementation of regular-expression languages can be proved 
without using automata. In this way we sharpen our motivation by being 
theorem prover related, but also of general interest. See quotations and 
changes right after Definition 2, p. 3. 

> Second, the premise of this article is that formalizing automata
> theory require some fight with the theorem prover, and that several
> other (longer) formalizations from the related work substantiate this
> claim. I am under the impression that this comparison is misleading
> since some of the related work deals not only with regular language
> \emph{theory}, but also regular language \emph{algorithms}: e.g.,
> automata constructions, conversion from NFA to DFA, minimization,
> decision procedures for language equivalence. These algorithms may be
> proved correct, complete, or terminating; and may be more or less
> efficient. Then, these properties dictate some choices of
> representation of the objects of the formalization (and hence have
> consequences on its intricacies). 

Yes, we agree with the reviewer that *if* one is in the business of
proving the correctness of, for example, Hopcroft's algorithm or the
NFA-to-DFA conversion, which are all based on "concrete" notions of 
automata, then yes one clearly has to use the usual notion of the 
automata somewhere, even might have to start from it. 

However, we disagree with the interpretation that our comparison is 
misleading.  Our claim that usual 'paper' definitions do not translate 
smoothly into formalised definitions is not a claim just by us and
independent of any numbers: We referenced in the paper a concrete 
problem pointed out already in Slind's email cited on the bottom of 
p. 2. In the context of HOL4, he made the observation 

 | A student who is attempting to prove the pumping lemma for regular
 | languages came across the following problem. Suppose a DFA is
 | represented by a record  <| init ; trans ; accept |> where
 |
 |         init : 'state
 |         trans : 'state -> 'symbol -> 'state
 |         accept : 'state -> bool
 |
 | Execution of a DFA on a string:
 |
 |    (Execute machine [] s = s)
 |    (Execute machine (h::t) s = Execute machine t (machine.trans s h))
 |
 | Language recognized by a machine  D:
 |
 |   Lang(D) = {x | D.accept (Execute D x D.init)}
 |
 | These are all accepted by HOL with no problem. In then trying to define 
 | the set of regular languages, namely those accepted by some DFA, the 
 | following is natural to try
 |
 |    Regular (L:'symbol list -> bool) =  ?D. L = Lang(D)
 |
 | But this fails, since D has two type variables ('state and 'symbol) but
 | Regular is a predicate on 'symbol lists. So the principle of definition
 | rejects.

We also pointed to comments by Tobias Nipkow which clearly show that
he had to fight the theorem prover. Many formalisations we cited for 
comparison contain statements about having to fight their respective 
theorem prover. 
 
The conclusion we want the reader to draw from our comparisons is 
that if one is just interested in formalising results from regular language 
theory, not results from automata theory, then regular expressions are 
easier to work with formally. If you look at the numbers (especially the 
ones for Isabelle/HOL), then we hope you agree with us that the startup 
costs are much higher with automata in comparison with regular
expressions. We carefully commented on the examples we have cited in
the Conclusion. Nowhere did we claim that they are one-to-one comparisons with 
our development. We did, for example, say that Braibant agreed with us that 
*if* he had done just Myhill-Nerode using his library, he estimates our 
development is more concise (see his comment in the Conclusion, p. 25).  

> In this article, the authors choose
> to do the converse: avoid the notion of automata altogether, and see
> how far it goes. I find this choice a bit peculiar.

At first sight, our choice/approach might seem peculiar, since
we are not primarily verifying an algorithm. However, we are not
the only ones who pondered about this approach (we mentioned Gasarch). 
We showed that this is indeed a choice that can lead you quite far 
with regular language theory and lead to a relatively smooth
formalisation. We did not make the claim that regular expressions are 
a substitute for automata in all situations. See our paragraph
in the Conclusion starting with "While regular expressions are 
convenient, they have some limitations."

> Third, I wonder to what extent some of the proofs presented in this
> article are a rephrasing of standard proofs (that go through
> automata), inlining the notion of automata, e.g., the construction
> depicted in Section 3. Also, the notions of derivatives and partial
> derivatives underly some automata constructions: for instance, from my
> point of view, Section 5 is actually building a DFA (with transition
> relation $\delta$ and initial state ${r}$)from a regular expression
> without saying so; and it suffices to take as a tag function the
> expression $\lambda x.\delta({r},x)$ (which is a right invariant
> equivalence relation between strings, and of finite index since there
> finitely many states in the automaton). The authors take care to note
> the proximity of their arguments with existing ones when due (that
> often rely on automata), but they seem very close to the existing one
> in any case, with enough inlining. (Yet, I found the proof based on
> tagging functions quite elegant: it avoids the explicit construction
> of an automata--because automata are difficult to formalize in
> HOL--while being close to the intuitive automata constructions, even
> if they are not mentionned.)

We agree "that underlying our arguments are automata proofs", but 
we argued without constructing automata explicitly. This is the very point 
and contribution of the paper. If we construct automata explicitly, the 
formalised proof of Myhill-Nerode theorem will become much larger as  
can be seen in existing work. By abandoning automata and relying only on 
regular expressions, we get a much smoother formalised proof of MN theorem.

> To sum up, I think there is a significant value to come up with fresh
> arguments that makes it easier to formalize existing pen-and-paper
> proofs and textbooks. Yet, the arguments presented in this paper are
> not really new (they look like inlinings, even if the authors chose
> not to see it). And I am not convinced by the idea of side stepping an
> elegant tool, the notion of automata, because it is hard to
> formalize. I would rather improve on the underlying tools, or develop
> new ways to formalize this notion.

We disagree in that we should not take the opportunity of using 
regular expressions when they lead to simpler formalisations
in regular language theory. We have already found as one successful 
"application" that many structural inductions over regular expressions 
are good learning tools for students in order to come to grips with formal 
reasoning. 


> - Lemma 21 already appears in Brzozowski (1964) who attributes it to
>  Arden. I think there is little value in distinguishing the two
>  versions of Arden's lemma, since their proofs and statement are parallel.

We agree that Brzozowski (1964) gives the statement of Arden's lemma, but 
no proof, for which he refers to Arden (1960). Unfortunately, we do not have 
access to this book and guess that this applies to many contemporary readers.
Our reference to Sakarovitch is motivated by the fact that he gives 
explicitly a proof and his book is fairly recent. 

> - I am puzzled by the one-character transition relation between
>   equivalence classes (p. 6). The $X_i$ correspond to the states of
>   the minimal automaton that recognizes $L$ (and you even agree with
>   that in the Related Work section!). Therefore, I disagree with your
>   sentence stating that this construction does not build an
>   automaton. 

We deleted this comment since it also confused the second reviewer. 
What we wanted to express is that we have not defined a transition 
relation between states in the usual automata sense.

>  Then, it would follow that the initial system would be better
>  presented as a DFA (that is, associating to each equivalence class,
>  a mapping from atoms to equivalence classes with the atoms being
>  disjoint), even if you do not want to introduce the notion of
>  automaton. I fail to see the advantages of the current presentation
>  of the initial system. Then, I think that it would be easier to
>  understand your presentation if you used a more involved language
>  (like (ab + a)) as a running example. This would produce the
>  equivalence classes
>
>  X1 = {[]}
>  X2 = {[a]}
>  X3 = {[ab]}
>  X4 = {univ - {[], [a], [ab]}}
>
>  and the equational system
>
>  X1 = LAMBDA(ONE)
>  X2 = (X1,a)
>  X3 = (X2,b)
>  X4 = (X1,b) + (X2,a) + (X3,a) + (X3,b) 
> 

We have modified the example as you suggest. But note that assuming
we only have characters a and b in an alphabet, the equation for
X4 must be

    X4 = (X1,b) + (X2,a) + (X3,a) + (X3,b) + (X4,a) + (X4,b)


>  (It could be worth saying that it is a left-linear context free
>  grammar in any case). Having built this example, I wonder why you
>  chose to set up the equationnal system the way you did. In the RW
>  section, you argue that the differences with Brzozowski's are subtle
>  and impact the presentation of , but I fail to see what would go
>  wrong if you chose this presentation too. 

The difference is from where we start: In Brzozowski's algebraic 
method, the start is an automaton from which this method calculates
a regular expression. Therefore Brzozowski annotates the final
states with lambdas and `propagates' them to the single initial
state. From there he can read off a (single) regular expression recognising
the same language as the automaton. We, on the other hand, start with a 
set of equivalence classes, for all of which we want to calculate a 
regular expression. In this way we have to annotate the lambda to the 
initial state (containing only []) and `propagate' it to all other 
equivalence classes. Once there, we can read off the regular expressions
for each equivalence class. There seems to be no convenient way to set 
up the equational system the original Brzozowski-way. However, it does 
mean that we have to use the reversed version of Arden's lemma in order to 
solve our equational systems. 

> - p. 11 "There exists a regular expression" this is not clear, unless
>   one has a look at your formal proof: it should be made clear that
>  the definition "regular A" is defined as exists r, A = lang r. 

We clarified this sentence according to your suggestion. 

> - Top of p. 12: it is stated that this algorithm provides a method to
>  compute a regexp for the complement of a language and that this
>  construction is similar to the construction of the complement
>  automaton. I agree with this statement, because I am under the
>  impression (maybe wrongly) that the construction in this section
>  boils down to "automata with states indexed by the equivalences
>  classes" to "regular expression". What I am uncertain of is, under
>  what conditions I could use your formalization to compute this
>  regular expression. (Note that I am not speaking about complexity
>  issues here.)

We are not 100% sure whether we are correctly interpreting your
question: If you are after an informal "black board" calculation method 
of this regular expression in front of students, then you can use our
algorithm. We have done this already. If you are asking whether 
our formalisation is "constructive" enough in order to directly
extract an algorithm, then no, we cannot do this. We use a theorem
prover based on classical logic. We did not make, for example,
explicit how a single regular expression can be constructed from
a set of regular expressions, still recognising the same language.
However, we do not like to open the "can-of-worms" about constructive
and unconstructive arguments and how to extract explicit algorithms.
In theory we can, in practice we cannot. Unfortunately, we do not have
a better answer for this at the moment and think this is besides
the main point of the paper. 

> - p. 13 In the definition of the tagging function for the plus case,
>  isn't there a typo (replacing the \cdot by a comma) ? 

No. Our formalisation uses this definition generating a pair. 

> - p. 14 There is an "x" missing on the left-hand side of the definition of
>   x tag A B. 

Yes. Corrected.

> - The structure of section 4 is a bit difficult to follow, because the
>   proof of Theorem 31 is interleaved with the statement of many
>   intermediate lemmas. Could you separate the definition of the
>   tagging functions from the proof? 

We have added a figure at the beginning of the proof and included
appropriate references.

>   I think it would also be
>   interesting to point out that the type of the codomain the tag
>   functions are not the same through the section.  Again, I am under
>   the impression that the tagging functions defined in this section
>   define states of an automaton that is not built explicitly, with the
>   result of the tag function being a particular state in a suitably
>   built automaton.  

We added a relatively lengthy remark after the proof about 
how the tagging functions can be seen as constructing particular
automata - see p. 17.


>   Yet, it could be worth pointing out that this use
>   of polymorphism for the tag functions does not have the same
>   problems as the one encountered when one try to formalize
>   automata. This is in a sense contradiction with your statement
>   p. 24 that you sidestep the composition of automata (this automaton
>   is not explicitly built). 

We have modified the statement on p. 24.

> - p. 18: you mention the fact that it is painful to formalize
>   Brzozowski's argument stating that there are only finitely many
>   dissimilar derivatives. Is this because it requires to build part of
>   the equationnal theory of regular expressions that you avoided so
>   far? Could you comment on what Coquand and Siles did in the paper
>   you cite?  

Coquand and Siles first spend 3 pages of their 16-pages paper
developing a theory of inductively finite sets in type theory.
This theory uses "discrete sets" where the equality needs to 
be decidable. They then view the ACI-properties as rewrite rules 
leading to a decidable equality over regular expressions. This 
shows (although they left out the details in their paper) that the 
set of derivatives is inductively finite. 

In our opinion, Coquand and Siles' development strongly justifies our 
comment about formalisation of Brzozowski's argument being painful.
If you like to have this reference added in the paper, we are happy
to do so. 

> - Section 6, p20, previous to last paragraph: you say that the
>   construction of the complement regular expression involves several
>   transformations: you could shave off the construction of the
>   non-deterministic automaton, since it can be avoided using
>   Antimirov's construction of a DFA (as you do).

We feel that our sequence adequately represents the procedure
that is used by informed outsiders and that is usually 
taught to students. Antimirov's construction using partial
derivatives is unfortunately not widely known, which we hope
we can help to change with this paper.

> - (27) are properties, not definitions, isn't it? 

Yes, corrected. 

> - A question. On the one hand, Isabelle/HOL features a classical
>   logic, with the law of the excluded middle and the Hilbert's epsilon
>   operator. On the other hand, in Coq, it is often involved to prove
>   decidability results, which may make it more difficult to formalize
>   some of the arguments you are using. Is it possible in Isabelle/HOL
>   to track what are the uses of classical logic that you use? 

In theory there is a way to track all classical uses in Isabelle/HOL,
but this can be practically unfeasible, since proof terms can be 
gigantic (therefore the Isabelle/HOL follows the LCF approach 
where only theorems need to be recorded for correctness). Having 
said that, there is one obvious place in our proof where we used 
a "classical" shortcut by using + for generating a single regular 
expression from a finite set of expressions. This would need better 
"explanation" in a constructive proof. We also leave out an explicit 
choice for selecting an equation in the euqational system. But that
can be easily patched if wanted. To sum up, there are no inherent 
classical shortcuts in the proof. But as said before, we think opening  
this can-of-worms is beside the focus of the paper.


>  In particular, I am worried about the last sentence on page 24: how do
>  you make these sets computational? 

We wrote:

 However, since partial derivatives use sets of regular expressions, one 
 needs to carefully analyse whether the resulting algorithm is still 
 executable. Given the infrastructure for executable sets introduced by 
 Haftmann (2009) in Isabelle/HOL, it should.

Does this allay your fears?

> - I think a word of caution should be added to the comparison with
>   related works in terms of formalization size: there is a huge gap in
>   the scope of these formalizations efforts, especially when one
>   defines algorithms. For instance, an usual corollary of the
>   Myhill-Nerode theorem is an algorithm that minimize automata: could
>   you provide, for instance, an algorithm that compute the number of
>   equivalences classes of your Myhill-Nerode relation?  

We have attempted to be very careful with our comments about the
formalisation sizes from other works. Our point is to draw
attention to the somewhat surprising fact that formalisation
using automata (either out of choice or being forced) led to 
formalisations where the people in question had to fight their
theorem prover. On the other hand, our formalisation could rely
on the fact that regular expressions are inductive datatypes which
are generally well-supported by theorem prover reasoning. We hope
to have "balanced the picture" by saying that regular expressions
are not good for everything (see paragraph in the Conclusion
"While regular expressions are convenient, they have some 
limitations...").

We have not thought about an algorithm to compute the number of
equivalence classes. 

> - There has been several work that uses matrices in Coq, in the
>   context of the ssreflect dialect. For instance, you may have a look
>   at Dénès M, Mörtberg A, Siles V.  2012.  A refinement-based approach
>   to computational algebra in COQ. ITP 2012. It does not seem that
>   there is a particular problem with the use of matrices. 

We relied completely on the comment by (we assume expert formaliser in Coq) 
Braibant from his thesis which explicitly states: 

  | Note that while they [Wu, Zhang, Urban] justify their work by the fact  
  | that formalising automata ``can be a real hassle in HOL-based provers'', 
  | and that neither Isabelle/HOL nor HOL4 nor HOL-light support matrices and 
  | graphs with libraries, we did not encounter such difficulties in Coq 
  | (omitting the **intrinsic difficulty** of working with **rectangular matrices** 
  | at some points).

The emphasis added by us. As far as we know Braibant also uses ssreflect.



Reviewer #2
=========== 

> This is a very nice paper, at least in terms of its content, and it
> should certainly be published in some shape or form.  It's based on an
> ITP conference paper from 2011, but seems to me to include sufficient
> new content to justify republication in a journal.  To be specific,
> the new material is: the second of the two proofs showing the
> implication that if a language is regular it has a finite number of
> Myhill-Nerode equivalence classes (using partial derivatives), and all
> of the closure properties.  Both new contributions are very appealing.

We agree with your observations.

> Really, I have only a slew of minor criticisms about the paper, and
> these almost all have to do with minor presentation issues.  I started
> to list some of these in the "Minor things" section below, but I
> eventually decided that I should simply say that 1. the English is
> sometimes less than perfect; 2. a spell-checker should be used; 3. the
> typography is occasionally dubious; and that 4. it all needs to be
> re-read with more care.

We have improved the paper. Please let us know if we
overlooked any other errors.

> Finally, the conclusion and related work sections are reasonable.  I
> think the conclusion could usefully discuss the executability of the
> algorithm in the first half of the M-N proof.  

Informally, the algorithm is clearly executable. But we have not 
attempted this with our formalisation or modified the formalisation
so that it becomes executable by Isabelle/HOL.

> Also, the conclusion discusses the Brzozowski's algebraic method in 
> this context, and argues that the equational constraints that appear 
> in this paper are not quite the same as a minimal automaton. It's 
> clear that, yes, they are a little different, but it still seems to 
> me that the initial state of the constraint set is isomorphic to the 
> automaton.  Given that, I was very surprised to read (on p6) that 
> "Note that we do not define an automaton here".  

Yes, we agree, behind our equational system lurks a minimal automaton. What 
we wanted to clear up is that we have not defined a transition relation 
between states in the usual automata sense. Despite best intentions, we 
have deleted this remark, as it clearly misled the reviewers.

> Appendix: this seems unnecessary. If it's interesting enough to be in
> the main text, it should be there. Otherwise refer the reader to the
> online formalisation. In this case, the appendix is so small that it
> seems particularly silly to have one.

We deleted the appendix.