# HG changeset patch
# User Christian Urban <christian dot urban at kcl dot ac dot uk>
# Date 1373474850 -3600
# Node ID 99161cd17c0fcb51c3f945720fd3f08dc9153f0f
# Parent  c028b07a4fa81df74cdd9389037c7c3b880c2f81
added modified version adn answer to the reviewers

diff -r c028b07a4fa8 -r 99161cd17c0f Journal/JAR-response
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/Journal/JAR-response	Wed Jul 10 17:47:30 2013 +0100
@@ -0,0 +1,474 @@
+We thank the reviewers for their time and insightful comments.
+We have incorporated most comments and if not, then answered them 
+below. We also spell-checked the paper. Thanks again for all the 
+comments.
+
+Reviewer #1
+===========
+
+> Indeed, one of the premise of the paper is that representing automata
+> in a theorem prover requires to fight against the theorem prover or
+> requires painful amount of details. Therefore, the authors define the
+> notion of regular languages as a language generated by a regular
+> expression. (This is not an unusual choice from a formalization point
+> of view, provided complemented by a proof of Kleene's theorem that
+> states such regular languages are exactly the one that are recognized
+> by finite automata.)
+
+We concur with the reviewer that our definition of regular languages 
+as the ones generated by regular expressions is usual, but the paper is 
+unusual in that we do not need to provide a version of Kleene's theorem. 
+All results are obtained without formally needing a definition for automata. 
+If we would need Kleene's theorem as part of our arguments, then we would
+need such a definition. 
+
+To make our intention with the paper more clear, we moved the quote 
+by Gasarch from the Conclusion to the Introduction. In this quote he asks 
+whether the complementation of regular-expression languages can be proved 
+without using automata. In this way we sharpen our motivation by being 
+theorem prover related, but also of general interest. See quotations and 
+changes right after Definition 2, p. 3. 
+
+> Second, the premise of this article is that formalizing automata
+> theory require some fight with the theorem prover, and that several
+> other (longer) formalizations from the related work substantiate this
+> claim. I am under the impression that this comparison is misleading
+> since some of the related work deals not only with regular language
+> \emph{theory}, but also regular language \emph{algorithms}: e.g.,
+> automata constructions, conversion from NFA to DFA, minimization,
+> decision procedures for language equivalence. These algorithms may be
+> proved correct, complete, or terminating; and may be more or less
+> efficient. Then, these properties dictate some choices of
+> representation of the objects of the formalization (and hence have
+> consequences on its intricacies). 
+
+Yes, we agree with the reviewer that *if* one is in the business of
+proving the correctness of, for example, Hopcroft's algorithm or the
+NFA-to-DFA conversion, which are based on "concrete" automata, then yes 
+one clearly has to use the usual notion of the automata somewhere, even 
+might have to start from it. 
+
+However, we disagree with the interpretation that our comparison is 
+misleading.  Our claim that usual `paper' definitions do not translate 
+smoothly into formalised definitions is not a claim just by us and
+independent of any numbers: We referenced in the paper to a concrete 
+problem pointed out already in Slind's email cited on the bottom of 
+p. 2. In the context of HOL4, he made the observation 
+
+ | A student who is attempting to prove the pumping lemma for regular
+ | languages came across the following problem. Suppose a DFA is
+ | represented by a record  <| init ; trans ; accept |> where
+ |
+ |         init : 'state
+ |         trans : 'state -> 'symbol -> 'state
+ |         accept : 'state -> bool
+ |
+ | Execution of a DFA on a string:
+ |
+ |    (Execute machine [] s = s)
+ |    (Execute machine (h::t) s = Execute machine t (machine.trans s h))
+ |
+ | Language recognized by a machine  D:
+ |
+ |   Lang(D) = {x | D.accept (Execute D x D.init)}
+ |
+ | These are all accepted by HOL with no problem. In then trying to define 
+ | the set of regular languages, namely those accepted by some DFA, the 
+ | following is natural to try
+ |
+ |    Regular (L:'symbol list -> bool) =  ?D. L = Lang(D)
+ |
+ | But this fails, since D has two type variables ('state and 'symbol) but
+ | Regular is a predicate on 'symbol lists. So the principle of definition
+ | rejects.
+
+We also pointed to comments by Tobias Nipkow which clearly show that
+he had to fight the theorem prover. Many formalisations we cited for 
+comparison contain statements about having to fight their respective 
+theorem prover. 
+ 
+The conclusion we want the reader to draw from our comparisons is 
+that if one is just interested in formalising results from regular language 
+theory, not results from automata theory, then regular expressions are 
+easier to work with formally. If you look at the numbers (especially the 
+ones for Isabelle/HOL), then we hope you agree with us that the startup 
+costs are much higher with automata in comparison with regular
+expressions. We carefully commented on the examples we have cited in
+the Conclusion. Nowhere did we claim that they are one-to-one comparisons with 
+our development. We did, for example, say that Braibant agreed with us that 
+*if* he had done just Myhill-Nerode using his library, he estimates our 
+development is more concise (see his comment in the Conclusion, p. 25).  
+
+> In this article, the authors choose
+> to do the converse: avoid the notion of automata altogether, and see
+> how far it goes. I find this choice a bit peculiar.
+
+At first sight, our choice/approach might seem peculiar, since
+we are not primarily verifying an algorithm. However, we are not
+the only ones who pondered about this approach (we mentioned Gasarch). 
+We showed that this is indeed a choice that can lead you quite far 
+with regular language theory and lead to a relatively smooth
+formalisation. We did not make the claim that regular expressions are 
+a substitute for automata in all situations. See our paragraph
+in the Conclusion starting with "While regular expressions are 
+convenient, they have some limitations."
+
+> Third, I wonder to what extent some of the proofs presented in this
+> article are a rephrasing of standard proofs (that go through
+> automata), inlining the notion of automata, e.g., the construction
+> depicted in Section 3. Also, the notions of derivatives and partial
+> derivatives underly some automata constructions: for instance, from my
+> point of view, Section 5 is actually building a DFA (with transition
+> relation $\delta$ and initial state ${r}$)from a regular expression
+> without saying so; and it suffices to take as a tag function the
+> expression $\lambda x.\delta({r},x)$ (which is a right invariant
+> equivalence relation between strings, and of finite index since there
+> finitely many states in the automaton). The authors take care to note
+> the proximity of their arguments with existing ones when due (that
+> often rely on automata), but they seem very close to the existing one
+> in any case, with enough inlining. (Yet, I found the proof based on
+> tagging functions quite elegant: it avoids the explicit construction
+> of an automata--because automata are difficult to formalize in
+> HOL--while being close to the intuitive automata constructions, even
+> if they are not mentionned.)
+
+We agree "that underlying our arguments are automata proofs", but 
+we argued without constructing automata explicitly. This is the very point 
+and contribution of the paper. If we construct automata explicitly, the 
+formalised proof of Myhill-Nerode theorem will become much larger as  
+can be seen in existing work. By abandoning automata and relying only on 
+regular  expressions, we get a much simpler formalised proof of MN theorem.
+
+> To sum up, I think there is a significant value to come up with fresh
+> arguments that makes it easier to formalize existing pen-and-paper
+> proofs and textbooks. Yet, the arguments presented in this paper are
+> not really new (they look like inlinings, even if the authors chose
+> not to see it). And I am not convinced by the idea of side stepping an
+> elegant tool, the notion of automata, because it is hard to
+> formalize. I would rather improve on the underlying tools, or develop
+> new ways to formalize this notion.
+
+We disagree in that we should not take the opportunity of using 
+regular expressions when they lead to simpler formalisations
+in regular language theory. We have already found that many structural 
+inductions that only regular expressions afford are good learning 
+tools for students in order to come to grips with formal reasoning. 
+
+
+> - Lemma 21 already appears in Brzozowski (1964) who attributes it to
+>  Arden. I think there is little value in distinguishing the two
+>  versions of Arden's lemma, since their proofs and statement are parallel.
+
+We agree that Brzozowski (1964) gives the statement of Arden's lemma, but 
+no proof, for which he refers to Arden (1960). Unfortunately, we do not have 
+access to this book and guess that this applies to many contemporary readers.
+Our reference to Sakarovitch is motivated by the fact that he gives 
+explicitly a proof (XXX CHECK). 
+
+> - I am puzzled by the one-character transition relation between
+>   equivalence classes (p. 6). The $X_i$ correspond to the states of
+>   the minimal automaton that recognizes $L$ (and you even agree with
+>   that in the Related Work section!). Therefore, I disagree with your
+>   sentence stating that this construction does not build an
+>   automaton. 
+
+We deleted this comment since it also confused the second reviewer. 
+What we wanted to express is that we have not defined a transition 
+relation between states in the usual automata sense.
+
+>  Then, it would follow that the initial system would be better
+>  presented as a DFA (that is, associating to each equivalence class,
+>  a mapping from atoms to equivalence classes with the atoms being
+>  disjoint), even if you do not want to introduce the notion of
+>  automaton. I fail to see the advantages of the current presentation
+>  of the initial system. Then, I think that it would be easier to
+>  understand your presentation if you used a more involved language
+>  (like (ab + a)) as a running example. This would produce the
+>  equivalence classes
+>
+>  X1 = {[]}
+>  X2 = {[a]}
+>  X3 = {[ab]}
+>  X4 = {univ - {[], [a], [ab]}}
+>
+>  and the equational system
+>
+>  X1 = LAMBDA(ONE)
+>  X2 = (X1,a)
+>  X3 = (X2,b)
+>  X4 = (X1,b) + (X2,a) + (X3,a) + (X3,b) 
+> 
+
+We have modified the example as you suggest. But note that assuming
+we only have characters a and b in an alphabet, the equation for
+X4 must be
+
+    X4 = (X1,b) + (X2,a) + (X3,a) + (X3,b) + (X4,a) + (X4,b)
+
+
+>  (It could be worth saying that it is a left-linear context free
+>  grammar in any case). Having built this example, I wonder why you
+>  chose to set up the equationnal system the way you did. In the RW
+>  section, you argue that the differences with Brzozowski's are subtle
+>  and impact the presentation of , but I fail to see what would go
+>  wrong if you chose this presentation too. 
+
+The difference is from where we start: In Brzozowski's algebraic 
+method, the start is an automaton from which this method calculates
+a regular expression. Therefore Brzozowski annotates the final
+states with lambdas and `propagates' them to the single initial
+state. From there he can read off a (single) regular expression recognising
+the same language as the automaton. We, on the other hand, start with a 
+set of equivalence classes, for all of which we want to calculate a 
+regular expression. In this way we have to annotate the lambda to the 
+initial state (containing only []) and `propagate' it to all other 
+equivalence classes. Once there, we can read off the regular expressions
+for each equivalence class. There seems to be no convenient way to set 
+up the equational system the original Brzozowski-way. However, it does 
+mean that we have to use the reversed version of Arden's lemma in order to 
+solve our equational systems. 
+
+> - p. 11 "There exists a regular expression" this is not clear, unless
+>   one has a look at your formal proof: it should be made clear that
+>  the definition "regular A" is defined as exists r, A = lang r. 
+
+We clarified this sentence according to your suggestion. 
+
+> - Top of p. 12: it is stated that this algorithm provides a method to
+>  compute a regexp for the complement of a language and that this
+>  construction is similar to the construction of the complement
+>  automaton. I agree with this statement, because I am under the
+>  impression (maybe wrongly) that the construction in this section
+>  boils down to "automata with states indexed by the equivalences
+>  classes" to "regular expression". What I am uncertain of is, under
+>  what conditions I could use your formalization to compute this
+>  regular expression. (Note that I am not speaking about complexity
+>  issues here.)
+
+We are not 100% sure whether we are correctly interpreting your
+question: If you are after an informal "black board" calculation method 
+of this regular expression in front of students, then you can use our
+algorithm. We have done this already. If you are asking whether 
+our formalisation is "constructive" enough in order to directly
+extract an algorithm, then no we cannot do this. We use a theorem
+prover based on classical logic. We, for example, did not make
+explicit how a single regular expression can be constructed from
+a set of regular expressions, still recognising the same language.
+However, we do not like to open the "can-of-worms" about constructive
+and unconstructive arguments and how to extract explicit algorithms.
+In theory we can, in practice we cannot. Unfortunately, we do not have
+a better answer for this at the moment and think this is besides
+the main point of the paper. 
+
+> - It seems that the purpose of the tagging functions here is to build
+>  right invariant equivalence relations that refines the Myhill-Nerode
+>  relation (Lemma 43). Maybe you should start by that. Also, any such
+>  right invariant relation naturally gives rise to an automaton, and
+>  you build such later on. Maybe you could phrase it out at the very
+>  beginning of this section. 
+
+We added a long remark at the end of the section that spells
+out the automata inspiration behind the tagging functions (see p. 17).
+
+> - p. 13 In the definition of the tagging function for the plus case,
+>  isn't there a typo (replacing the \cdot by a comma) ? 
+
+No. Our formalisation uses this definition. 
+
+> - p. 14 There is an "x" missing on the left-hand side of the definition of
+>   x tag A B. 
+
+Yes. Corrected.
+
+> - The structure of section 4 is a bit difficult to follow, because the
+>   proof of Theorem 31 is interleaved with the statement of many
+>   intermediate lemmas. Could you separate the definition of the
+>   tagging functions from the proof? 
+
+We have added a figure at the beginning of the proof and included
+appropriate references.
+
+>   I think it would also be
+>   interesting to point out that the type of the codomain the tag
+>   functions are not the same through the section.  Again, I am under
+>   the impression that the tagging functions defined in this section
+>   define states of an automaton that is not built explicitly, with the
+>   result of the tag function being a particular state in a suitably
+>   built automaton.  
+
+We added a relatively lengthy remark after the proof about 
+how the tagging functions can be seen as constructing particular
+automata - see p. 17.
+
+
+>   Yet, it could be worth pointing out that this use
+>   of polymorphism for the tag functions does not have the same
+>   problems as the one encountered when one try to formalize
+>   automata. This is in a sense contradiction with your statement
+>   p. 24 that you sidestep the composition of automata (this automaton
+>   is not explicitly built). 
+
+We have modified the statement on p. 24.
+
+> - p. 18: you mention the fact that it is painful to formalize
+>   Brzozowski's argument stating that there are only finitely many
+>   dissimilar derivatives. Is this because it requires to build part of
+>   the equationnal theory of regular expressions that you avoided so
+>   far? Could you comment on what Coquand and Siles did in the paper
+>   you cite?  
+
+Coquand and Siles first spend 3 pages of their 16-pages paper
+developing a theory of inductively finite sets in type theory.
+This theory uses "discrete sets" where the equality needs to 
+be decidable. They then view the ACI-properties as rewrite rules 
+leading to a decidable equality over regular expressions. This 
+shows (although they left out the details in their paper) that the 
+set of derivatives is inductively finite. 
+
+In our opinion, Coquand and Siles' development strongly justifies our 
+comment about formalisation of Brzozowski's argument being painful.
+If you like to add this as a reference in the paper, we are happy
+to do so. 
+
+> - Section 6, p20, previous to last paragraph: you say that the
+>   construction of the complement regular expression involves several
+>   transformations: you could shave off the construction of the
+>   non-deterministic automaton, since it can be avoided using
+>   Antimirov's construction of a DFA (as you do).
+
+We feel that our sequence adequately represents procedure
+that is used by informed outsiders and that is usually 
+taught to students. Antimirov's construction using partial
+derivatives is unfortunately not widely known, which we hope
+we can help to change with this paper.
+
+> - (27) are properties, not definitions, isn't it? 
+
+Corrected. 
+
+> - A question. On the one hand, Isabelle/HOL features a classical
+>   logic, with the law of the excluded middle and the Hilbert's epsilon
+>   operator. On the other hand, in Coq, it is often involved to prove
+>   decidability results, which may make it more difficult to formalize
+>   some of the arguments you are using. Is it possible in Isabelle/HOL
+>   to track what are the uses of classical logic that you use? 
+
+In theory there is a way to track all classical uses in Isabelle,
+but this can be practically unfeasible, since proof terms can be 
+gigantic (therefore the Isabelle/HOL follows the LCF approach 
+where only theorems need to be recorded for correctness). Having 
+said that there is one obvious place in our proof where we used 
+a "classical" shortcut by using + for generating a single regular 
+expression from a finite set of expressions. This would need better 
+"explanation" in a constructive proof. There are no other inherent
+classical shortcuts in the proof. But as said before opening we
+think opening this can-of-worms is beside the focus of the paper.
+
+
+>  In particular, I am worried about the last sentence on page 24: how do
+>  you make these sets computational? 
+
+We wrote:
+
+ However, since partial derivatives use sets of regular expressions, one 
+ needs to carefully analyse whether the resulting algorithm is still 
+ executable. Given the infrastructure for executable sets introduced by 
+ Haftmann (2009) in Isabelle/HOL, it should.
+
+Does this allay your fears?
+
+> - I think a word of caution should be added to the comparison with
+>   related works in terms of formalization size: there is a huge gap in
+>   the scope of these formalizations efforts, especially when one
+>   defines algorithms. For instance, an usual corollary of the
+>   Myhill-Nerode theorem is an algorithm that minimize automata: could
+>   you provide, for instance, an algorithm that compute the number of
+>   equivalences classes of your Myhill-Nerode relation?  
+
+We have attempted to be very careful with our comments about the
+formalisation sizes from other works. Our intention is only to draw
+attention to the somewhat surprising point, namely that formalisation
+using automata (either out of choice or being forced) led to 
+formalisations where the people in question had to fight their
+theorem prover. On the other hand, our formalisation could rely
+on the fact that regular expressions are inductive datatypes which
+are generally well-supported by theorem prover reasoning. We hope
+to have "balanced the picture" by saying that regular expressions
+are not good for everything (see paragraph in the Conclusion
+"While regular expressions are convenient, they have some 
+limitations...").
+
+We have not thought about an algorithm to compute the number of
+equivalence classes. 
+
+> - There has been several work that uses matrices in Coq, in the
+>   context of the ssreflect dialect. For instance, you may have a look
+>   at Dénès M, Mörtberg A, Siles V.  2012.  A refinement-based approach
+>   to computational algebra in COQ. ITP 2012. It does not seem that
+>   there is a particular problem with the use of matrices. 
+
+We relied completely on the comment by (expert formaliser in Coq) Braibant 
+from his thesis which explicitly states 
+
+  | Note that while they [Wu, Zhang, Urban] justify their work by the fact  
+  | that formalising automata ``can be a real hassle in HOL-based provers'', 
+  | and that neither Isabelle/HOL nor HOL4 nor HOL-light support matrices and 
+  | graphs with libraries, we did not encounter such difficulties in Coq 
+  | (omitting the **intrinsic difficulty** of working with **rectangular matrices** 
+  | at some points).
+
+The emphasis added by us. As far as we know Braibant also uses ssreflect.
+
+
+
+Reviewer #2
+=========== 
+
+> This is a very nice paper, at least in terms of its content, and it
+> should certainly be published in some shape or form.  It's based on an
+> ITP conference paper from 2011, but seems to me to include sufficient
+> new content to justify republication in a journal.  To be specific,
+> the new material is: the second of the two proofs showing the
+> implication that if a language is regular it has a finite number of
+> Myhill-Nerode equivalence classes (using partial derivatives), and all
+> of the closure properties.  Both new contributions are very appealing.
+
+We agree with your observations.
+
+> Really, I have only a slew of minor criticisms about the paper, and
+> these almost all have to do with minor presentation issues.  I started
+> to list some of these in the "Minor things" section below, but I
+> eventually decided that I should simply say that 1. the English is
+> sometimes less than perfect; 2. a spell-checker should be used; 3. the
+> typography is occasionally dubious; and that 4. it all needs to be
+> re-read with more care.
+
+We have improved the paper. Please let us know if we
+overlooked any more errors.
+
+> Finally, the conclusion and related work sections are reasonable.  I
+> think the conclusion could usefully discuss the executability of the
+> algorithm in the first half of the M-N proof.  
+
+Informally, the algorithm is clearly executable. But we have not 
+attempted this with our formalisation. 
+
+> Also, the conclusion discusses the Brzozowski's algebraic method in 
+> this context, and argues that the equational constraints that appear 
+> in this paper are not quite the same as a minimal automaton. It's 
+> clear that, yes, they are a little different, but it still seems to 
+> me that the initial state of the constraint set is isomorphic to the 
+> automaton.  Given that, I was very surprised to read (on p6) that 
+> "Note that we do not define an automaton here".  
+
+Yes, we agree, behind our equational system lurks a minimal automaton. What 
+we wanted to clear up is that we have not defined a transition relation 
+between states in the usual automata sense. Despite best intentions, we 
+have deleted this remark, as it clearly misled the reviewers.
+
+> Appendix: this seems unnecessary. If it's interesting enough to be in
+> the main text, it should be there. Otherwise refer the reader to the
+> online formalisation. In this case, the appendix is so small that it
+> seems particularly silly to have one.
+
+We deleted the appendix.
\ No newline at end of file
diff -r c028b07a4fa8 -r 99161cd17c0f Journal/Paper.thy
--- a/Journal/Paper.thy	Wed Jul 10 13:28:24 2013 +0100
+++ b/Journal/Paper.thy	Wed Jul 10 17:47:30 2013 +0100
@@ -563,7 +563,7 @@
 
   Central to our proof will be the solution of equational systems
   involving equivalence classes of languages. For this we will use Arden's Lemma 
-  (see for example \cite[Page 100]{Sakarovitch09}),
+  (see for example \citet[Page 100]{Sakarovitch09}),
   which solves equations of the form @{term "X = A \<cdot> X \<union> B"} provided
   @{term "[] \<notin> A"}. However we will need the following `reversed' 
   version of Arden's Lemma (`reversed' in the sense of changing the order of @{term "A \<cdot> X"} to
@@ -659,16 +659,18 @@
   It is easy to see that @{term "\<approx>A"} is an equivalence relation, which
   partitions the set of all strings, @{text "UNIV"}, into a set of disjoint
   equivalence classes. To illustrate this quotient construction, let us give a simple 
-  example: consider the regular language containing just
-  the string @{text "[c]"}. The relation @{term "\<approx>({[c]})"} partitions @{text UNIV}
-  into three equivalence classes @{text "X\<^isub>1"}, @{text "X\<^isub>2"} and  @{text "X\<^isub>3"}
+  example: consider the regular language built up over the alphabet @{term "{a, b}"} and 
+  containing just the string two strings @{text "[a]"} and @{text "[a, b]"}. The 
+  relation @{term "\<approx>({[a], [a, b]})"} partitions @{text UNIV}
+  into four equivalence classes @{text "X\<^isub>1"}, @{text "X\<^isub>2"}, @{text "X\<^isub>3"} and @{text "X\<^isub>4"}
   as follows
   
   \begin{center}
   \begin{tabular}{l}
   @{text "X\<^isub>1 = {[]}"}\\
-  @{text "X\<^isub>2 = {[c]}"}\\
-  @{text "X\<^isub>3 = UNIV - {[], [c]}"}
+  @{text "X\<^isub>2 = {[a]}"}\\
+  @{text "X\<^isub>3 = {[a, b]}"}\\
+  @{text "X\<^isub>4 = UNIV - {[], [a], [a, b]}"}
   \end{tabular}
   \end{center}
 
@@ -689,8 +691,8 @@
   \end{equation}
 
   \noindent
-  In our running example, @{text "X\<^isub>2"} is the only 
-  equivalence class in @{term "finals {[c]}"}.
+  In our running example, @{text "X\<^isub>2"} and @{text "X\<^isub>3"} are the only 
+  equivalence classes in @{term "finals {[a], [a, b]}"}.
   It is straightforward to show that in general 
   %
   \begin{equation}\label{finalprops}
@@ -723,10 +725,16 @@
   %Note that we do not define formally an automaton here, 
   %we merely relate two sets
   %(with the help of a character). 
-  In our concrete example we have 
-  @{term "X\<^isub>1 \<Turnstile>c\<Rightarrow> X\<^isub>2"}, @{term "X\<^isub>1 \<Turnstile>d\<^isub>i\<Rightarrow> X\<^isub>3"} with @{text "d\<^isub>i"} being any 
-  other character than @{text c}, and @{term "X\<^isub>3 \<Turnstile>c\<^isub>j\<Rightarrow> X\<^isub>3"} for any 
-  character @{text "c\<^isub>j"}.
+  In our concrete example we have
+
+  \begin{center} 
+  \begin{tabular}{l}
+  @{term "X\<^isub>1 \<Turnstile>a\<Rightarrow> X\<^isub>2"},\; @{term "X\<^isub>1 \<Turnstile>b\<Rightarrow> X\<^isub>4"};\\ 
+  @{term "X\<^isub>2 \<Turnstile>b\<Rightarrow> X\<^isub>3"},\; @{term "X\<^isub>2 \<Turnstile>a\<Rightarrow> X\<^isub>4"};\\
+  @{term "X\<^isub>3 \<Turnstile>a\<Rightarrow> X\<^isub>4"},\; @{term "X\<^isub>3 \<Turnstile>b\<Rightarrow> X\<^isub>4"} and\\ 
+  @{term "X\<^isub>4 \<Turnstile>a\<Rightarrow> X\<^isub>4"},\; @{term "X\<^isub>4 \<Turnstile>b\<Rightarrow> X\<^isub>4"}.
+  \end{tabular}
+  \end{center}
   
   Next we construct an \emph{initial equational system} that
   contains an equation for each equivalence class. We first give
@@ -773,17 +781,14 @@
   \begin{equation}\label{exmpcs}
   \mbox{\begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
   @{term "X\<^isub>1"} & @{text "="} & @{text "\<lambda>(ONE)"}\\
-  @{term "X\<^isub>2"} & @{text "="} & @{text "(X\<^isub>1, ATOM c)"}\\
-  @{term "X\<^isub>3"} & @{text "="} & @{text "(X\<^isub>1, ATOM d\<^isub>1) + \<dots> + (X\<^isub>1, ATOM d\<^isub>n)"}\\
-               & & \mbox{}\hspace{10mm}@{text "+ (X\<^isub>3, ATOM c\<^isub>1) + \<dots> + (X\<^isub>3, ATOM c\<^isub>m)"}
+  @{term "X\<^isub>2"} & @{text "="} & @{text "(X\<^isub>1, ATOM a)"}\\
+  @{term "X\<^isub>3"} & @{text "="} & @{text "(X\<^isub>2, ATOM b)"}\\
+  @{term "X\<^isub>4"} & @{text "="} & @{text "(X\<^isub>1, ATOM b) + (X\<^isub>2, ATOM a) + (X\<^isub>3, ATOM a)"}\\
+          & & \mbox{}\hspace{0mm}@{text "+ (X\<^isub>3, ATOM b) + (X\<^isub>4, ATOM a) + (X\<^isub>4, ATOM b)"}\\
   \end{tabular}}
   \end{equation}
-  
+
   \noindent
-  where @{text "d\<^isub>1\<dots>d\<^isub>n"} is the sequence of all characters
-  but not containing @{text c}, and @{text "c\<^isub>1\<dots>c\<^isub>m"} is the sequence of all
-  characters.  
-
   Overloading the function @{text \<calL>} for the two kinds of terms in the
   equational system, we have
   
@@ -883,26 +888,23 @@
   then we calculate the combined regular expressions for all @{text r} coming 
   from the deleted @{text "(X, r)"}, and take the @{const Star} of it;
   finally we append this regular expression to @{text rhs'}. If we apply this
-  operation to the right-hand side of @{text "X\<^isub>3"} in \eqref{exmpcs}, we obtain
+  operation to the right-hand side of @{text "X\<^isub>4"} in \eqref{exmpcs}, we obtain
   the equation:
 
   \begin{center}
   \begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
-  @{term "X\<^isub>3"} & @{text "="} & 
-    @{text "(X\<^isub>1, TIMES (ATOM d\<^isub>1) (STAR \<^raw:\ensuremath{\bigplus}>{ATOM c\<^isub>1,\<dots>, ATOM c\<^isub>m})) + \<dots> "}\\
-  & & \mbox{}\hspace{13mm}
-     @{text "\<dots> + (X\<^isub>1, TIMES (ATOM d\<^isub>n) (STAR \<^raw:\ensuremath{\bigplus}>{ATOM c\<^isub>1,\<dots>, ATOM c\<^isub>m}))"}
+  @{term "X\<^isub>4"} & @{text "="} & 
+    @{text "(X\<^isub>1, TIMES (ATOM b) (STAR \<^raw:\ensuremath{\bigplus}>{ATOM a, ATOM b})) +"}\\
+  & & @{text "(X\<^isub>2, TIMES (ATOM a) (STAR \<^raw:\ensuremath{\bigplus}>{ATOM a, ATOM b})) +"}\\
+  & & @{text "(X\<^isub>3, TIMES (ATOM a) (STAR \<^raw:\ensuremath{\bigplus}>{ATOM a, ATOM b})) +"}\\
+  & & @{text "(X\<^isub>3, TIMES (ATOM b) (STAR \<^raw:\ensuremath{\bigplus}>{ATOM a, ATOM b}))"}
   \end{tabular}
   \end{center}
 
 
   \noindent
-  That means we eliminated the recursive occurrence of @{text "X\<^isub>3"} on the
-  right-hand side.  Note we used the abbreviation 
-  @{text "\<^raw:\ensuremath{\bigplus}>{ATOM c\<^isub>1,\<dots>, ATOM c\<^isub>m}"} 
-  to stand for a regular expression that matches with every character. In 
-  our algorithm we are only interested in the existence of such a regular expression
-  and do not specify it any further. 
+  That means we eliminated the recursive occurrence of @{text "X\<^isub>4"} on the
+  right-hand side. 
 
   It can be easily seen that the @{text "Arden"}-operation mimics Arden's
   Lemma on the level of equations. To ensure the non-emptiness condition of
@@ -951,7 +953,7 @@
   %
   \begin{equation}\label{exmpresult}
   \mbox{\begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
-  @{term "X\<^isub>2"} & @{text "="} & @{text "\<lambda>(TIMES ONE (ATOM c))"}
+  @{term "X\<^isub>2"} & @{text "="} & @{text "\<lambda>(TIMES ONE (ATOM a))"}
   \end{tabular}}
   \end{equation}
 
@@ -1194,7 +1196,15 @@
 
   \begin{proof}[of Theorem~\ref{myhillnerodeone}]
   By Lemma~\ref{every_eqcl_has_reg} we know that there exists a regular expression for
-  every equivalence class in @{term "UNIV // \<approx>A"}. Since @{text "finals A"} is
+  every equivalence class in @{term "UNIV // \<approx>A"}:
+
+  \[
+  \mbox{if}\;@{term "X \<in> (UNIV // \<approx>A)"}\;\mbox{then there exists a}\; 
+  @{text "r"}\;\mbox{such that}\;@{term "X = lang r"}
+  \]\smallskip
+
+  \noindent
+  Since @{text "finals A"} is
   a subset of  @{term "UNIV // \<approx>A"}, we also know that for every equivalence class
   in @{term "finals A"} there exists a regular expression. Moreover by assumption 
   we know that @{term "finals A"} must be finite, and therefore there must be a finite
@@ -1205,12 +1215,27 @@
   \end{proof}
 
   \noindent
+  Solving the equational system of our running example gives as solution for the 
+  two final equivalence classes:
+
+  \begin{center}
+  \begin{tabular}{r@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
+  @{text "X\<^isub>2"} & @{text "="} & @{text "TIMES ONE (ATOM a)"}\\
+  @{text "X\<^isub>3"} & @{text "="} & @{text "TIMES (TIMES ONE (ATOM a)) (ATOM b)"}
+  \end{tabular}
+  \end{center}
+
+  \noindent
+  Combining them with $\bigplus$ gives us a regular expression for the language @{text "{[a], [a, b]}"}.
+
   Note that our algorithm for solving equational systems provides also a method for
   calculating a regular expression for the complement of a regular language:
   if we combine all regular
-  expressions corresponding to equivalence classes not in @{term "finals A"},
+  expressions corresponding to equivalence classes not in @{term "finals A"} 
+  (in the running example @{text "X\<^isub>1"} and @{text "X\<^isub>4"}),
   then we obtain a regular expression for the complement language @{term "- A"}.
   This is similar to the usual construction of a `complement automaton'.
+
 *}
 
 section {* Myhill-Nerode, Second Part *}
@@ -1296,10 +1321,13 @@
 
   \noindent
   For constructing @{text R}, we will rely on some \emph{tagging-functions}
-  defined over strings. Given the inductive hypothesis, it will be easy to
+  defined over strings, see Fig.~\ref{tagfig}. They are parameterised by sets
+  of strings @{text A} and @{text B} standing for the induction hypotheses.
+  Given the inductive hypotheses, it will be easy to
   prove that the \emph{range} of these tagging-functions is finite. The range
   of a function @{text f} is defined as
 
+ 
   \begin{center}
   @{text "range f \<equiv> f ` UNIV"}
   \end{center}
@@ -1334,6 +1362,24 @@
   is finite, then the set @{text A} itself must be finite. We can use it to establish the following 
   two lemmas.
 
+  \begin{figure}[t]
+  \normalsize
+  \begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
+  @{thm (lhs) tag_Plus_def[where A="A" and B="B", THEN meta_eq_app]} &
+  @{text "\<equiv>"} &
+  @{thm (rhs) tag_Plus_def[where A="A" and B="B", THEN meta_eq_app]} \\
+  @{thm (lhs) tag_Times_def[where ?A="A" and ?B="B"]}\;@{text "x"} & @{text "\<equiv>"} &
+  @{text "(\<lbrakk>x\<rbrakk>\<^bsub>\<approx>A\<^esub>, {\<lbrakk>x\<^isub>s\<rbrakk>\<^bsub>\<approx>B\<^esub> | x\<^isub>p \<in> A \<and> (x\<^isub>p, x\<^isub>s) \<in> Partitions x})"}\\
+  @{thm (lhs) tag_Star_def[where ?A="A", THEN meta_eq_app]} & @{text "\<equiv>"} &
+  @{text "{\<lbrakk>x\<^isub>s\<rbrakk>\<^bsub>\<approx>A\<^esub> | x\<^isub>p < x \<and> x\<^isub>p \<in> A\<^isup>\<star> \<and> (x\<^isub>p, x\<^isub>s) \<in> Partitions x}"}
+  \end{tabular}
+  \caption{Three tagging functions used the cases for @{term "PLUS"}, @{term "TIMES"} and @{term "STAR"} 
+  regular expressions. The sets @{text A} and @{text B} are arbitrary languages instantiated
+  in the proof with the induction hypotheses. \emph{Partitions} is a function
+  that generates all possible partitions of a string.\label{tagfig}}
+  \end{figure}
+
+
   \begin{lemma}\label{finone}
   @{thm[mode=IfThen] finite_eq_tag_rel}
   \end{lemma}
@@ -1379,7 +1425,7 @@
   Chaining Lemma~\ref{finone} and \ref{fintwo} together, means in order to show
   that @{term "UNIV // \<approx>(lang r)"} is finite, we have to construct a tagging-function whose
   range can be shown to be finite and whose tagging-relation refines @{term "\<approx>(lang r)"}.
-  Let us attempt the @{const PLUS}-case first. We take as tagging-function 
+  Let us attempt the @{const PLUS}-case first. We take as tagging-function from Fig.~\ref{tagfig}
  
   \begin{center}
   @{thm tag_Plus_def[where A="A" and B="B", THEN meta_eq_app]}
@@ -1510,7 +1556,7 @@
   tagging-function in the @{const Times}-case as:
 
   \begin{center}
-  @{thm (lhs) tag_Times_def[where ?A="A" and ?B="B"]}~@{text "\<equiv>"}~
+  @{thm (lhs) tag_Times_def[where ?A="A" and ?B="B"]}\;@{text "x"}~@{text "\<equiv>"}~
   @{text "(\<lbrakk>x\<rbrakk>\<^bsub>\<approx>A\<^esub>, {\<lbrakk>x\<^isub>s\<rbrakk>\<^bsub>\<approx>B\<^esub> | x\<^isub>p \<in> A \<and> (x\<^isub>p, x\<^isub>s) \<in> Partitions x})"}
   \end{center}
 
@@ -1679,6 +1725,58 @@
   A\<star>"}, which means @{term "y @ z \<in> A\<star>"}. The last step is to set
   @{text "A"} to @{term "lang r"} and thus complete the proof.
   \end{proof}
+
+  \begin{rmk}
+  While our proof using tagging functions might seem like a rabbit pulled 
+  out of a hat, the intuition behind can be rationalised taking the 
+  view that the equivalence classes @{term "UNIV // \<approx>(lang r)"} stand for the 
+  states of the minimal automaton for the language @{term "lang r"}. The theorem 
+  amounts to showing that this minimal automaton has finitely many states. 
+  However, by showing that our @{text "\<^raw:$\threesim$>\<^bsub>tag\<^esub>"} relation 
+  refines @{term "\<approx>A"} we do not actually have to show that the minimal automata
+  has finitely many states, but only that we can show finiteness for an 
+  automaton with at least as many states and which can recognise the same 
+  language. By performing the induction over the regular expression @{text r},
+  this means we have to construct inductively an automaton in
+  the cases for @{term PLUS}, @{term TIMES} and @{term STAR}.
+
+  In the @{text PLUS}-case, we can assume finitely many equivalence classes of the form
+  @{text "\<lbrakk>_\<rbrakk>\<^bsub>\<approx>A\<^esub>"} and @{text "\<lbrakk>_\<rbrakk>\<^bsub>\<approx>B\<^esub>"}, each standing for an automaton recognising the 
+  languages @{text A} and @{text B} respectively. 
+  The @{text "+tag"} function constructs pairs of the form @{text "(\<lbrakk>_\<rbrakk>\<^bsub>\<approx>A\<^esub>, \<lbrakk>_\<rbrakk>\<^bsub>\<approx>B\<^esub>)"} 
+  which can be seen as the states of an automaton recognising the language 
+  \mbox{@{term "A \<union> B"}}. This is the usual product construction of automata: 
+  Given a string @{text x}, the first automata will be in the state  @{text "\<lbrakk>x\<rbrakk>\<^bsub>\<approx>A\<^esub>"}
+  after recognising @{text x} (similarly @{text "\<lbrakk>x\<rbrakk>\<^bsub>\<approx>B\<^esub>"} for the other automaton). The product
+  automaton will be in the state \mbox{@{text "(\<lbrakk>x\<rbrakk>\<^bsub>\<approx>A\<^esub>, \<lbrakk>x\<rbrakk>\<^bsub>\<approx>B\<^esub>)"}}, which is accepting
+  if at least one component is an accepting state.
+
+  The @{text "TIMES"}-case is a bit more complicated---essentially we 
+  need to sequentially compose the two automata with the states @{text "\<lbrakk>_\<rbrakk>\<^bsub>\<approx>A\<^esub>"}, 
+  respectively @{text "\<lbrakk>_\<rbrakk>\<^bsub>\<approx>B\<^esub>"}. We achieve this sequential composition by
+  taking the first automaton @{text "\<lbrakk>_\<rbrakk>\<^bsub>\<approx>A\<^esub>"} and append on each of its accepting
+  state the automaton @{text "\<lbrakk>_\<rbrakk>\<^bsub>\<approx>B\<^esub>"}. Unfortunately, this does not lead directly
+  to a correct composition, since the automaton @{text "\<lbrakk>_\<rbrakk>\<^bsub>\<approx>A\<^esub>"} might have consumed
+  some of the input needed for the automaton @{text "\<lbrakk>_\<rbrakk>\<^bsub>\<approx>B\<^esub>"} to succeed. The 
+  textbook construction for sequentially composing two automata circumvents this
+  problem by connecting the terminating states of the first automaton via
+  an epsilon-transition to the initial state of the second automaton (see for
+  example \citeN{Kozen97}). In the absence of any epsilon-transition analogue in our work, 
+  we let the second automaton
+  start in all possible states that may have been reached if the first
+  automaton had not consumed the input meant for the second. We achieve this
+  by having a set of states as the second component generated by our 
+  @{text "\<times>tag"} function (see second clause in Fig.~\ref{tagfig}).
+  A state of this sequentially composed automaton is accepting, if the first 
+  component is accepting and at least one state in the set is also accepting.
+
+  The idea behind the @{text "STAR"}-case is similar to the @{text "TIMES"}-case.
+  We assume some automaton has consumed some strictly smaller part of the input;
+  we need to check that from the state we ended up in a terminal state in the 
+  automaton @{text "\<lbrakk>_\<rbrakk>\<^bsub>\<approx>A\<^esub>"}. Since we do not know from which state this will 
+  succeed, we need to run the automaton from all possible states we could have 
+  ended up in. Therefore the @{text "\<star>tag"} function generates a set of states.
+  \end{rmk}
 *}
 
 section {* Second Part proved using Partial Derivatives *}
@@ -2211,12 +2309,12 @@
   %
   \begin{equation}\label{supseqprops}
   \mbox{\begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
-  @{thm (lhs) SUPSEQ_simps(1)} & @{text "\<equiv>"} & @{thm (rhs) SUPSEQ_simps(1)}\\  
-  @{thm (lhs) SUPSEQ_simps(2)} & @{text "\<equiv>"} & @{thm (rhs) SUPSEQ_simps(2)}\\ 
-  @{thm (lhs) SUPSEQ_atom} & @{text "\<equiv>"} & @{thm (rhs) SUPSEQ_atom}\\ 
-  @{thm (lhs) SUPSEQ_union} & @{text "\<equiv>"} & @{thm (rhs) SUPSEQ_union}\\
-  @{thm (lhs) SUPSEQ_conc} & @{text "\<equiv>"} & @{thm (rhs) SUPSEQ_conc}\\
-  @{thm (lhs) SUPSEQ_star} & @{text "\<equiv>"} & @{thm (rhs) SUPSEQ_star}
+  @{thm (lhs) SUPSEQ_simps(1)} & @{text "="} & @{thm (rhs) SUPSEQ_simps(1)}\\  
+  @{thm (lhs) SUPSEQ_simps(2)} & @{text "="} & @{thm (rhs) SUPSEQ_simps(2)}\\ 
+  @{thm (lhs) SUPSEQ_atom} & @{text "="} & @{thm (rhs) SUPSEQ_atom}\\ 
+  @{thm (lhs) SUPSEQ_union} & @{text "="} & @{thm (rhs) SUPSEQ_union}\\
+  @{thm (lhs) SUPSEQ_conc} & @{text "="} & @{thm (rhs) SUPSEQ_conc}\\
+  @{thm (lhs) SUPSEQ_star} & @{text "="} & @{thm (rhs) SUPSEQ_star}
   \end{tabular}}
   \end{equation}
 
@@ -2365,24 +2463,9 @@
   exists a regular expression that matches all of its strings. Regular
   expressions can conveniently be defined as a datatype in theorem
   provers. For us it was therefore interesting to find out how far we can push
-  this point of view. But this question whether formal language theory can
-  be done without automata crops up also in non-theorem prover contexts. For 
-  example, Gasarch asked in the Computational Complexity blog by \citeN{GasarchBlog} 
-  whether the complementation of 
-  regular-expression languages can be proved without using automata. He concluded
- 
-  \begin{quote}
-  {\it \ldots it can't be done}
-  \end{quote} 
-
-  \noindent
-  and even pondered
-
-  \begin{quote}
-  {\it \ldots [b]ut is there a rigorous way to even state that?} 
-  \end{quote} 
-
-  \noindent
+  this point of view. But this question whether regular language theory can
+  be done without automata crops up also in non-theorem prover contexts. Recall 
+  Gasarch's speculation cited in the Introduction. 
   We have established in Isabelle/HOL both directions 
   of the Myhill-Nerode Theorem.
   %
@@ -2398,7 +2481,7 @@
   Pumping Lemma).  We can also use it to establish the standard
   textbook results about closure properties of regular languages. The case of 
   closure under complement follows easily from the Myhill-Nerode Theorem.
-  So our answer to Gasarch is ``{\it it can be done!''}  
+  So our answer to Gasarch is ``{\it yes we can!''}  
 
   %Our insistence on regular expressions for proving the Myhill-Nerode Theorem
   %arose from the problem of defining formally the regularity of a language.
@@ -2465,15 +2548,15 @@
   in the `pencil-and-paper-reasoning' literature about our way of proving the 
   first direction of the Myhill-Nerode Theorem, but it appears to be folklore.
 
-  We presented two proofs for the second direction of the Myhill-Nerode
-  Theorem. One direct proof using tagging-functions and another using partial
-  derivatives. This part of our work is where our method using regular
-  expressions shines, because we can completely side-step the standard
-  argument (for example used by \citeN{Kozen97}) where automata need to be composed. However, it is
-  also the direction where we had to spend most of the `conceptual' time, as
-  our first proof based on tagging-functions is new for establishing the
-  Myhill-Nerode Theorem. All standard proofs of this direction proceed by
-  arguments over automata.
+  We presented two proofs for the second direction of the
+  Myhill-Nerode Theorem. One direct proof using tagging-functions and
+  another using partial derivatives. This part of our work is where
+  our method using regular expressions shines, because we can perform
+  an induction on the structure of refular expressions. However, it is
+  also the direction where we had to spend most of the `conceptual'
+  time, as our first proof based on tagging-functions is new for
+  establishing the Myhill-Nerode Theorem. All standard proofs of this
+  direction proceed by arguments over automata.
   
   The indirect proof for the second direction arose from our interest in
   Brzozowski's derivatives for regular expression matching.  While \citeN{Brzozowski64} 
diff -r c028b07a4fa8 -r 99161cd17c0f Journal/document/root.tex
--- a/Journal/document/root.tex	Wed Jul 10 13:28:24 2013 +0100
+++ b/Journal/document/root.tex	Wed Jul 10 17:47:30 2013 +0100
@@ -42,7 +42,7 @@
 
 \newcommand{\bigplus}{\mbox{\Large\bf$+$}}
 
-%\spnewtheorem{thrm}{Theorem}[section]{\bf}{\it}
+\spnewtheorem*{rmk}{Remark}{\bf}{}
 %\spnewtheorem{lmm}{Lemma}[section]{\bf}{\it}
 %\spnewtheorem{dfntn}{Definition}[section]{\bf}{\it}
 %\spnewtheorem{prpstn}{Proposition}[section]{\bf}{\it}
diff -r c028b07a4fa8 -r 99161cd17c0f journal.pdf
Binary file journal.pdf has changed