diff -r cf1c17431dab -r b300f2c5d51d Journal/Paper.thy --- a/Journal/Paper.thy Mon Aug 15 22:36:26 2011 +0000 +++ b/Journal/Paper.thy Tue Aug 16 10:21:14 2011 +0000 @@ -74,9 +74,9 @@ Der ("Der _ _" [1000, 1000] 100) and ONE ("ONE" 999) and ZERO ("ZERO" 999) and - pders_lang ("pderl") and + pders_lang ("pdersl") and UNIV1 ("UNIV\<^isup>+") and - Ders_lang ("Derl") + Ders_lang ("Dersl") lemma meta_eq_app: shows "f \ \x. g x \ f x \ g x" @@ -159,7 +159,8 @@ The typical approach to regular languages is to introduce finite automata and then define everything in terms of them - \cite{Kozen97}. For example, a regular language is normally defined as: + \cite{ HopcroftUllman69,Kozen97}. For example, a regular language is + normally defined as: \begin{dfntn}\label{baddef} A language @{text A} is \emph{regular}, provided there is a @@ -251,22 +252,21 @@ \noindent changes the type---the disjoint union is not a set, but a set of pairs. Using this definition for disjoint union means we do not have a - single type for automata. As a result we will not be able to define a + single type for the states of automata. As a result we will not be able to define a regular language as one for which there exists an automaton that recognises all its strings (Definition~\ref{baddef}). This is because we cannot make a - definition in HOL that is polymorphic in the state type and there is no type + definition in HOL that is only polymorphic in the state type and there is no type quantification available in HOL (unlike in Coq, for example).\footnote{Slind already pointed out this problem in an email to the HOL4 mailing list on 21st April 2005.} - - An alternative, which provides us with a single type for automata, is to give every - state node an identity, for example a natural - number, and then be careful to rename these identities apart whenever - connecting two automata. This results in clunky proofs - establishing that properties are invariant under renaming. Similarly, - connecting two automata represented as matrices results in very adhoc - constructions, which are not pleasant to reason about. + An alternative, which provides us with a single type for states of automata, + is to give every state node an identity, for example a natural number, and + then be careful to rename these identities apart whenever connecting two + automata. This results in clunky proofs establishing that properties are + invariant under renaming. Similarly, connecting two automata represented as + matrices results in very adhoc constructions, which are not pleasant to + reason about. Functions are much better supported in Isabelle/HOL, but they still lead to similar problems as with graphs. Composing, for example, two non-deterministic automata in parallel @@ -434,7 +434,7 @@ Central to our proof will be the solution of equational systems involving equivalence classes of languages. For this we will use Arden's Lemma - (see \cite[Page 100]{Sakarovitch09}), + (see for example \cite[Page 100]{Sakarovitch09}), which solves equations of the form @{term "X = A \ X \ B"} provided @{term "[] \ A"}. However we will need the following `reverse' version of Arden's Lemma (`reverse' in the sense of changing the order of @{term "A \ X"} to @@ -448,7 +448,7 @@ \begin{proof} For the right-to-left direction we assume @{thm (rhs) arden} and show - that @{thm (lhs) arden} holds. From Prop.~\ref{langprops}@{text "(i)"} + that @{thm (lhs) arden} holds. From Property~\ref{langprops}@{text "(i)"} we have @{term "A\ = A \ A\ \ {[]}"}, which is equal to @{term "A\ = A\ \ A \ {[]}"}. Adding @{text B} to both sides gives @{term "B \ A\ = B \ (A\ \ A \ {[]})"}, whose right-hand side @@ -467,12 +467,12 @@ of @{text "\"}. For the inclusion in the other direction we assume a string @{text s} with length @{text k} is an element in @{text X}. Since @{thm (prem 1) arden} - we know by Prop.~\ref{langprops}@{text "(ii)"} that + we know by Property~\ref{langprops}@{text "(ii)"} that @{term "s \ X \ (A \ Suc k)"} since its length is only @{text k} (the strings in @{term "X \ (A \ Suc k)"} are all longer). From @{text "(*)"} it follows then that @{term s} must be an element in @{term "(\m\{0..k}. B \ (A \ m))"}. This in turn - implies that @{term s} is in @{term "(\n. B \ (A \ n))"}. Using Prop.~\ref{langprops}@{text "(iii)"} + implies that @{term s} is in @{term "(\n. B \ (A \ n))"}. Using Property~\ref{langprops}@{text "(iii)"} this is equal to @{term "B \ A\"}, as we needed to show. \end{proof} @@ -601,7 +601,7 @@ that matches every string in @{text A}. - Our proof of Thm.~\ref{myhillnerodeone} relies on a method that can calculate a + Our proof of Theorem~\ref{myhillnerodeone} relies on a method that can calculate a regular expression for \emph{every} equivalence class, not just the ones in @{term "finals A"}. We first define the notion of \emph{one-character-transition} between @@ -733,9 +733,9 @@ \end{lmm} \noindent - Our proof of Thm.~\ref{myhillnerodeone} will proceed by transforming the + Our proof of Theorem~\ref{myhillnerodeone} will proceed by transforming the initial equational system into one in \emph{solved form} maintaining the invariant - in Lem.~\ref{inv}. From the solved form we will be able to read + in Lemma~\ref{inv}. From the solved form we will be able to read off the regular expressions. In order to transform an equational system into solved form, we have two @@ -891,16 +891,15 @@ \end{center} \noindent - We are not concerned here with the definition of this operator - (see Berghofer and Nipkow \cite{BerghoferNipkow00}), but note that we eliminate - in each @{const Iter}-step a single equation, and therefore - have a well-founded termination order by taking the cardinality - of the equational system @{text ES}. This enables us to prove - properties about our definition of @{const Solve} when we `call' it with - the equivalence class @{text X} and the initial equational system - @{term "Init (UNIV // \A)"} from + We are not concerned here with the definition of this operator (see + Berghofer and Nipkow \cite{BerghoferNipkow00} for example), but note that we + eliminate in each @{const Iter}-step a single equation, and therefore have a + well-founded termination order by taking the cardinality of the equational + system @{text ES}. This enables us to prove properties about our definition + of @{const Solve} when we `call' it with the equivalence class @{text X} and + the initial equational system @{term "Init (UNIV // \A)"} from \eqref{initcs} using the principle: - % + \begin{equation}\label{whileprinciple} \mbox{\begin{tabular}{l} @{term "invariant (Init (UNIV // \A))"} \\ @@ -964,7 +963,7 @@ \begin{proof} Finiteness is given by the assumption and the way how we set up the - initial equational system. Soundness is proved in Lem.~\ref{inv}. Distinctness + initial equational system. Soundness is proved in Lemma~\ref{inv}. Distinctness follows from the fact that the equivalence classes are disjoint. The @{text ardenable} property also follows from the setup of the initial equational system, as does validity. @@ -993,7 +992,7 @@ \noindent Finiteness is straightforward, as the @{const Subst} and @{const Arden} operations keep the equational system finite. These operations also preserve soundness - and distinctness (we proved soundness for @{const Arden} in Lem.~\ref{ardenable}). + and distinctness (we proved soundness for @{const Arden} in Lemma~\ref{ardenable}). The property @{text ardenable} is clearly preserved because the append-operation cannot make a regular expression to match the empty string. Validity is given because @{const Arden} removes an equivalence class from @{text yrhs} @@ -1032,16 +1031,16 @@ \begin{proof} In order to prove this lemma using \eqref{whileprinciple}, we have to use a slightly - stronger invariant since Lem.~\ref{iterone} and \ref{itertwo} have the precondition + stronger invariant since Lemma~\ref{iterone} and \ref{itertwo} have the precondition that @{term "(X, rhs) \ ES"} for some @{text rhs}. This precondition is needed in order to choose in the @{const Iter}-step an equation that is not \mbox{@{term "X = rhs"}}. Therefore our invariant cannot be just @{term "invariant ES"}, but must be @{term "invariant ES \ (\rhs. (X, rhs) \ ES)"}. By assumption - @{thm (prem 2) Solve} and Lem.~\ref{invzero}, the more general invariant holds for + @{thm (prem 2) Solve} and Lemma~\ref{invzero}, the more general invariant holds for the initial equational system. This is premise 1 of~\eqref{whileprinciple}. - Premise 2 is given by Lem.~\ref{iterone} and the fact that @{const Iter} might + Premise 2 is given by Lemma~\ref{iterone} and the fact that @{const Iter} might modify the @{text rhs} in the equation @{term "X = rhs"}, but does not remove it. - Premise 3 of~\eqref{whileprinciple} is by Lem.~\ref{itertwo}. Now in premise 4 + Premise 3 of~\eqref{whileprinciple} is by Lemma~\ref{itertwo}. Now in premise 4 we like to show that there exists a @{text rhs} such that @{term "ES = {(X, rhs)}"} and that @{text "invariant {(X, rhs)}"} holds, provided the condition @{text "Cond"} does not holds. By the stronger invariant we know there exists such a @{text "rhs"} @@ -1066,7 +1065,7 @@ and that the invariant holds for this equation. That means we know @{text "X = \\ ` rhs"}. We further know that this is equal to \mbox{@{text "\\ ` (Arden X rhs)"}} using the properties of the - invariant and Lem.~\ref{ardenable}. Using the validity property for the equation @{text "X = rhs"}, + invariant and Lemma~\ref{ardenable}. Using the validity property for the equation @{text "X = rhs"}, we can infer that @{term "rhss rhs \ {X}"} and because the @{text Arden} operation removes that @{text X} from @{text rhs}, that @{term "rhss (Arden X rhs) = {}"}. This means the right-hand side @{term "Arden X rhs"} can only consist of terms of the form @{term "Lam r"}. @@ -1075,11 +1074,11 @@ \end{proof} \noindent - Lem.~\ref{every_eqcl_has_reg} allows us to finally give a proof for the first direction + Lemma~\ref{every_eqcl_has_reg} allows us to finally give a proof for the first direction of the Myhill-Nerode theorem. - \begin{proof}[of Thm.~\ref{myhillnerodeone}] - By Lem.~\ref{every_eqcl_has_reg} we know that there exists a regular expression for + \begin{proof}[Proof of Theorem~\ref{myhillnerodeone}] + By Lemma~\ref{every_eqcl_has_reg} we know that there exists a regular expression for every equivalence class in @{term "UNIV // \A"}. Since @{text "finals A"} is a subset of @{term "UNIV // \A"}, we also know that for every equivalence class in @{term "finals A"} there exists a regular expression. Moreover by assumption @@ -1175,11 +1174,10 @@ \end{dfntn} \noindent - For constructing @{text R} will - rely on some \emph{tagging-functions} defined over strings. Given the - inductive hypothesis, it will be easy to prove that the \emph{range} of - these tagging-functions is finite. The range of a function @{text f} is - defined as + For constructing @{text R}, will rely on some \emph{tagging-functions} + defined over strings. Given the inductive hypothesis, it will be easy to + prove that the \emph{range} of these tagging-functions is finite. The range + of a function @{text f} is defined as \begin{center} @{text "range f \ f ` UNIV"} @@ -1225,10 +1223,11 @@ finite by assumption. Now @{term "f (UNIV // =tag=)"} is a subset of @{text "range f"}, and so also finite. Injectivity amounts to showing that @{text "X = Y"} under the assumptions that @{text "X, Y \ "}~@{term "UNIV // =tag="} and @{text "f X = f Y"}. - From the assumptions we can obtain @{text "x \ X"} and @{text "y \ Y"} with + From the assumptions we obtain \mbox{@{text "x \ X"}} and @{text "y \ Y"} with @{text "tag x = tag y"}. Since @{text x} and @{text y} are tag-related, this in turn means that the equivalence classes @{text X} - and @{text Y} must be equal. + and @{text Y} must be equal. Therefore \eqref{finiteimageD} allows us to conclude + with @{thm (concl) finite_eq_tag_rel}. \end{proof} \begin{lmm}\label{fintwo} @@ -1256,7 +1255,7 @@ \end{proof} \noindent - Chaining Lem.~\ref{finone} and \ref{fintwo} together, means in order to show + Chaining Lemma~\ref{finone} and \ref{fintwo} together, means in order to show that @{term "UNIV // \(lang r)"} is finite, we have to construct a tagging-function whose range can be shown to be finite and whose tagging-relation refines @{term "\(lang r)"}. Let us attempt the @{const PLUS}-case first. We take as tagging-function @@ -1310,7 +1309,7 @@ and respectively to @{text "x\<^isub>s"} as the \emph{suffix}. - Now assuming @{term "x @ z \ A \ B"} there are only two possible ways of how to `split' + Now assuming @{term "x @ z \ A \ B"}, there are only two possible ways of how to `split' this string to be in @{term "A \ B"}: % \begin{center} @@ -1391,7 +1390,7 @@ \end{center} \noindent - We have to make the assumption for all suffixes @{text "x\<^isub>s"}, since we do + Note that we have to make the assumption for all suffixes @{text "x\<^isub>s"}, since we do not know anything about how the string @{term x} is partitioned. With this definition in place, let us prove the @{const "Times"}-case. @@ -1502,7 +1501,7 @@ There are potentially many such prefixes, but there can only be finitely many of them (the string @{text x} is finite). Let us therefore choose the longest one and call it @{text "x\<^bsub>pmax\<^esub>"}. Now for the rest of the string @{text "x\<^isub>s @ z"} we - know it is in @{term "A\"} and cannot be the empty string. By Prop.~\ref{langprops}@{text "(iv)"}, + know it is in @{term "A\"} and cannot be the empty string. By Property~\ref{langprops}@{text "(iv)"}, we can separate this string into two parts, say @{text "a"} and @{text "b"}, such that @{text "a \ []"}, @{text "a \ A"} and @{term "b \ A\"}. Now @{text a} must be strictly longer than @{text "x\<^isub>s"}, @@ -1616,16 +1615,19 @@ \end{equation} \noindent - where @{text "\"} in the fifth line is a function that tests whether the empty string - is in the language and returns @{term "{[]}"} or @{term "{}"}, accordingly. - The only interesting case is the last one where we use Prop.~\ref{langprops}@{text "(i)"} - in order to infer that @{term "Der c (A\) = Der c (A \ A\)"}. We can - then complete the proof by noting that @{term "Delta A \ Der c (A\) \ (Der c A) \ A\"}. - - Brzozowski observed that the left-quotients for languages of regular - expressions can be calculated directly using the notion of \emph{derivatives - of a regular expression} \cite{Brzozowski64}. We define this notion in - Isabelle/HOL as follows: + where @{text "\"} in the fifth line is a function that tests whether the + empty string is in the language and returns @{term "{[]}"} or @{term "{}"}, + accordingly. In the last equation we use the list-cons operator written + \mbox{@{text "_ :: _"}}. The only interesting case is the @{text "A\"}-case + where we use Property~\ref{langprops}@{text "(i)"} in order to infer that + @{term "Der c (A\) = Der c (A \ A\)"}. We can then complete the proof by + using the fifth equation and noting that @{term "Delta A \ Der c (A\) \ (Der + c A) \ A\"}. + + Brzozowski observed that the left-quotients for languages of + regular expressions can be calculated directly using the notion of + \emph{derivatives of a regular expression} \cite{Brzozowski64}. We define + this notion in Isabelle/HOL as follows: \begin{center} \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}} @@ -1646,8 +1648,7 @@ \end{center} \noindent - The last two clauses extend derivatives from characters to strings---i.e.~list of - characters. The list-cons operator is written \mbox{@{text "_ :: _"}}. The + The last two clauses extend derivatives from characters to strings. The boolean function @{term "nullable r"} needed in the @{const Times}-case tests whether a regular expression can recognise the empty string. It can be defined as follows. @@ -1672,7 +1673,7 @@ \noindent By induction on the regular expression @{text r}, respectively on the string @{text s}, one can easily show that left-quotients and derivatives of regular expressions - relate as follows (for example \cite{Sakarovitch09}): + relate as follows (see for example~\cite{Sakarovitch09}): \begin{equation}\label{Dersders} \mbox{\begin{tabular}{c} @@ -1703,7 +1704,7 @@ classes. Unfortunately, this is not true in general. Sakarovitch gives an example where a regular expression has infinitely many derivatives w.r.t.~the language \mbox{@{term "({a} \ {b})\ \ ({a} \ {b})\ \ {a}"}} - \cite[Page~141]{Sakarovitch09}. + (see \cite[Page~141]{Sakarovitch09}). What Brzozowski \cite{Brzozowski64} established is that for every language there @@ -1783,7 +1784,7 @@ taking the partial derivatives of the regular expressions in \eqref{ACI} gives us in each case equal sets. Antimirov \cite{Antimirov95} showed a similar result to - \eqref{Dersders} for partial derivatives: + \eqref{Dersders} for partial derivatives, namely \begin{equation}\label{Derspders} \mbox{\begin{tabular}{lc} @@ -1855,7 +1856,7 @@ Antimirov's argument first shows this theorem for the language @{term UNIV1}, which is the set of all non-empty strings. For this he proves: - \begin{equation} + \begin{equation}\label{pdersunivone} \mbox{\begin{tabular}{l} @{thm pders_lang_Zero}\\ @{thm pders_lang_One}\\ @@ -1883,10 +1884,11 @@ \noindent and for all languages @{text "A"}, @{thm pders_lang_subset[where B="UNIV", simplified]} holds. Since we follow Antimirov's proof quite closely in our - formalisation, we omit the details. + formalisation (only the last two cases of \eqref{pdersunivone} involve some + non-routine induction argument), we omit the details. - Let us return to our proof of the second direction in the Myhill-Nerode + Let us now return to our proof about the second direction in the Myhill-Nerode theorem. The point of the above calculations is to use @{text "\<^raw:$\threesim$>\<^bsub>(\x. ders x r)\<^esub>"} as tagging-relation. @@ -1915,7 +1917,8 @@ Now the range of @{term "\x. pders x r"} is a subset of @{term "Pow (pders_lang UNIV r)"}, which we know is finite by Theorem~\ref{antimirov}. This means there are only finitely many equivalence classes of @{text "\<^raw:$\threesim$>\<^bsub>(\x. ders x r)\<^esub>"}, - and we can again conclude the second part of the Myhill-Nerode theorem. + which refines @{term "\(lang r)"}, and consequently we can again conclude the + second part of the Myhill-Nerode theorem. \end{proof} *} @@ -1929,7 +1932,7 @@ More interesting is the closure under complement, because it seems difficult to construct a regular expression for the complement language by direct means. However the existence of such a regular expression can now be easily - proved using the Myhill-Nerode theorem since + proved using both parts of the Myhill-Nerode theorem, since \begin{center} @{term "s\<^isub>1 \A s\<^isub>2"} if and only if @{term "s\<^isub>1 \(-A) s\<^isub>2"} @@ -1939,7 +1942,7 @@ holds for any strings @{text "s\<^isub>1"} and @{text "s\<^isub>2"}. Therefore @{text A} and the complement language @{term "-A"} give rise to the same partitions. So if one is finite, the other is too, and - the other way around. Proving the existence of such a regular expression via + vice versa. Proving the existence of such a regular expression via automata using the standard method would be quite involved. It includes the steps: regular expression @{text "\"} non-deterministic automaton @{text "\"} deterministic automaton @{text "\"} complement automaton @{text "\"} @@ -2004,7 +2007,7 @@ a regular expression @{text r} such that @{term "A = lang r"}. We also know that @{term "pders_lang B r"} is finite for every language @{text B} and regular expression @{text r} (recall Theorem~\ref{antimirov}). By definition - and Lemma~\ref{Derspders} therefore + and \eqref{Derspders} therefore \begin{equation}\label{eqq} @@ -2013,8 +2016,8 @@ \noindent Since there are only finitely many regular expressions in @{term "pders_lang - B r"}, we know by \eqref{uplus} that there exists a regular expression that - the right-hand side of \eqref{eqq} is equal to \mbox{@{term "lang (\(pders_lang B + B r"}, we know by \eqref{uplus} that there exists a regular expression so that + the right-hand side of \eqref{eqq} is equal to the language \mbox{@{term "lang (\(pders_lang B r))"}}. Thus the regular expression @{term "\(pders_lang B r)"} verifies that @{term "Ders_lang B A"} is regular. *} @@ -2040,7 +2043,7 @@ far. Using this theorem we can obviously prove when a language is \emph{not} regular---by establishing that it has infinitely many equivalence classes generated by the Myhill-Nerode relation (this is usually the purpose of the - pumping lemma \cite{Kozen97}). We can also use it to establish the standard + Pumping Lemma \cite{Kozen97}). We can also use it to establish the standard textbook results about closure properties of regular languages. Interesting is the case of closure under complement, because it seems difficult to construct a regular expression for the complement language by direct @@ -2048,13 +2051,13 @@ proved using the Myhill-Nerode theorem. Our insistence on regular expressions for proving the Myhill-Nerode theorem - arose from the limitations of HOL on which the popular theorem provers HOL4, + arose from the limitations of HOL, on which the popular theorem provers HOL4, HOLlight and Isabelle/HOL are based. In order to guarantee consistency, - formalisations can only extend HOL by definitions that introduce a notion in - terms of already existing concepts. A convenient definition for automata + formalisations can only extend HOL by definitions that introduce a new concept in + terms of already existing notions. A convenient definition for automata (based on graphs) uses a polymorphic type for the state nodes. This allows us to use the standard operation of disjoint union in order to compose two - automata. Unfortunately, we cannot use such a polymorphic definition of + automata. Unfortunately, we cannot use such a polymorphic definition in HOL as part of the definition for regularity of a language (a set of strings). Consider the following attempt @@ -2065,17 +2068,18 @@ \noindent which means the definiens is polymorphic in the type of the automata @{text "M"}, but the definiendum @{text "is_regular"} is not. Such definitions are - excluded in HOL, because they lead easily to inconsistencies (see + excluded in HOL, because they can lead easily to inconsistencies (see \cite{PittsHOL4} for a simple example). Also HOL does not contain type-quantifiers which would allow us to get rid of the polymorphism by quantifying over the type-variable @{text "\"}. Therefore when defining - regularity in terms of automata, the only natural way out in HOL is to use state - nodes with an identity, for example a natural number. Unfortunatly, the - consequence is that we have to be careful when combining two automata so + regularity in terms of automata, the only natural way out in HOL is to use + state nodes with an identity, for example a natural number. Unfortunatly, + the consequence is that we have to be careful when combining two automata so that there is no clash between two such states. This makes formalisations - quite fiddly and unpleasant. Regular expressions proved much more convenient - for reasoning in HOL and we showed they can be used for establishing the - Myhill-Nerode theorem. + quite fiddly and rather unpleasant. Regular expressions proved much more + convenient for reasoning in HOL and we showed they can be used for + establishing the central result in regular language theory: the Myhill-Nerode + theorem. While regular expressions are convenient, they have some limitations. One is that there seems to be no method of calculating a minimal regular expression @@ -2086,32 +2090,6 @@ been formally established, we refer the reader to Owens and Slind \cite{OwensSlind08}. - Our formalisation consists of 780 lines of Isabelle/Isar code for the first - direction and 460 for the second, plus around 300 lines of standard material - about regular languages. The formalisation about derivatives and partial - derivatives shown in Section~\ref{derivatives} consists of 390 lines of - code. The algorithm for solving equational systems, which we used in the - first direction, is conceptually not that complicated. Still the use of sets - over which the algorithm operates, means it is not as easy to formalise as - one might wish. It seems sets cannot be avoided since the `input' of the - algorithm consists of equivalence classes and we cannot see how to - reformulate the theory so that we can use lists, which are usually easier to - reason about in a theorem prover. - - While our formalisation might be seen large, it should be seen - in the context of the work done by Constable at al \cite{Constable00} who - formalised the Myhill-Nerode theorem in Nuprl using automata. They write - that their four-member team needed something on the magnitude of 18 months - for their formalisation. The estimate for our formalisation is that we - needed approximately 3 months and this included the time to find our proof - arguments. Unlike Constable et al, who were able to follow the proofs from - \cite{HopcroftUllman69}, we had to find our own arguments. So for us the - formalisation was not the bottleneck. It is hard to gauge the size of a - formalisation in Nurpl, but from what is shown in the Nuprl Math Library - about their development it seems substantially larger than ours. The code of - ours can be found in the Mercurial Repository at - \mbox{\url{http://www4.in.tum.de/~urbanc/regexp.html}}. - Our proof of the first direction is very much inspired by \emph{Brzozowski's algebraic method} used to convert a finite automaton to a regular expression \cite{Brzozowski64}. The close connection can be seen by considering the @@ -2130,31 +2108,63 @@ We presented two proofs for the second direction of the Myhill-Nerode theorem. One direct proof using tagging-functions and another using partial - derivatives. These proofs is where our method shines, because we can - completely side-step the standard argument \cite{Kozen97} where automata - need to be composed. However, it is also the direction where we had to spend - most of the `conceptual' time, as our first proof based on - tagging-functions is new for establishing the Myhill-Nerode theorem. All - standard proofs of this direction proceed by arguments over automata. + derivatives. This part of our work is where our method using regular + expressions shines, because we can completely side-step the standard + argument \cite{Kozen97} where automata need to be composed. However, it is + also the direction where we had to spend most of the `conceptual' time, as + our first proof based on tagging-functions is new for establishing the + Myhill-Nerode theorem. All standard proofs of this direction proceed by + arguments over automata. - Our indirect proof for the second direction arose from the interest in + The indirect proof for the second direction arose from our interest in Brzozowski's derivatives for regular expression matching. A corresponding - regular expression matcher has been formalised in HOL4 in - \cite{OwensSlind08}. In our opinion, this formalisation is considerably + regular expression matcher has been formalised by Owens and Slind in HOL4 + \cite{OwensSlind08}. In our opinion, their formalisation is considerably slicker than for example the approach to regular expression matching taken - in \cite{Harper99} and \cite{Yi06}. While Brzozowski's derivatives lead to - simple regular expression matchers and he proved that there are only + in \cite{Harper99} and \cite{Yi06}. While Brzozowski's derivatives lead to a + simple regular expression matcher and he established that there are only finitely many dissimilar derivatives for every regular expression, this result is not as straightforward to formalise in a theorem prover. The reason is that the set of dissimilar derivatives is not defined inductively, - but in terms of an ACI-equivalence relation. - - + but in terms of an ACI-equivalence relation. This difficulty prevented for + example Krauss and Nipkow to prove termination of their equivalence checker + for regular expressions \cite{KraussNipkow11}. Their checker is based on + derivatives and for their argument the lack of a formal proof of termination + is not crucial (it merely lets them ``sleep better'' \cite{KraussNipkow11}). + We expect that their development simplifies by using partial derivatives, + instead of derivatives, and that termination of the algorithm can be + formally established. However, since partial derivatives use sets of regular + expressions, one needs to carefully analyse whether the resulting algorithm + is still executable. Given the existing infrastructure for executable sets + in Isabelle/HOL, it should. - \medskip + Our formalisation of the Myhill-Nerode theorem consists of 780 lines of + Isabelle/Isar code for the first direction and 460 for the second (the one + based on tagging functions), plus around 300 lines of standard material + about regular languages. The formalisation about derivatives and partial + derivatives shown in Section~\ref{derivatives} consists of 390 lines of + code. The algorithm for solving equational systems, which we used in the + first direction, is conceptually relatively simple. Still the use of sets + over which the algorithm operates means it is not as easy to formalise as + one might hope. However, it seems sets cannot be avoided since the `input' + of the algorithm consists of equivalence classes and we cannot see how to + reformulate the theory so that we can use lists. Lists would be much easier + to reason about, since we can define function over them by recursion. For + sets we have to use set-comprehensions. - We expect that the development of Krauss \& Nipkow gets easier by - using partial derivatives.\medskip + While our formalisation might be seen large, it should be seen + in the context of the work done by Constable at al \cite{Constable00} who + formalised the Myhill-Nerode theorem in Nuprl using automata. They write + that their four-member team needed something on the magnitude of 18 months + for their formalisation. The estimate for our formalisation is that we + needed approximately 3 months and this included the time to find our proof + arguments. Unlike Constable et al, who were able to follow the proofs from + \cite{HopcroftUllman69}, we had to find our own arguments. So for us the + formalisation was not the bottleneck. It is hard to gauge the size of a + formalisation in Nurpl, but from what is shown in the Nuprl Math Library + about their development it seems substantially larger than ours. The code of + ours can be found in the Mercurial Repository at + \mbox{\url{http://www4.in.tum.de/~urbanc/regexp.html}}.\medskip \noindent {\bf Acknowledgements:}