Journal/Paper.thy
changeset 350 8ce9a432680b
parent 348 bea94f1e6771
child 372 2c56b20032a7
\noindent
Regular languages are an important and well-understood subject in Computer
Science, with many beautiful theorems and many useful algorithms. There is a
wide range of textbooks on this subject, many of which are aimed at students
and contain very detailed `pencil-and-paper' proofs
(e.g.~the textbooks by \citeN{HopcroftUllman69} and by \citeN{Kozen97}).
It seems natural to exercise theorem provers by
formalising the theorems and by formally verifying the algorithms.

A popular choice for a theorem prover would be one based on Higher-Order
Logic (HOL), for example HOL4, HOLlight or Isabelle/HOL. For the development
presented in this paper we will use Isabelle/HOL. HOL is a predicate calculus
that allows quantification over predicate variables. Its type system is
based on the Simple Theory of Types by \citeN{Church40}.  Although many
mathematical concepts can be conveniently expressed in HOL, there are some
limitations that hurt when attempting a simple-minded formalisation
of regular languages in it.

The typical approach to
regular languages, taken for example by \citeN{HopcroftUllman69}
and by \citeN{Kozen97}, is to introduce finite deterministic automata and then
define most notions in terms of them.  For example, a regular language is
normally defined as:

\begin{dfntn}\label{baddef}
A language @{text A} is \emph{regular}, provided there is a
finite deterministic automaton that recognises all strings of @{text "A"}.
convince oneself that regular languages are closed under complementation:
one just has to exchange the accepting and non-accepting states in the
corresponding automaton to obtain an automaton for the complement language.
The problem, however, lies with formalising such reasoning in a
theorem prover. Automata are built up from states and transitions that are
commonly represented as graphs, matrices or functions, none of which, unfortunately,
can be defined as an inductive datatype.

In the case of graphs and matrices, this means we have to build our own
reasoning infrastructure for them, as neither Isabelle/HOL nor HOL4 nor
HOLlight support them with libraries. Also, reasoning about graphs and
matrices can be a hassle in theorem provers, because
we have to be able to combine automata.  Consider for
example the operation of sequencing two automata, say $A_1$ and $A_2$, by
connecting the accepting states of $A_1$ to the initial state of $A_2$:

\begin{center}
an automaton that recognises all its strings (Definition~\ref{baddef}). This
is because we cannot make a definition in HOL that is only polymorphic in
the state type, but not in the predicate for regularity; and there is no
type quantification available in HOL (unlike in Coq, for
example).\footnote{Slind already pointed out this problem in an email to the
HOL4 mailing list on 21st April 2005.}
%$^,$\footnote{While in Coq one can avoid
%this particular problem, all other difficulties we point out below still apply.}

An alternative, which provides us with a single type for states of automata,
is to give every state node an identity, for example a natural number, and
then be careful to rename these identities apart whenever connecting two
automata. This results in clunky proofs establishing that properties are
reason about.

Functions are much better supported in Isabelle/HOL, but they still lead to
similar problems as with graphs.  Composing, for example, two
non-deterministic automata in parallel also requires the formalisation of
disjoint unions. \citeN{Nipkow98} dismisses the option of
using identities for this, because, according to him, it leads to ``messy
proofs''. Since he does not need to define what regular languages are,
Nipkow opts for a variant of \eqref{disjointunion} using bit lists, but
writes\smallskip

`` & If the reader finds the above treatment in terms of bit lists revoltingly
     concrete, I cannot disagree. A more abstract approach is clearly desirable.''
\end{tabular}
\end{quote}\smallskip


%\noindent
%Moreover, it is not so clear how to conveniently impose a finiteness
%condition upon functions in order to represent \emph{finite} automata. The
%best is probably to resort to more advanced reasoning frameworks, such as
%\emph{locales} or \emph{type classes}, which are \emph{not} available in all
%HOL-based theorem provers.

Because of these problems with representing automata, there seems to
be no substantial formalisation of automata theory and regular languages
carried out in HOL-based theorem provers. \citeN{Nipkow98} establishes
the link between regular expressions and automata in the context of
lexing. \citeN{BerghoferReiter09} formalise automata
working over bit strings in the context of Presburger arithmetic.  The only
larger formalisations of automata theory are carried out in Nuprl by
\citeN{Constable00} and in Coq by \citeN{Filliatre97}, \citeN{Almeidaetal10}
and \citeN{Braibant12}.

%Also, one might consider automata as just convenient `vehicles' for
%establishing properties about regular languages.  However, paper proofs
%about automata often involve subtle side-conditions which are easily
%overlooked, but which make formal reasoning rather painful. For example
%Kozen's proof of the Myhill-Nerode Theorem requires that automata do not
%have inaccessible states \cite{Kozen97}. Another subtle side-condition is
%completeness of automata, that is automata need to have total transition
%functions and at most one `sink' state from which there is no connection to
%a final state (Brzozowski mentions this side-condition in the context of
%state complexity of automata \cite{Brzozowski10}, but it is also needed
%in the usual construction of the complement automaton). Such side-conditions mean
%that if we define a regular language as one for which there exists \emph{a}
%finite automaton that recognises all its strings (see
%Definition~\ref{baddef}), then we need a lemma which ensures that another
%equivalent one can be found satisfying the side-condition, and also need to
%make sure our operations on automata preserve them. Unfortunately, such
%`little' and `obvious' lemmas make formalisations of automata theory
%hair-pulling experiences.

In this paper, we will not attempt to formalise automata theory in
Isabelle/HOL nor will we attempt to formalise automata proofs from the
literature, but take a different approach to regular languages than is
usually taken. Instead of defining a regular language as one where there
that matches all strings of @{text "A"}.
\end{dfntn}

\noindent
And then `forget' automata completely.
The reason is that regular expressions %, unlike graphs, matrices and
%functions,
can be defined as an inductive datatype, and a reasoning
infrastructure for them (like induction and recursion) comes for free in
HOL. %Moreover, no side-conditions will be needed for regular expressions,
%like we need for automata.
This convenience of regular expressions has
recently been exploited in HOL4 with a formalisation of regular expression
matching based on derivatives by \citeN{OwensSlind08}, and with an equivalence
checker for regular expressions in Isabelle/HOL by \citeN{KraussNipkow11}.  The
main purpose of this paper is to show that a central result about regular
languages---the Myhill-Nerode Theorem---can be recreated by only using
regular expressions. This theorem gives necessary and sufficient conditions
for when a language is regular. As a corollary of this theorem we can easily
establish the usual closure properties, including complementation, for
regular languages. We use the Continuation Lemma, which is also a corollary
of the Myhill-Nerode Theorem, for establishing
the non-regularity of the language @{text "a\<^isup>nb\<^isup>n"}.\medskip

\noindent
{\bf Contributions:} There is an extensive literature on regular languages.
To the best of our knowledge, our proof of the Myhill-Nerode Theorem is the first
that is based only on regular expressions. The part of this theorem stating
that finitely many partitions imply regularity of the language is proved by
an argument about solving equational systems.  This argument appears to be
folklore. For the other part, we give two proofs: one direct proof using
certain tagging-functions, and another indirect proof using Antimirov's
partial derivatives \citeyear{Antimirov95}. Again, to the best of our knowledge, the
tagging-functions have not been used before for establishing the Myhill-Nerode
Theorem. Derivatives of regular expressions have recently been used quite
widely in the literature; partial derivatives, in contrast, attract much
less attention. However, partial derivatives are more suitable in the
context of the Myhill-Nerode Theorem, since it is easier to establish
\end{tabular}
\end{center}

\noindent
where @{text "@"} is the list-append operation. The Kleene-star of a language @{text A}
is defined as the union over all powers, namely @{thm star_def[where A="A", THEN eq_reflection]}.
In the paper
we will make use of the following properties of these constructions.

\begin{prpstn}\label{langprops}\mbox{}\\
\begin{tabular}{@ {}lp{10cm}}
(i)   & @{thm star_unfold_left}     \\
the strings in @{term "A \<up> (Suc n)"} must be longer than @{text n}.
Property @{text "(iv)"} states that a non-empty string in @{term "A\<star>"} can
always be split up into a non-empty prefix belonging to @{text "A"} and the
rest being in @{term "A\<star>"}. We omit
the proofs for these properties, but invite the reader to consult our
formalisation.\footnote{Available under \citeN{myhillnerodeafp11} in the Archive of Formal Proofs at\\
\url{http://afp.sourceforge.net/entries/Myhill-Nerode.shtml}.}

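As an informal aside (not part of the formalisation), Properties (i) and (iv) can be
checked outside the theorem prover on bounded approximations of languages. The
Python sketch below is our own encoding: languages are finite sets of strings and
the length bound @{text MAX} is a choice of ours to keep the Kleene star computable.

```python
# Informal check of two star properties; not part of the formalisation.
# Languages are finite sets of strings, truncated at a length bound MAX
# (our choice) so that the Kleene star stays computable.
MAX = 6

def conc(A, B):
    """Language concatenation, truncated at MAX."""
    return {u + v for u in A for v in B if len(u + v) <= MAX}

def power(A, n):
    """A ^ n, with A ^ 0 = {""}."""
    L = {""}
    for _ in range(n):
        L = conc(L, A)
    return L

def star(A):
    """A* as the union over all powers (up to the length bound)."""
    return set().union(*(power(A, n) for n in range(MAX + 1)))

A = {"a", "bb"}
S = star(A)
# Property (i): A* equals {[]} union A concatenated with A* (up to the bound)
assert S == {""} | conc(A, S)
# Property (iv): every non-empty string in A* has a non-empty
# prefix in A with the rest again in A*
assert all(any(s.startswith(u) and s[len(u):] in S for u in A)
           for s in S if s)
```

The truncation makes both sides of each property finite, so the set equalities can be
tested directly.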
The notation in Isabelle/HOL for the quotient of a language @{text A}
according to an equivalence relation @{term REL} is @{term "A // REL"}. We
will write @{text "\<lbrakk>x\<rbrakk>\<^isub>\<approx>"} for the equivalence class defined as
\mbox{@{text "{y | y \<approx> x}"}}, and have @{text "x \<approx> y"} if and only if @{text
involving equivalence classes of languages. For this we will use Arden's Lemma
(see for example \cite[Page 100]{Sakarovitch09}),
which solves equations of the form @{term "X = A \<cdot> X \<union> B"} provided
@{term "[] \<notin> A"}. However, we will need the following `reversed'
version of Arden's Lemma (`reversed' in the sense of changing the order of @{term "A \<cdot> X"} to
\mbox{@{term "X \<cdot> A"}}).\footnote{The details of the proof for the reversed Arden's Lemma
are given in the Appendix.}
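Before the formal statement, the content of the reversed version can be
sanity-checked informally: with $[] \notin A$, the solution of
$X = X \cdot A \,\cup\, B$ is $B \cdot A^\star$. The Python sketch below (our
own encoding, with a length bound of our choosing) tests this on small finite
languages; it is an illustration only, not part of the formalisation.

```python
# Bounded check of the reversed Arden equation  X = X.A  u  B,
# whose solution is X = B.A* provided the empty string is not in A.
# The truncation bound MAX (our choice) keeps all languages finite.
MAX = 7

def conc(A, B):
    """Language concatenation, truncated at MAX."""
    return {u + v for u in A for v in B if len(u + v) <= MAX}

def star(A):
    """A* up to the length bound."""
    S, L = {""}, {""}
    for _ in range(MAX):
        L = conc(L, A)
        S |= L
    return S

A = {"a", "ab"}          # the empty string must not be in A
B = {"", "b"}
X = conc(B, star(A))     # the claimed solution  B.A*
assert X == conc(X, A) | B   # it satisfies the reversed equation
```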

\begin{lmm}[(Reversed Arden's Lemma)]\label{arden}\mbox{}\\
If @{thm (prem 1) reversed_Arden} then
@{thm (lhs) reversed_Arden} if and only if
@{thm (rhs) reversed_Arden}.
 & @{text"|"} & @{term "Star r"}
\end{tabular}
\end{center}

\noindent
and the language matched by a regular expression is defined by recursion as

\begin{center}
\begin{tabular}{r@ {\hspace{2mm}}c@ {\hspace{2mm}}l}
@{thm (lhs) lang.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) lang.simps(1)}\\
@{thm (lhs) lang.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) lang.simps(2)}\\
\end{tabular}
\end{center}
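As an informal aside, the datatype and an executable matcher can be mirrored
outside the theorem prover. The Python sketch below uses Brzozowski derivatives
in the style of the HOL4 work by \citeN{OwensSlind08} mentioned earlier; the
encoding and all names are ours, not the paper's.

```python
# The regular-expression datatype, mirrored as Python classes, together
# with a matcher based on Brzozowski derivatives.  Encoding is ours.
from dataclasses import dataclass

class Rexp: pass

@dataclass(frozen=True)
class Zero(Rexp): pass                 # matches no string

@dataclass(frozen=True)
class One(Rexp): pass                  # matches only the empty string

@dataclass(frozen=True)
class Atom(Rexp): c: str               # matches the single character c

@dataclass(frozen=True)
class Plus(Rexp): r1: Rexp; r2: Rexp   # alternative

@dataclass(frozen=True)
class Times(Rexp): r1: Rexp; r2: Rexp  # sequence

@dataclass(frozen=True)
class Star(Rexp): r: Rexp              # iteration

def nullable(r):
    """Does r match the empty string?"""
    if isinstance(r, (One, Star)):
        return True
    if isinstance(r, Plus):
        return nullable(r.r1) or nullable(r.r2)
    if isinstance(r, Times):
        return nullable(r.r1) and nullable(r.r2)
    return False                       # Zero and Atom

def deriv(c, r):
    """The language of deriv(c, r) is { s | c followed by s is in lang r }."""
    if isinstance(r, Atom):
        return One() if r.c == c else Zero()
    if isinstance(r, Plus):
        return Plus(deriv(c, r.r1), deriv(c, r.r2))
    if isinstance(r, Times):
        d = Times(deriv(c, r.r1), r.r2)
        return Plus(d, deriv(c, r.r2)) if nullable(r.r1) else d
    if isinstance(r, Star):
        return Times(deriv(c, r.r), Star(r.r))
    return Zero()                      # Zero and One

def matches(r, s):
    """Fold the derivative over s, then test for the empty string."""
    for c in s:
        r = deriv(c, r)
    return nullable(r)
```

For instance, with this encoding the expression for $(a + ab)^\star$ accepts
the string @{text "aab"} and rejects @{text "b"}.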
   558 
   564 
   559   Given a finite set of regular expressions @{text rs}, we will make use of the operation of generating 
   565   Given a finite set of regular expressions @{text rs}, we will make use of the operation of generating 
   560   a regular expression that matches the union of all languages of @{text rs}. 
   566   a regular expression that matches the union of all languages of @{text rs}. 
   561   This definion is not trivial in a theorem prover, but since 
   567   This definion is not trivial in a theorem prover, because @{text rs} (being a set) is unordered, 
       
   568   but the regular expression needs an order. Since 
   562   we only need to know the 
   569   we only need to know the 
   563   existence
   570   existence
   564   of such a regular expression, we can use Isabelle/HOL's @{const "fold_graph"} and Hilbert's
   571   of such a regular expression, we can use Isabelle/HOL's @{const "fold_graph"} and Hilbert's
   565   @{text "\<epsilon>"} to define @{term "\<Uplus>rs"}. This operation, roughly speaking, folds @{const PLUS} over the 
   572   choice operator, written @{text "SOME"} in Isabelle/HOL, for defining @{term "\<Uplus>rs"}. 
       
   573   This operation, roughly speaking, folds @{const PLUS} over the 
   566   set @{text rs} with @{const ZERO} for the empty set. We can prove that for a finite set @{text rs}
   574   set @{text rs} with @{const ZERO} for the empty set. We can prove that for a finite set @{text rs}
   567   %
   575   %
   568   \begin{equation}\label{uplus}
   576   \begin{equation}\label{uplus}
   569   \mbox{@{thm (lhs) folds_plus_simp} @{text "= \<Union> (\<calL> ` rs)"}}
   577   \mbox{@{thm (lhs) folds_plus_simp} @{text "= \<Union> (\<calL> ` rs)"}}
   570   \end{equation}
   578   \end{equation}
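The point of this construction can be illustrated informally: folding a binary
@{const PLUS} over a set requires picking some traversal order, and different
orders yield different regular expressions, but all of them have the same
language. The Python sketch below (our own encoding, restricted to the
@{const PLUS}-fragment) demonstrates this.

```python
# Folding PLUS over a finite set of regular expressions, with ZERO for
# the empty set.  The resulting expression depends on the traversal
# order, but its language does not.  Encoding is ours.
from functools import reduce

ZERO, ONE = ("ZERO",), ("ONE",)
def ATOM(c):    return ("ATOM", c)
def PLUS(r, s): return ("PLUS", r, s)

def big_plus(rs):
    """Fold PLUS over rs; ZERO for the empty set."""
    return reduce(PLUS, rs, ZERO)

def lang(r):
    """The language of r (PLUS-fragment only)."""
    tag = r[0]
    if tag == "ZERO": return set()
    if tag == "ONE":  return {""}
    if tag == "ATOM": return {r[1]}
    return lang(r[1]) | lang(r[2])

rs = {ATOM("a"), ATOM("b"), ONE}
r1 = big_plus(sorted(rs))                  # one traversal order
r2 = big_plus(sorted(rs, reverse=True))    # another
assert r1 != r2                            # different expressions ...
assert lang(r1) == lang(r2) == {"", "a", "b"}   # ... same language
```

This is why only the existence of @{term "\<Uplus>rs"} matters: any choice of
order is as good as any other.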

\noindent
In our running example, @{text "X\<^isub>2"} is the only
equivalence class in @{term "finals {[c]}"}.
It is straightforward to show that in general
%
\begin{equation}\label{finalprops}
@{thm lang_is_union_of_finals}\hspace{15mm}
@{thm finals_in_partitions}
\end{equation}

equivalence classes and only finitely many characters.
The term @{text "\<lambda>(ONE)"} in the first equation acts as a marker for the initial state, that
is the equivalence class
containing the empty string @{text "[]"}.\footnote{Note that we mark, roughly speaking, the
single `initial' state in the equational system, which is different from
the method by \citeN{Brzozowski64}, where he marks the
`terminal' states. We are forced to set up the equational system in our
way, because the Myhill-Nerode Relation determines the `direction' of the
transitions---the successor `state' of an equivalence class @{text Y} can
be reached by adding a character to the end of @{text Y}. This is also the
reason why we have to use our reversed version of Arden's Lemma.}
In our running example we have the initial equational system
%
\begin{equation}\label{exmpcs}
\mbox{\begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
@{term "X\<^isub>1"} & @{text "="} & @{text "\<lambda>(ONE)"}\\
@{term "X\<^isub>2"} & @{text "="} & @{text "(X\<^isub>1, ATOM c)"}\\
@{term "X\<^isub>3"} & @{text "="} & @{text "(X\<^isub>1, ATOM d\<^isub>1) + \<dots> + (X\<^isub>1, ATOM d\<^isub>n)"}\\
 & & @{text "else"}~@{term "{Trn Y (ATOM c)| Y c. Y \<in> CS \<and> Y \<Turnstile>c\<Rightarrow> X}"}\\
@{thm (lhs) Init_def}     & @{text "\<equiv>"} & @{thm (rhs) Init_def}
\end{tabular}}
\end{equation}

\noindent
Because we use sets of terms
for representing the right-hand sides of equations, we can
prove \eqref{inv1} and \eqref{inv2} more concisely as
%
regular expression to @{text "xrhs"} and union it up with @{text rhs'}. When we use
the substitution operation we will arrange it so that @{text "xrhs"} does not contain
any occurrence of @{text X}. For example, substituting the first equation in
\eqref{exmpcs} into the right-hand side of the second, thus eliminating the equivalence
class @{text "X\<^isub>1"}, gives us the equation
%
\begin{equation}\label{exmpresult}
\mbox{\begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
@{term "X\<^isub>2"} & @{text "="} & @{text "\<lambda>(TIMES ONE (ATOM c))"}
\end{tabular}}
\end{equation}
   900 
   906 
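For reference, the @{const Arden} operation used in this elimination step is justified by Arden's Lemma. We paraphrase it here for languages, in the (reversed) right-appending form that matches the shape of our equations:

```latex
% Arden's Lemma (our paraphrase, right-appending form):
\begin{equation*}
\text{if } [] \notin A \text{ then}\qquad
X \;=\; X \cdot A \,\cup\, B
\quad\text{if and only if}\quad
X \;=\; B \cdot A^{\star}
\end{equation*}
```

The side-condition @{text "[] \<notin> A"} is what makes the solution unique; without it @{text "B \<cdot> A\<^isup>\<star>"} is still a solution, but not the only one.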
   901   With these two operations in place, we can define the operation that removes one equation
   907   With these two operations in place, we can define the operation that removes one equation
    902   from an equational system @{text ES}. The operation @{const Subst_all}
    908   from an equational system @{text ES}. The operation @{const Subst_all}
   942   @{thm Solve_def}
   948   @{thm Solve_def}
   943   \end{center}
   949   \end{center}
   944 
   950 
   945   \noindent
   951   \noindent
   946   We are not concerned here with the definition of this operator (see
   952   We are not concerned here with the definition of this operator (see
   947   Berghofer and Nipkow \cite{BerghoferNipkow00} for example), but note that we
   953   \cite{BerghoferNipkow00} for example), but note that we
   948   eliminate in each @{const Iter}-step a single equation, and therefore have a
   954   eliminate in each @{const Iter}-step a single equation, and therefore have a
   949   well-founded termination order by taking the cardinality of the equational
   955   well-founded termination order by taking the cardinality of the equational
   950   system @{text ES}. This enables us to prove properties about our definition
   956   system @{text ES}. This enables us to prove properties about our definition
   951   of @{const Solve} when we `call' it with the equivalence class @{text X} and
   957   of @{const Solve} when we `call' it with the equivalence class @{text X} and
   952   the initial equational system @{term "Init (UNIV // \<approx>A)"} from
   958   the initial equational system @{term "Init (UNIV // \<approx>A)"} from
   953   \eqref{initcs} using the principle:
   959   \eqref{initcs} using the principle:
   954 
   960   %
   955   \begin{equation}\label{whileprinciple}
   961   \begin{equation}\label{whileprinciple}
   956   \mbox{\begin{tabular}{l}
   962   \mbox{\begin{tabular}{l}
   957   @{term "invariant (Init (UNIV // \<approx>A))"} \\
   963   @{term "invariant (Init (UNIV // \<approx>A))"} \\
   958   @{term "\<forall>ES. invariant ES \<and> Cond ES \<longrightarrow> invariant (Iter X ES)"}\\
   964   @{term "\<forall>ES. invariant ES \<and> Cond ES \<longrightarrow> invariant (Iter X ES)"}\\
   959   @{term "\<forall>ES. invariant ES \<and> Cond ES \<longrightarrow> card (Iter X ES) < card ES"}\\
   965   @{term "\<forall>ES. invariant ES \<and> Cond ES \<longrightarrow> card (Iter X ES) < card ES"}\\
  1022   \end{proof}
  1028   \end{proof}
  1023 
  1029 
  1024   \noindent
  1030   \noindent
  1025   Next we show that @{text Iter} preserves the invariant.
  1031   Next we show that @{text Iter} preserves the invariant.
  1026 
  1032 
  1027   \begin{lmm}\label{iterone}
  1033   \begin{lmm}\label{iterone} If
  1028   @{thm[mode=IfThen] iteration_step_invariant[where xrhs="rhs"]}
  1034   @{thm (prem 1) iteration_step_invariant[where xrhs="rhs"]},
       
  1035   @{thm (prem 2) iteration_step_invariant[where xrhs="rhs"]} and
       
  1036   @{thm (prem 3) iteration_step_invariant[where xrhs="rhs"]}, then
       
  1037   @{thm (concl) iteration_step_invariant[where xrhs="rhs"]}.
  1029   \end{lmm}
  1038   \end{lmm}
  1030 
  1039 
  1031   \begin{proof} 
  1040   \begin{proof} 
  1032   The argument boils down to choosing an equation @{text "Y = yrhs"} to be eliminated
  1041   The argument boils down to choosing an equation @{text "Y = yrhs"} to be eliminated
   1033   and showing that @{term "Subst_all (ES - {(Y, yrhs)}) Y (Arden Y yrhs)"} 
   1042   and showing that @{term "Subst_all (ES - {(Y, yrhs)}) Y (Arden Y yrhs)"} 
  1034   preserves the invariant.
  1043   preserves the invariant.
  1035   We prove this as follows:
  1044   We prove this as follows:
  1036 
  1045 
  1037   \begin{center}
  1046   \begin{center}
  1038   \begin{tabular}{@ {}l@ {}}
  1047   \begin{tabular}{@ {}l@ {}}
  1039   @{text "\<forall> ES."}\\ \mbox{}\hspace{5mm}@{thm (prem 1) Subst_all_satisfies_invariant} implies
  1048   @{text "\<forall>ES."}~@{thm (prem 1) Subst_all_satisfies_invariant} implies
  1040   @{thm (concl) Subst_all_satisfies_invariant}
  1049   @{thm (concl) Subst_all_satisfies_invariant}
  1041   \end{tabular}
  1050   \end{tabular}
  1042   \end{center}
  1051   \end{center}
  1043 
  1052 
  1044   \noindent
  1053   \noindent
  1056   \end{proof}
  1065   \end{proof}
  1057 
  1066 
  1058   \noindent
  1067   \noindent
  1059   We also need the fact that @{text Iter} decreases the termination measure.
  1068   We also need the fact that @{text Iter} decreases the termination measure.
  1060 
  1069 
  1061   \begin{lmm}\label{itertwo}\mbox{}\\
  1070   \begin{lmm}\label{itertwo} If
  1062   @{thm[mode=IfThen] iteration_step_measure[simplified (no_asm), where xrhs="rhs"]}
  1071   @{thm (prem 1) iteration_step_measure[simplified (no_asm), where xrhs="rhs"]},
       
  1072   @{thm (prem 2) iteration_step_measure[simplified (no_asm), where xrhs="rhs"]} and
       
  1073   @{thm (prem 3) iteration_step_measure[simplified (no_asm), where xrhs="rhs"]}, then\\
       
  1074   \mbox{@{thm (concl) iteration_step_measure[simplified (no_asm), where xrhs="rhs"]}}.
  1063   \end{lmm}
  1075   \end{lmm}
  1064 
  1076 
  1065   \begin{proof}
  1077   \begin{proof}
  1066   By assumption we know that @{text "ES"} is finite and has more than one element.
  1078   By assumption we know that @{text "ES"} is finite and has more than one element.
  1067   Therefore there must be an element @{term "(Y, yrhs) \<in> ES"} with 
  1079   Therefore there must be an element @{term "(Y, yrhs) \<in> ES"} with 
  1148   expressions corresponding to equivalence classes not in @{term "finals A"},
  1160   expressions corresponding to equivalence classes not in @{term "finals A"},
  1149   then we obtain a regular expression for the complement language @{term "- A"}.
  1161   then we obtain a regular expression for the complement language @{term "- A"}.
  1150   This is similar to the usual construction of a `complement automaton'.
  1162   This is similar to the usual construction of a `complement automaton'.
  1151 *}
  1163 *}
  1152 
  1164 
  1153 
       
  1154 
       
  1155 
       
  1156 section {* Myhill-Nerode, Second Part *}
  1165 section {* Myhill-Nerode, Second Part *}
  1157 
  1166 
  1158 text {*
  1167 text {*
  1159   \noindent
  1168   \noindent
  1160   In this section we will give a proof for establishing the second 
  1169   In this section we will give a proof for establishing the second 
  1188   \noindent
  1197   \noindent
   1189   Much more interesting, however, are the inductive cases. They seem hard to solve 
   1198   Much more interesting, however, are the inductive cases. They seem hard to solve 
  1190   directly. The reader is invited to try. 
  1199   directly. The reader is invited to try. 
  1191 
  1200 
  1192   In order to see how our proof proceeds consider the following suggestive picture 
  1201   In order to see how our proof proceeds consider the following suggestive picture 
  1193   taken from Constable et al \cite{Constable00}:
  1202   given by \citeN{Constable00}:
  1194 
  1203   %
  1195   \begin{equation}\label{pics}
  1204   \begin{equation}\label{pics}
  1196   \mbox{\begin{tabular}{c@ {\hspace{10mm}}c@ {\hspace{10mm}}c}
  1205   \mbox{\begin{tabular}{c@ {\hspace{10mm}}c@ {\hspace{10mm}}c}
  1197   \begin{tikzpicture}[scale=1]
  1206   \begin{tikzpicture}[scale=0.95]
  1198   %Circle
  1207   %Circle
  1199   \draw[thick] (0,0) circle (1.1);    
  1208   \draw[thick] (0,0) circle (1.1);    
  1200   \end{tikzpicture}
  1209   \end{tikzpicture}
  1201   &
  1210   &
  1202   \begin{tikzpicture}[scale=1]
  1211   \begin{tikzpicture}[scale=0.95]
  1203   %Circle
  1212   %Circle
  1204   \draw[thick] (0,0) circle (1.1);    
  1213   \draw[thick] (0,0) circle (1.1);    
  1205   %Main rays
  1214   %Main rays
  1206   \foreach \a in {0, 90,...,359}
  1215   \foreach \a in {0, 90,...,359}
  1207     \draw[very thick] (0, 0) -- (\a:1.1);
  1216     \draw[very thick] (0, 0) -- (\a:1.1);
  1208   \foreach \a / \l in {45/1, 135/2, 225/3, 315/4}
  1217   \foreach \a / \l in {45/1, 135/2, 225/3, 315/4}
  1209       \draw (\a: 0.65) node {$a_\l$};
  1218       \draw (\a: 0.65) node {$a_\l$};
  1210   \end{tikzpicture}
  1219   \end{tikzpicture}
  1211   &
  1220   &
  1212   \begin{tikzpicture}[scale=1]
  1221   \begin{tikzpicture}[scale=0.95]
  1213   %Circle
  1222   %Circle
  1214   \draw[thick] (0,0) circle (1.1);    
  1223   \draw[thick] (0,0) circle (1.1);    
  1215   %Main rays
  1224   %Main rays
  1216   \foreach \a in {0, 45,...,359}
  1225   \foreach \a in {0, 45,...,359}
  1217      \draw[very thick] (0, 0) -- (\a:1.1);
  1226      \draw[very thick] (0, 0) -- (\a:1.1);
  1356   \noindent
  1365   \noindent
   1357   We can therefore analyse how the strings @{text "x @ z"} are in the language
   1366   We can therefore analyse how the strings @{text "x @ z"} are in the language
  1358   @{text A} and then construct an appropriate tagging-function to infer that
  1367   @{text A} and then construct an appropriate tagging-function to infer that
  1359   @{term "y @ z"} are also in @{text A}.  For this we will use the notion of
  1368   @{term "y @ z"} are also in @{text A}.  For this we will use the notion of
  1360   the set of all possible \emph{partitions} of a string:
  1369   the set of all possible \emph{partitions} of a string:
  1361 
  1370   %
  1362   \begin{equation}
  1371   \begin{equation}
  1363   @{thm Partitions_def}
  1372   @{thm Partitions_def}
  1364   \end{equation}
  1373   \end{equation}
  1365 
  1374 
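Concretely, @{term Partitions} collects every way of splitting a string into a prefix/suffix pair. A minimal executable sketch (the Python encoding of strings and the function name are ours, not part of the formalisation):

```python
def partitions(s):
    # All pairs (x_p, x_s) with x_p + x_s == s, including the two trivial splits.
    return [(s[:i], s[i:]) for i in range(len(s) + 1)]
```

For instance, `partitions("ab")` yields `[("", "ab"), ("a", "b"), ("ab", "")]`.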
  1366   \noindent
  1375   \noindent
  1626   the second direction of the Myhill-Nerode Theorem, it is sufficient to find 
  1635   the second direction of the Myhill-Nerode Theorem, it is sufficient to find 
  1627   a more refined relation than @{term "\<approx>(lang r)"} for which we can
  1636   a more refined relation than @{term "\<approx>(lang r)"} for which we can
  1628   show that there are only finitely many equivalence classes. So far we 
  1637   show that there are only finitely many equivalence classes. So far we 
  1629   showed this directly by induction on @{text "r"} using tagging-functions. 
  1638   showed this directly by induction on @{text "r"} using tagging-functions. 
  1630   However, there is also  an indirect method to come up with such a refined 
  1639   However, there is also  an indirect method to come up with such a refined 
  1631   relation by using derivatives of regular expressions~\cite{Brzozowski64}. 
  1640   relation by using derivatives of regular expressions introduced by \citeN{Brzozowski64}. 
  1632 
  1641 
  1633   Assume the following two definitions for the \emph{left-quotient} of a language,
  1642   Assume the following two definitions for the \emph{left-quotient} of a language,
  1634   which we write as @{term "Deriv c A"} and @{term "Derivs s A"} where @{text c}
  1643   which we write as @{term "Deriv c A"} and @{term "Derivs s A"} where @{text c}
  1635   is a character and @{text s} a string, respectively:
  1644   is a character and @{text s} a string, respectively:
  1636 
  1645 
  1650   
  1659   
  1651   \noindent
  1660   \noindent
  1652   where we apply the left-quotient to a set of languages and then combine the results.
  1661   where we apply the left-quotient to a set of languages and then combine the results.
  1653   Clearly we have the following equivalence between the Myhill-Nerode Relation
  1662   Clearly we have the following equivalence between the Myhill-Nerode Relation
  1654   (Definition~\ref{myhillneroderel}) and left-quotients
  1663   (Definition~\ref{myhillneroderel}) and left-quotients
  1655 
  1664   %
  1656   \begin{equation}\label{mhders}
  1665   \begin{equation}\label{mhders}
  1657   @{term "x \<approx>A y"} \hspace{4mm}\text{if and only if}\hspace{4mm} @{term "Derivs x A = Derivs y A"}
  1666   @{term "x \<approx>A y"} \hspace{4mm}\text{if and only if}\hspace{4mm} @{term "Derivs x A = Derivs y A"}
  1658   \end{equation}
  1667   \end{equation}
  1659 
  1668 
  1660   \noindent
  1669   \noindent
  1661   It is also straightforward to establish the following properties of left-quotients
  1670   It is also straightforward to establish the following properties of left-quotients
  1662   
  1671   % 
  1663   \begin{equation}
  1672   \begin{equation}
  1664   \mbox{\begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{2mm}}l}
  1673   \mbox{\begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{2mm}}l}
  1665   @{thm (lhs) Deriv_simps(1)} & $=$ & @{thm (rhs) Deriv_simps(1)}\\
  1674   @{thm (lhs) Deriv_simps(1)} & $=$ & @{thm (rhs) Deriv_simps(1)}\\
  1666   @{thm (lhs) Deriv_simps(2)} & $=$ & @{thm (rhs) Deriv_simps(2)}\\
  1675   @{thm (lhs) Deriv_simps(2)} & $=$ & @{thm (rhs) Deriv_simps(2)}\\
  1667   @{thm (lhs) Deriv_simps(3)} & $=$ & @{thm (rhs) Deriv_simps(3)}\\
  1676   @{thm (lhs) Deriv_simps(3)} & $=$ & @{thm (rhs) Deriv_simps(3)}\\
  1681   where we use Property~\ref{langprops}@{text "(i)"} in order to infer that
  1690   where we use Property~\ref{langprops}@{text "(i)"} in order to infer that
  1682   @{term "Deriv c (A\<star>) = Deriv c (A \<cdot> A\<star>)"}. We can then complete the proof by
  1691   @{term "Deriv c (A\<star>) = Deriv c (A \<cdot> A\<star>)"}. We can then complete the proof by
  1683   using the fifth equation and noting that @{term "Deriv c (A\<star>) \<subseteq> (Deriv
  1692   using the fifth equation and noting that @{term "Deriv c (A\<star>) \<subseteq> (Deriv
  1684   c A) \<cdot> A\<star>"} provided @{text "[] \<in> A"}. 
  1693   c A) \<cdot> A\<star>"} provided @{text "[] \<in> A"}. 
  1685 
  1694 
  1686   Brzozowski observed that the left-quotients for languages of
  1695   \citeN{Brzozowski64} observed that the left-quotients for languages of
  1687   regular expressions can be calculated directly using the notion of
  1696   regular expressions can be calculated directly using the notion of
  1688   \emph{derivatives of a regular expression} \cite{Brzozowski64}. We define
  1697   \emph{derivatives of a regular expression}. We define
  1689   this notion in Isabelle/HOL as follows:
  1698   this notion in Isabelle/HOL as follows:
  1690 
  1699   %
  1691   \begin{center}
  1700   \begin{center}
  1692   \begin{longtable}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}}
  1701   \begin{longtable}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}}
  1693   @{thm (lhs) deriv.simps(1)}  & @{text "\<equiv>"} & @{thm (rhs) deriv.simps(1)}\\
  1702   @{thm (lhs) deriv.simps(1)}  & @{text "\<equiv>"} & @{thm (rhs) deriv.simps(1)}\\
  1694   @{thm (lhs) deriv.simps(2)}  & @{text "\<equiv>"} & @{thm (rhs) deriv.simps(2)}\\
  1703   @{thm (lhs) deriv.simps(2)}  & @{text "\<equiv>"} & @{thm (rhs) deriv.simps(2)}\\
  1695   @{thm (lhs) deriv.simps(3)[where c'="d"]}  & @{text "\<equiv>"} & @{thm (rhs) deriv.simps(3)[where c'="d"]}\\
  1704   @{thm (lhs) deriv.simps(3)[where c'="d"]}  & @{text "\<equiv>"} & @{thm (rhs) deriv.simps(3)[where c'="d"]}\\
  1709   \noindent
  1718   \noindent
  1710   The last two clauses extend derivatives from characters to strings. The
  1719   The last two clauses extend derivatives from characters to strings. The
  1711   boolean function @{term "nullable r"} needed in the @{const Times}-case tests
  1720   boolean function @{term "nullable r"} needed in the @{const Times}-case tests
  1712   whether a regular expression can recognise the empty string. It can be defined as 
  1721   whether a regular expression can recognise the empty string. It can be defined as 
  1713   follows.
  1722   follows.
  1714 
  1723   %
  1715   \begin{center}
  1724   \begin{center}
  1716   \begin{tabular}{c}
  1725   \begin{longtable}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}}
  1717   \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}}
       
  1718   @{thm (lhs) nullable.simps(1)}  & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(1)}\\
  1726   @{thm (lhs) nullable.simps(1)}  & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(1)}\\
  1719   @{thm (lhs) nullable.simps(2)}  & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(2)}\\
  1727   @{thm (lhs) nullable.simps(2)}  & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(2)}\\
  1720   @{thm (lhs) nullable.simps(3)}  & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(3)}\\
  1728   @{thm (lhs) nullable.simps(3)}  & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(3)}\\
  1721   @{thm (lhs) nullable.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}  
  1729   @{thm (lhs) nullable.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}  
  1722      & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\
  1730      & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\
  1723   @{thm (lhs) nullable.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}  
  1731   @{thm (lhs) nullable.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}  
  1724      & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\
  1732      & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\
  1725   @{thm (lhs) nullable.simps(6)}  & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(6)}\\
  1733   @{thm (lhs) nullable.simps(6)}  & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(6)}
  1726   \end{tabular}
  1734   \end{longtable}
  1727   \end{tabular}
       
  1728   \end{center}
  1735   \end{center}
  1729 
  1736 
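Outside the formalisation, the clauses for @{const deriv} and @{const nullable} can be transliterated into a short executable sketch; the tuple encoding of the regular-expression constructors and the helper names below are ours, not part of the development. The derivative-based matcher at the end illustrates the point of \eqref{Dersders}: a string is accepted exactly when the derivative of the regular expression by that string is nullable.

```python
# Brzozowski derivatives, transliterated from the clauses above.
# Regular expressions are encoded as nested tuples (our own encoding):
#   ("ZERO",), ("ONE",), ("ATOM", c), ("PLUS", r1, r2),
#   ("TIMES", r1, r2), ("STAR", r)

def nullable(r):
    """Does r accept the empty string?"""
    tag = r[0]
    if tag == "ZERO":  return False
    if tag == "ONE":   return True
    if tag == "ATOM":  return False
    if tag == "PLUS":  return nullable(r[1]) or nullable(r[2])
    if tag == "TIMES": return nullable(r[1]) and nullable(r[2])
    if tag == "STAR":  return True

def deriv(c, r):
    """Derivative of r by the character c."""
    tag = r[0]
    if tag == "ZERO": return ("ZERO",)
    if tag == "ONE":  return ("ZERO",)
    if tag == "ATOM": return ("ONE",) if r[1] == c else ("ZERO",)
    if tag == "PLUS": return ("PLUS", deriv(c, r[1]), deriv(c, r[2]))
    if tag == "TIMES":
        left = ("TIMES", deriv(c, r[1]), r[2])
        return ("PLUS", left, deriv(c, r[2])) if nullable(r[1]) else left
    if tag == "STAR": return ("TIMES", deriv(c, r[1]), r)

def derivs(s, r):
    """Derivative of r by the string s, character by character."""
    for c in s:
        r = deriv(c, r)
    return r

def matches(s, r):
    # s in lang r  iff  [] in lang (derivs s r)  iff  nullable (derivs s r)
    return nullable(derivs(s, r))
```

For instance, with @{text r} encoding @{text "a \<cdot> b\<^isup>\<star>"}, `matches("abb", r)` holds while `matches("ba", r)` does not.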
  1730   \noindent
  1737   \noindent
  1731   By induction on the regular expression @{text r}, respectively on the string @{text s}, 
  1738   By induction on the regular expression @{text r}, respectively on the string @{text s}, 
  1732   one can easily show that left-quotients and derivatives of regular expressions 
  1739   one can easily show that left-quotients and derivatives of regular expressions 
  1733   relate as follows (see for example~\cite{Sakarovitch09}):
  1740   relate as follows (see for example~\cite{Sakarovitch09}):
  1734 
  1741   %
  1735   \begin{equation}\label{Dersders}
  1742   \begin{equation}\label{Dersders}
  1736   \mbox{\begin{tabular}{c}
  1743   \mbox{\begin{tabular}{c}
  1737   @{thm Deriv_deriv}\\
  1744   @{thm Deriv_deriv}\\
  1738   @{thm Derivs_derivs}
  1745   @{thm Derivs_derivs}
  1739   \end{tabular}}
  1746   \end{tabular}}
  1749   @{term "lang (derivs x r) = lang (derivs y r)"}. 
  1756   @{term "lang (derivs x r) = lang (derivs y r)"}. 
  1750   \end{center}
  1757   \end{center}
  1751 
  1758 
  1752   \noindent
  1759   \noindent
  1753   holds and hence
  1760   holds and hence
  1754 
  1761   %
  1755   \begin{equation}
  1762   \begin{equation}
  1756   @{term "x \<approx>(lang r) y"}\hspace{4mm}\mbox{provided}\hspace{4mm}@{term "derivs x r = derivs y r"}
  1763   @{term "x \<approx>(lang r) y"}\hspace{4mm}\mbox{provided}\hspace{4mm}@{term "derivs x r = derivs y r"}
  1757   \end{equation}
  1764   \end{equation}
  1758 
  1765 
  1759 
  1766 
  1770   w.r.t.~the language @{text "(ab)\<^isup>\<star> \<union> (ab)\<^isup>\<star>a"}, which is formally 
  1777   w.r.t.~the language @{text "(ab)\<^isup>\<star> \<union> (ab)\<^isup>\<star>a"}, which is formally 
  1771   written in our notation as \mbox{@{text "{[a,b]}\<^isup>\<star> \<union> ({[a,b]}\<^isup>\<star> \<cdot> {[a]})"}}
  1778   written in our notation as \mbox{@{text "{[a,b]}\<^isup>\<star> \<union> ({[a,b]}\<^isup>\<star> \<cdot> {[a]})"}}
  1772   (see \cite[Page~141]{Sakarovitch09}).
  1779   (see \cite[Page~141]{Sakarovitch09}).
  1773 
  1780 
  1774 
  1781 
  1775   What Brzozowski \cite{Brzozowski64} established is that for every language there
  1782   What \citeN{Brzozowski64} established is that for every language there
   1776   \emph{are} only finitely many `dissimilar' derivatives of a regular
   1783   \emph{are} only finitely many `dissimilar' derivatives of a regular
  1777   expression. Two regular expressions are said to be \emph{similar} provided
  1784   expression. Two regular expressions are said to be \emph{similar} provided
   1778   they can be identified using the @{text "ACI"}-identities:
   1785   they can be identified using the @{text "ACI"}-identities:
  1779 
  1786   %
  1780 
       
  1781   \begin{equation}\label{ACI}
  1787   \begin{equation}\label{ACI}
  1782   \mbox{\begin{tabular}{cl}
  1788   \mbox{\begin{tabular}{cl}
  1783   (@{text A}) & @{term "Plus (Plus r\<^isub>1 r\<^isub>2) r\<^isub>3"} $\equiv$ @{term "Plus r\<^isub>1 (Plus r\<^isub>2 r\<^isub>3)"}\\
  1789   (@{text A}) & @{term "Plus (Plus r\<^isub>1 r\<^isub>2) r\<^isub>3"} $\equiv$ @{term "Plus r\<^isub>1 (Plus r\<^isub>2 r\<^isub>3)"}\\
  1784   (@{text C}) & @{term "Plus r\<^isub>1 r\<^isub>2"} $\equiv$ @{term "Plus r\<^isub>2 r\<^isub>1"}\\
  1790   (@{text C}) & @{term "Plus r\<^isub>1 r\<^isub>2"} $\equiv$ @{term "Plus r\<^isub>2 r\<^isub>1"}\\
  1785   (@{text I}) & @{term "Plus r r"} $\equiv$ @{term "r"}\\
  1791   (@{text I}) & @{term "Plus r r"} $\equiv$ @{term "r"}\\
  1792   but it is very painful in a theorem prover (since there is no
  1798   but it is very painful in a theorem prover (since there is no
  1793   direct characterisation of the set of dissimilar derivatives).
  1799   direct characterisation of the set of dissimilar derivatives).
  1794 
  1800 
  1795 
  1801 
  1796   Fortunately, there is a much simpler approach using \emph{partial
  1802   Fortunately, there is a much simpler approach using \emph{partial
  1797   derivatives}. They were introduced by Antimirov \cite{Antimirov95} and can be defined
  1803   derivatives}. They were introduced by \citeN{Antimirov95} and can be defined
  1798   in Isabelle/HOL as follows:
  1804   in Isabelle/HOL as follows:
  1799 
  1805   %
  1800   \begin{center}
  1806   \begin{center}
  1801   \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}}
  1807   \begin{longtable}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}}
  1802   @{thm (lhs) pderiv.simps(1)}  & @{text "\<equiv>"} & @{thm (rhs) pderiv.simps(1)}\\
  1808   @{thm (lhs) pderiv.simps(1)}  & @{text "\<equiv>"} & @{thm (rhs) pderiv.simps(1)}\\
  1803   @{thm (lhs) pderiv.simps(2)}  & @{text "\<equiv>"} & @{thm (rhs) pderiv.simps(2)}\\
  1809   @{thm (lhs) pderiv.simps(2)}  & @{text "\<equiv>"} & @{thm (rhs) pderiv.simps(2)}\\
  1804   @{thm (lhs) pderiv.simps(3)[where c'="d"]}  & @{text "\<equiv>"} & @{thm (rhs) pderiv.simps(3)[where c'="d"]}\\
  1810   @{thm (lhs) pderiv.simps(3)[where c'="d"]}  & @{text "\<equiv>"} & @{thm (rhs) pderiv.simps(3)[where c'="d"]}\\
  1805   @{thm (lhs) pderiv.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}  
  1811   @{thm (lhs) pderiv.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}  
  1806      & @{text "\<equiv>"} & @{thm (rhs) pderiv.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\
  1812      & @{text "\<equiv>"} & @{thm (rhs) pderiv.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\
  1809        @{term "(Timess (pderiv c r\<^isub>1) r\<^isub>2) \<union> (pderiv c r\<^isub>2)"}\\
  1815        @{term "(Timess (pderiv c r\<^isub>1) r\<^isub>2) \<union> (pderiv c r\<^isub>2)"}\\
  1810      & & \phantom{@{text "if"}~@{term "nullable r\<^isub>1"}~}@{text "else"}~%  
  1816      & & \phantom{@{text "if"}~@{term "nullable r\<^isub>1"}~}@{text "else"}~%  
  1811                     @{term "Timess (pderiv c r\<^isub>1) r\<^isub>2"}\\ 
  1817                     @{term "Timess (pderiv c r\<^isub>1) r\<^isub>2"}\\ 
  1812   @{thm (lhs) pderiv.simps(6)}  & @{text "\<equiv>"} & @{thm (rhs) pderiv.simps(6)}\smallskip\\
  1818   @{thm (lhs) pderiv.simps(6)}  & @{text "\<equiv>"} & @{thm (rhs) pderiv.simps(6)}\smallskip\\
  1813   @{thm (lhs) pderivs.simps(1)}  & @{text "\<equiv>"} & @{thm (rhs) pderivs.simps(1)}\\
  1819   @{thm (lhs) pderivs.simps(1)}  & @{text "\<equiv>"} & @{thm (rhs) pderivs.simps(1)}\\
  1814   @{thm (lhs) pderivs.simps(2)}  & @{text "\<equiv>"} & @{text "\<Union> (pders s) ` (pder c r)"}\\
  1820   @{thm (lhs) pderivs.simps(2)}  & @{text "\<equiv>"} & @{text "\<Union> (pders s) ` (pder c r)"}
  1815   \end{tabular}
  1821   \end{longtable}
  1816   \end{center}
  1822   \end{center}
  1817 
  1823 
  1818   \noindent
  1824   \noindent
  1819   Again the last two clauses extend partial derivatives from characters to strings. 
  1825   Again the last two clauses extend partial derivatives from characters to strings. 
  1820   Unlike `simple' derivatives, the functions for partial derivatives return sets of regular
  1826   Unlike `simple' derivatives, the functions for partial derivatives return sets of regular
  1846   \end{center}
  1852   \end{center}
  1847 
  1853 
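The partial-derivative clauses admit a similar executable sketch (again, the tuple encoding and helper names are ours, not part of the development). Because the result type is a set, unions of alternatives are absorbed automatically, and a simple worklist computes the set of all partial derivatives reachable from a regular expression; the loop terminates because there are only finitely many of them.

```python
# Antimirov's partial derivatives: like Brzozowski derivatives, but returning
# *sets* of regular expressions. Tuple encoding of constructors (ours):
#   ("ZERO",), ("ONE",), ("ATOM", c), ("PLUS", r1, r2),
#   ("TIMES", r1, r2), ("STAR", r)

def nullable(r):
    tag = r[0]
    return (tag in ("ONE", "STAR")
            or (tag == "PLUS" and (nullable(r[1]) or nullable(r[2])))
            or (tag == "TIMES" and nullable(r[1]) and nullable(r[2])))

def timess(rs, r2):
    # The Timess operation: sequence every member of the set rs with r2.
    return {("TIMES", r1, r2) for r1 in rs}

def pderiv(c, r):
    tag = r[0]
    if tag in ("ZERO", "ONE"):
        return set()
    if tag == "ATOM":
        return {("ONE",)} if r[1] == c else set()
    if tag == "PLUS":
        return pderiv(c, r[1]) | pderiv(c, r[2])
    if tag == "TIMES":
        left = timess(pderiv(c, r[1]), r[2])
        return left | pderiv(c, r[2]) if nullable(r[1]) else left
    if tag == "STAR":
        return timess(pderiv(c, r[1]), r)

def pderivs_closure(r, alphabet):
    # Worklist closure: all partial derivatives of r reachable by any string
    # over the given alphabet; the result is always a finite set.
    seen, todo = {r}, [r]
    while todo:
        q = todo.pop()
        for c in alphabet:
            for p in pderiv(c, q):
                if p not in seen:
                    seen.add(p)
                    todo.append(p)
    return seen
```

Since the results are sets, the regular expressions related by the identities in \eqref{ACI} have identical partial derivatives: for example, `pderiv` applied to @{term "Plus r r"} and to @{term "r"} returns the same set.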
  1848   Partial derivatives can be seen as having the @{text "ACI"}-identities already built in: 
  1854   Partial derivatives can be seen as having the @{text "ACI"}-identities already built in: 
  1849   taking the partial derivatives of the
  1855   taking the partial derivatives of the
  1850   regular expressions in \eqref{ACI} gives us in each case
  1856   regular expressions in \eqref{ACI} gives us in each case
  1851   equal sets.  Antimirov \cite{Antimirov95} showed a similar result to
  1857   equal sets.  \citeN{Antimirov95} showed a similar result to
  1852   \eqref{Dersders} for partial derivatives, namely
  1858   \eqref{Dersders} for partial derivatives, namely
  1853 
  1859   %
  1854   \begin{equation}\label{Derspders}
  1860   \begin{equation}\label{Derspders}
  1855   \mbox{\begin{tabular}{lc}
  1861   \mbox{\begin{tabular}{lc}
  1856   @{text "(i)"}  & @{thm Deriv_pderiv}\\
  1862   @{text "(i)"}  & @{thm Deriv_pderiv}\\
  1857   @{text "(ii)"} & @{thm Derivs_pderivs}
  1863   @{text "(ii)"} & @{thm Derivs_pderivs}
  1858   \end{tabular}}
  1864   \end{tabular}}
  1889   \end{proof}
  1895   \end{proof}
  1890 
  1896 
  1891   \noindent
  1897   \noindent
  1892   Taking \eqref{Dersders} and \eqref{Derspders} together gives the relationship 
  1898   Taking \eqref{Dersders} and \eqref{Derspders} together gives the relationship 
  1893   between languages of derivatives and partial derivatives
  1899   between languages of derivatives and partial derivatives
  1894 
  1900   %
  1895   \begin{equation}
  1901   \begin{equation}
  1896   \mbox{\begin{tabular}{lc}
  1902   \mbox{\begin{tabular}{lc}
  1897   @{text "(i)"}  & @{thm deriv_pderiv[symmetric]}\\
  1903   @{text "(i)"}  & @{thm deriv_pderiv[symmetric]}\\
  1898   @{text "(ii)"} & @{thm derivs_pderivs[symmetric]}
  1904   @{text "(ii)"} & @{thm derivs_pderivs[symmetric]}
  1899   \end{tabular}}
  1905   \end{tabular}}
  1905   of derivatives already built in. 
  1911   of derivatives already built in. 
  1906 
  1912 
  1907   Antimirov also proved that for every language and every regular expression 
  1913   Antimirov also proved that for every language and every regular expression 
  1908   there are only finitely many partial derivatives, whereby the set of partial
  1914   there are only finitely many partial derivatives, whereby the set of partial
  1909   derivatives of @{text r} w.r.t.~a language @{text A} is defined as
  1915   derivatives of @{text r} w.r.t.~a language @{text A} is defined as
  1910 
  1916   %
  1911   \begin{equation}\label{Pdersdef}
  1917   \begin{equation}\label{Pdersdef}
  1912   @{thm pderivs_lang_def}
  1918   @{thm pderivs_lang_def}
  1913   \end{equation}
  1919   \end{equation}
  1914 
  1920 
  1915   \begin{thrm}[(Antimirov \cite{Antimirov95})]\label{antimirov}
  1921   \begin{thrm}[\cite{Antimirov95}]\label{antimirov}
  1916   For every language @{text A} and every regular expression @{text r}, 
  1922   For every language @{text A} and every regular expression @{text r}, 
  1917   \mbox{@{thm finite_pderivs_lang}}.
  1923   \mbox{@{thm finite_pderivs_lang}}.
  1918   \end{thrm}
  1924   \end{thrm}
  1919 
  1925 
  1920   \noindent
  1926   \noindent
  1921   Antimirov's proof first establishes this theorem for the language @{term UNIV1}, 
  1927   Antimirov's proof first establishes this theorem for the language @{term UNIV1}, 
  1922   which is the set of all non-empty strings. For this he proves:
  1928   which is the set of all non-empty strings. For this he proves:
  1923 
  1929   %
  1924   \begin{equation}\label{pdersunivone}
  1930   \begin{equation}\label{pdersunivone}
  1925   \mbox{\begin{tabular}{l}
  1931   \mbox{\begin{tabular}{l}
  1926   @{thm pderivs_lang_Zero}\\
  1932   @{thm pderivs_lang_Zero}\\
  1927   @{thm pderivs_lang_One}\\
  1933   @{thm pderivs_lang_One}\\
  1928   @{thm pderivs_lang_Atom}\\
  1934   @{thm pderivs_lang_Atom}\\
  steps: regular expression @{text "\<Rightarrow>"} non-deterministic automaton @{text
  "\<Rightarrow>"} deterministic automaton @{text "\<Rightarrow>"} complement automaton @{text "\<Rightarrow>"}
  regular expression. Clearly not something you want to formalise in a theorem
  prover if it is cumbersome to reason about automata.

  %Once closure under complement is established, closure under intersection
  %and set difference is also easy, because
  %
  %\begin{center}
  %\begin{tabular}{c}
  %@{term "A \<inter> B = - (- A \<union> - B)"}\\
  %@{term "A - B = - (- A \<union> B)"}
  %\end{tabular}
  %\end{center}
  %
  %\noindent
  %Since all finite languages are regular, by closure under complement so are
  %all co-finite languages. 
  %
  %Closure of regular languages under reversal, that is
  %
  %\begin{center}
  %@{text "A\<^bsup>-1\<^esup> \<equiv> {s\<^bsup>-1\<^esup> | s \<in> A}"}
  %\end{center}
  %
  %\noindent 
  %can be shown with the help of the following operation defined recursively over 
  %regular expressions
  %
  %\begin{center}
  %\begin{tabular}{r@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
  %@{thm (lhs) Rev.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) Rev.simps(1)}\\
  %@{thm (lhs) Rev.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) Rev.simps(2)}\\
  %@{thm (lhs) Rev.simps(3)} & @{text "\<equiv>"} & @{thm (rhs) Rev.simps(3)}\\
  %@{thm (lhs) Rev.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]} & @{text "\<equiv>"} & 
  %    @{thm (rhs) Rev.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\
  %@{thm (lhs) Rev.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]} & @{text "\<equiv>"} & 
  %    @{thm (rhs) Rev.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\
  %@{thm (lhs) Rev.simps(6)} & @{text "\<equiv>"} & @{thm (rhs) Rev.simps(6)}\\
  %\end{tabular}
  %\end{center}
  %
  %\noindent
  %For this operation we can show
  %
  %\begin{center}
  %@{text "(\<calL>(r))\<^bsup>-1\<^esup>"}~@{text "="}~@{thm (rhs) rev_lang}
  %\end{center}
  %
  %\noindent
  %from which closure under reversal of regular languages follows.

  A perhaps surprising fact is that regular languages are closed under any
  left-quotient. Define

  \begin{center}
  using partial derivatives: From @{text A} being regular we know there exists
  a regular expression @{text r} such that @{term "A = lang r"}. We also know
  that @{term "pderivs_lang B r"} is finite for every language @{text B} and 
  regular expression @{text r} (recall Theorem~\ref{antimirov}). By definition 
  and \eqref{Derspders} we have
  %
  \begin{equation}\label{eqq}
  @{term "Deriv_lang B (lang r) = (\<Union> lang ` (pderivs_lang B r))"}
  \end{equation}

  \noindent
  B r"}, we know by \eqref{uplus} that there exists a regular expression so that
  the right-hand side of \eqref{eqq} is equal to the language \mbox{@{term "lang (\<Uplus>(pderivs_lang B
  r))"}}. Thus the regular expression @{term "\<Uplus>(pderivs_lang B r)"} verifies that
  @{term "Deriv_lang B A"} is regular.
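  The construction can be illustrated with a small executable sketch (ours, in
  Python rather than Isabelle/HOL) of Antimirov-style partial derivatives. The
  tuple encoding of regular expressions and the names @{text nullable},
  @{text pder}, @{text pders} and @{text matches} are our own; the sketch shows
  that a string is matched iff some partial derivative accepts the empty string.

```python
# Regexes as hashable tuples (our encoding):
# ("0",), ("1",), ("atom", c), ("+", r, s), (".", r, s), ("*", r)

def nullable(r):
    """Does the language of r contain the empty string?"""
    t = r[0]
    if t in ("1", "*"):
        return True
    if t == "+":
        return nullable(r[1]) or nullable(r[2])
    if t == ".":
        return nullable(r[1]) and nullable(r[2])
    return False                              # "0" and atoms

def pder(c, r):
    """Antimirov partial derivative: a *set* of regexes."""
    t = r[0]
    if t == "atom":
        return {("1",)} if r[1] == c else set()
    if t == "+":
        return pder(c, r[1]) | pder(c, r[2])
    if t == ".":
        s = {(".", p, r[2]) for p in pder(c, r[1])}
        return (s | pder(c, r[2])) if nullable(r[1]) else s
    if t == "*":
        return {(".", p, r) for p in pder(c, r[1])}
    return set()                              # "0" and "1"

def pders(s, rs):
    """Extend pder from characters to strings and sets of regexes."""
    for c in s:
        rs = set().union(*(pder(c, r) for r in rs))
    return rs

def matches(r, s):
    # s is in lang r  iff  [] is in the left-quotient Ders s (lang r)
    return any(nullable(p) for p in pders(s, {r}))

r = (".", ("*", ("atom", "a")), ("atom", "b"))    # a* b
assert matches(r, "aaab") and not matches(r, "aba")
```

  The point mirrors the equation above: the left-quotient of @{term "lang r"}
  is the union of the languages of the partial derivatives.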

  Even more surprising is the fact, given first by \citeN{Haines69}, that for
  \emph{every} language @{text A} the language
  consisting of all (scattered) substrings of @{text A} is regular (see also 
  \cite{Shallit08,Gasarch09}). 
  A \emph{(scattered) substring} can be obtained
  by striking out zero or more characters from a string. This can be defined 
  inductively in Isabelle/HOL by the following three rules:

  \begin{center}
  @{thm[mode=Axiom] emb0}\hspace{10mm} 
  @{thm[mode=Rule] emb1}\hspace{10mm} 
  @{thm[mode=Rule] emb2}
  \end{center}
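  The three rules amount to the usual one-pass subsequence test, which can be
  sketched in a few lines of Python (our illustration; the helper name
  @{text is_subseq} is not part of the formalisation):

```python
def is_subseq(x, y):
    """x is a scattered substring of y: strike out zero or more chars of y."""
    it = iter(y)
    return all(c in it for c in x)   # each 'c in it' scans y left-to-right

assert is_subseq("", "abc")          # the empty string is below everything
assert is_subseq("ac", "abc")        # strike out 'b'
assert not is_subseq("ca", "abc")    # order of characters matters
```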

  \noindent
  It is straightforward to prove that @{text "\<preceq>"} is a partial order. Now define the 
  \emph{language of substrings} and \emph{superstrings} of a language @{text A} 
  \end{center}
  
  \noindent
  We would like to establish

  \begin{thrm}[\cite{Haines69}]\label{subseqreg}
  For every language @{text A}, the languages @{text "(i)"} @{term "SUBSEQ A"} and 
  @{text "(ii)"} @{term "SUPSEQ A"}
  are regular.
  \end{thrm}

  \noindent
  Our proof follows the one given by \citeN[Pages 92--95]{Shallit08}, except that we use
  Higman's Lemma, which is already proved in the Isabelle/HOL library by
  Sternagel.
  Higman's Lemma allows us to infer that every language @{text A} forming an antichain, that is, satisfying
  %
  \begin{equation}\label{higman}
  @{text "\<forall>x, y \<in> A."}~@{term "x \<noteq> y \<longrightarrow> \<not>(x \<preceq> y) \<and> \<not>(y \<preceq> x)"}
  \end{equation} 

  \noindent
  is finite.

  The first step in our proof of Theorem~\ref{subseqreg} is to establish the 
  following simple properties for @{term SUPSEQ}
  %
  \begin{equation}\label{supseqprops}
  \mbox{\begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
  @{thm (lhs) SUPSEQ_simps(1)} & @{text "\<equiv>"} & @{thm (rhs) SUPSEQ_simps(1)}\\  
  @{thm (lhs) SUPSEQ_simps(2)} & @{text "\<equiv>"} & @{thm (rhs) SUPSEQ_simps(2)}\\ 
  @{thm (lhs) SUPSEQ_atom} & @{text "\<equiv>"} & @{thm (rhs) SUPSEQ_atom}\\ 
  the other direction we have  @{term "x \<in> SUPSEQ A"}. From this we obtain 
  a @{text y} such that @{term "y \<in> A"} and @{term "y \<preceq> x"}. Since we have that
  the relation \mbox{@{term "{(y, x). y \<preceq> x \<and> x \<noteq> y}"}} is well-founded, there must
  be a minimal element @{text "z"} such that @{term "z \<in> A"} and @{term "z \<preceq> y"}, 
  and hence by transitivity also \mbox{@{term "z \<preceq> x"}} (here we deviate from the argument 
  given by \citeN{Shallit08}, because Isabelle/HOL already provides an extensive infrastructure
  for reasoning about well-foundedness). Since @{term "z"} is
  minimal and an element of @{term A}, we also know that @{term z} is in @{term M}.
  From this together with \mbox{@{term "z \<preceq> x"}}, we can infer that @{term x} is in 
  @{term "SUPSEQ M"}, as required.
  \end{proof}
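  The two facts the proof relies on can be checked concretely on a finite
  sample in a short Python sketch (ours; the sample set @{text A} and the
  helper @{text is_subseq} are illustrative only): the minimal elements
  @{term M} of @{term A} under @{text "\<preceq>"} form an antichain, and every
  string in @{term A} lies above some minimal element.

```python
def is_subseq(x, y):
    """x ⪯ y: x arises from y by striking out zero or more characters."""
    it = iter(y)
    return all(c in it for c in x)

A = {"ab", "aab", "b", "abb", "ba"}              # a finite sample language
M = {x for x in A                                 # minimal elements of A
     if not any(y != x and is_subseq(y, x) for y in A)}

# M is an antichain w.r.t. ⪯ ...
assert all(not is_subseq(x, y) for x in M for y in M if x != y)
# ... and every element of A sits above some minimal element
assert all(any(is_subseq(z, x) for z in M) for x in A)
```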
  which establishes the second part.    
  \end{proof}

  \noindent
  In order to establish the first part of this theorem, we use the
  property proved by \citeN{Shallit08}, namely that
  %
  \begin{equation}\label{compl}
  @{thm SUBSEQ_complement}
  \end{equation}

  \noindent
  expressions can conveniently be defined as a datatype in theorem
  provers. For us it was therefore interesting to find out how far we can push
  this point of view. We have established in Isabelle/HOL both directions 
  of the Myhill-Nerode Theorem.
  %
  \begin{thrm}[(Myhill-Nerode Theorem)]\mbox{}\\
  A language @{text A} is regular if and only if @{thm (rhs) Myhill_Nerode}.
  \end{thrm}

  \noindent
  Having formalised this theorem means we pushed our point of view quite
  far. Using this theorem we can obviously prove when a language is \emph{not}
  regular---by establishing that it has infinitely many equivalence classes
  generated by the Myhill-Nerode Relation (this is usually the purpose of the
  Pumping Lemma).  We can also use it to establish the standard
  textbook results about closure properties of regular languages. The case of
  closure under complement is interesting, because it seems difficult to
  construct a regular expression for the complement language by direct
  means. However the existence of such a regular expression can be easily
  proved using the Myhill-Nerode Theorem.  
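  The non-regularity argument can be made concrete with a small Python sketch
  (our illustration, not part of the formalisation): for
  @{text "A = {a\<^sup>n b\<^sup>n}"} the strings @{text "a\<^sup>i"} are pairwise inequivalent
  under the Myhill-Nerode Relation, since the extension @{text "b\<^sup>i"}
  distinguishes @{text "a\<^sup>i"} from @{text "a\<^sup>j"}. Checking finitely many cases
  is of course only evidence, not the proof; the names @{text in_A} and
  @{text distinguishes} are our own.

```python
def in_A(s):
    """Membership in A = {a^n b^n | n >= 0}."""
    n = len(s) // 2
    return s == "a" * n + "b" * n

def distinguishes(z, x, y):
    """z witnesses that x and y are in different Myhill-Nerode classes."""
    return in_A(x + z) != in_A(y + z)

# b^i separates a^i from a^j whenever i != j, so no two of a, aa, aaa, ...
# share a class -- infinitely many classes, hence A is not regular.
assert all(distinguishes("b" * i, "a" * i, "a" * j)
           for i in range(1, 6) for j in range(1, 6) if i != j)
```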

  %Our insistence on regular expressions for proving the Myhill-Nerode Theorem
  %arose from the problem of defining formally the regularity of a language.
  %In order to guarantee consistency,
  %formalisations in HOL can only extend the logic with definitions that introduce a new concept in
  %terms of already existing notions. A convenient definition for automata
  %(based on graphs) uses a polymorphic type for the state nodes. This allows
  %us to use the standard operation for disjoint union whenever we need to compose two
  %automata. Unfortunately, we cannot use such a polymorphic definition
  %in HOL as part of the definition for regularity of a language (a predicate
  %over sets of strings).  Consider for example the following attempt:
  %
  %\begin{center}
  %@{text "is_regular A \<equiv> \<exists>M(\<alpha>). is_dfa (M) \<and> \<calL>(M) = A"}
  %\end{center}
  %
  %\noindent
  %In this definition, the definiens is polymorphic in the type of the automata
  %@{text "M"} (indicated by the dependency on the type-variable @{text "\<alpha>"}), but the definiendum
  %@{text "is_regular"} is not. Such definitions are excluded from HOL, because
  %they can easily lead to inconsistencies (see \cite{PittsHOL4} for a simple
  %example). Also HOL does not contain type-quantifiers which would allow us to
  %get rid of the polymorphism by quantifying over the type-variable 
  %@{text "\<alpha>"}. Therefore when defining regularity in terms of automata, the
  %natural way out in HOL is to resort to state nodes with an identity, for
  %example a natural number. Unfortunately, the consequence is that we have to
  %be careful when combining two automata so that there is no clash between two
  %such states. This makes formalisations quite fiddly and rather
  %unpleasant. Regular expressions proved much more convenient for reasoning in
  %HOL since they can be defined as an inductive datatype and a reasoning
  %infrastructure comes for free. The definition of regularity in terms of
  %regular expressions poses no problem at all for HOL.  We showed in this
  %paper that they can be used for establishing the central result in regular
  %language theory---the Myhill-Nerode Theorem.

  While regular expressions are convenient, they have some limitations. One is
  that there seems to be no method of calculating a minimal regular expression
  (for example in terms of length) for a regular language, like there is for
  automata. On the other hand, efficient regular expression matching, without
  using automata, poses no problem as shown by \citeN{OwensReppyTuron09}.  For an
  implementation of a simple regular expression matcher, whose correctness has
  been formally established, we refer the reader to 
  \citeN{OwensSlind08}. In our opinion, their formalisation is considerably
  slicker than for example the approach to regular expression matching taken
  by \citeN{Harper99} and by \citeN{Yi06}.

  Our proof of the first direction is very much inspired by \emph{Brzozowski's
  algebraic method} \citeyear{Brzozowski64} used to convert a finite automaton to a regular expression. 
  The close connection can be seen by considering the
  equivalence classes as the states of the minimal automaton for the regular
  language.  However there are some subtle differences. Because our equivalence 
  classes (or correspondingly states) arise from the Myhill-Nerode Relation, the most natural
  choice is to characterise each state with the set of strings starting from
  the initial state leading up to that state. Usually, however, the states are
  2382   We presented two proofs for the second direction of the Myhill-Nerode
  2392   We presented two proofs for the second direction of the Myhill-Nerode
  2383   Theorem. One direct proof using tagging-functions and another using partial
  2393   Theorem. One direct proof using tagging-functions and another using partial
  2384   derivatives. This part of our work is where our method using regular
  2394   derivatives. This part of our work is where our method using regular
  2385   expressions shines, because we can completely side-step the standard
  2395   expressions shines, because we can completely side-step the standard
  2386   argument \cite{Kozen97} where automata need to be composed. However, it is
  2396   argument (for example used by \citeN{Kozen97}) where automata need to be composed. However, it is
  2387   also the direction where we had to spend most of the `conceptual' time, as
  2397   also the direction where we had to spend most of the `conceptual' time, as
  2388   our first proof based on tagging-functions is new for establishing the
  2398   our first proof based on tagging-functions is new for establishing the
  2389   Myhill-Nerode Theorem. All standard proofs of this direction proceed by
  2399   Myhill-Nerode Theorem. All standard proofs of this direction proceed by
  2390   arguments over automata.
  2400   arguments over automata.
  2391   
  2401   
  2392   The indirect proof for the second direction arose from our interest in
  2402   The indirect proof for the second direction arose from our interest in
  2393   Brzozowski's derivatives for regular expression matching.  While Brzozowski 
  2403   Brzozowski's derivatives for regular expression matching.  While \citeN{Brzozowski64} 
  2394   already established that there are only
  2404   already established that there are only
  2395   finitely many dissimilar derivatives for every regular expression, this
  2405   finitely many dissimilar derivatives for every regular expression, this
  2396   result is not as straightforward to formalise in a theorem prover as one
  2406   result is not as straightforward to formalise in a theorem prover as one
  2397   might wish. The reason is that the set of dissimilar derivatives is not
  2407   might wish. The reason is that the set of dissimilar derivatives is not
  2398   defined inductively, but in terms of an ACI-equivalence relation. This
  2408   defined inductively, but in terms of an ACI-equivalence relation. This
  2399   difficulty prevented for example Krauss and Nipkow to prove termination of
  2409   difficulty prevented for example \citeN{KraussNipkow11} to prove termination of
  2400   their equivalence checker for regular expressions
  2410   their equivalence checker for regular expressions. Their checker is based on Brzozowski's derivatives
  2401   \cite{KraussNipkow11}. Their checker is based on Brzozowski's derivatives
       
  2402   and for their argument the lack of a formal proof of termination is not
  2411   and for their argument the lack of a formal proof of termination is not
  2403   crucial (it merely lets them ``sleep better'' \cite{KraussNipkow11}).  We
  2412   crucial (it merely lets them ``sleep better'' \cite{KraussNipkow11}).  We
  2404   expect that their development simplifies by using partial derivatives,
  2413   expect that their development simplifies by using partial derivatives,
  2405   instead of derivatives, and that the termination of the algorithm can be
  2414   instead of derivatives, and that the termination of the algorithm can be
  2406   formally established (the main ingredient is
  2415   formally established (the main ingredient is
  2407   Theorem~\ref{antimirov}). However, since partial derivatives use sets of
  2416   Theorem~\ref{antimirov}). However, since partial derivatives use sets of
  2408   regular expressions, one needs to carefully analyse whether the resulting
  2417   regular expressions, one needs to carefully analyse whether the resulting
  2409   algorithm is still executable. Given the existing infrastructure for
  2418   algorithm is still executable. Given the infrastructure for
  2410   executable sets in Isabelle/HOL \cite{Haftmann09}, it should.
  2419   executable sets  introduced by \citeN{Haftmann09} in Isabelle/HOL, it should.
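  The termination ingredient can be demonstrated on a small scale with a
  Python sketch (ours): iterating partial derivatives from a regular
  expression saturates after finitely many steps, in contrast to unquotiented
  Brzozowski derivatives, which need ACI-equivalence to stay finite. The tuple
  encoding and the name @{text saturate} are our own.

```python
# Regexes as hashable tuples (our encoding):
# ("0",), ("1",), ("atom", c), ("+", r, s), (".", r, s), ("*", r)

def nullable(r):
    t = r[0]
    return (t in ("1", "*")
            or (t == "+" and (nullable(r[1]) or nullable(r[2])))
            or (t == "." and nullable(r[1]) and nullable(r[2])))

def pder(c, r):
    """Antimirov partial derivative of r w.r.t. character c (a set)."""
    t = r[0]
    if t == "atom":
        return {("1",)} if r[1] == c else set()
    if t == "+":
        return pder(c, r[1]) | pder(c, r[2])
    if t == ".":
        s = {(".", p, r[2]) for p in pder(c, r[1])}
        return (s | pder(c, r[2])) if nullable(r[1]) else s
    if t == "*":
        return {(".", p, r) for p in pder(c, r[1])}
    return set()

def saturate(r, alphabet):
    """Close {r} under taking partial derivatives; terminates by finiteness."""
    seen, todo = {r}, [r]
    while todo:
        x = todo.pop()
        for c in alphabet:
            for p in pder(c, x) - seen:
                seen.add(p)
                todo.append(p)
    return seen

r = (".", ("*", ("+", ("atom", "a"), ("atom", "b"))), ("atom", "a"))  # (a+b)* a
assert len(saturate(r, "ab")) <= 4   # the fixed point is small and finite
```

  The worklist loop is exactly the computation whose termination
  Theorem~\ref{antimirov} would justify in general.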
  2411 
  2420 
  2412   Our formalisation of the Myhill-Nerode Theorem consists of 780 lines of
  2421   Our formalisation of the Myhill-Nerode Theorem consists of 780 lines of
  2413   Isabelle/Isar code for the first direction and 460 for the second (the one
  2422   Isabelle/Isar code for the first direction and 460 for the second (the one
  2414   based on tagging-functions), plus around 300 lines of standard material
  2423   based on tagging-functions), plus around 300 lines of standard material
  2415   about regular languages. The formalisation of derivatives and partial
  2424   about regular languages. The formalisation of derivatives and partial
  2416   derivatives shown in Section~\ref{derivatives} consists of 390 lines of
  2425   derivatives shown in Section~\ref{derivatives} consists of 390 lines of
  2417   code.  The closure properties in Section~\ref{closure} (except Theorem~\ref{subseqreg}) 
  2426   code.  The closure properties in Section~\ref{closure} (except Theorem~\ref{subseqreg}) 
  2418   can be established in
  2427   can be established in
  2419   190 lines of code. The Continuation Lemma and the non-regularity of @{text "a\<^sup>n b\<^sup>n"} 
  2428   100 lines of code. The Continuation Lemma and the non-regularity of @{text "a\<^sup>n b\<^sup>n"} 
  2420   require 70 lines of code.
  2429   require 70 lines of code.
  2421   The algorithm for solving equational systems, which we
  2430   The algorithm for solving equational systems, which we
  2422   used in the first direction, is conceptually relatively simple. Still the
  2431   used in the first direction, is conceptually relatively simple. Still the
  2423   use of sets over which the algorithm operates means it is not as easy to
  2432   use of sets over which the algorithm operates means it is not as easy to
  2424   formalise as one might hope. However, it seems sets cannot be avoided since
  2433   formalise as one might hope. However, it seems sets cannot be avoided since
  2425   the `input' of the algorithm consists of equivalence classes and we cannot
  2434   the `input' of the algorithm consists of equivalence classes and we cannot
  2426   see how to reformulate the theory so that we can use lists or matrices. Lists would be
  2435   see how to reformulate the theory so that we can use lists or matrices. Lists would be
  2427   much easier to reason about, since we can define functions over them by
  2436   much easier to reason about, since we can define functions over them by
  2428   recursion. For sets we have to use set-comprehensions, which is slightly
  2437   recursion. For sets we have to use set-comprehensions, which is slightly
  2429   unwieldy. Matrices would allow us to use the slick formalisation by 
  2438   unwieldy. Matrices would allow us to use the slick formalisation by 
  2430   Nipkow of the Gauss-Jordan algorithm \cite{Nipkow11}.
  2439   \citeN{Nipkow11} of the Gauss-Jordan algorithm.
  2431 
  2440 
  While our formalisation might appear large, it should be seen
  in the context of the work done by \citeN{Constable00}, who
  formalised the Myhill-Nerode Theorem in Nuprl using automata. They write
  that their four-member team would need something of the order of 18 months
  for their formalisation of the first eleven chapters of the textbook by \citeN{HopcroftUllman69},
  which includes the Myhill-Nerode theorem. It is hard to gauge the size of a
  formalisation in Nuprl, but from what is shown in the Nuprl Math Library
  about their development it seems substantially larger than ours. We attribute
  this to our use of regular expressions, which meant we did not need to `fight'
  the theorem prover.
  Also, \citeN{Filliatre97} reports that his formalisation in
  Coq of automata theory and Kleene's theorem is ``rather big''.
  More recently, \citeN{Almeidaetal10} reported on another
  formalisation of regular languages in Coq. Their
  main result is the
  correctness of Mirkin's construction of an automaton from a regular
  expression using partial derivatives. This took approximately 10600 lines
  of code.  Also \citeN{Braibant12} formalised a large part of regular language
  theory and Kleene algebras in Coq. While he is mainly interested
  in implementing decision procedures for Kleene algebras, his library
  includes a proof of the Myhill-Nerode theorem. He reckons that our
  ``development is more concise'' than his, which is based on matrices \cite[Page 67]{Braibant12}.
  In terms of time, the estimate for our formalisation is that we
  needed approximately 3 months, and this included the time to find our proof
  arguments. Unlike \citeN{Constable00}, who were able to follow the Myhill-Nerode
  proof by \citeN{HopcroftUllman69}, we had to find our own arguments.  So for us the
  formalisation was not the bottleneck.  The code of
  our formalisation can be found in the Archive of Formal Proofs at
  \mbox{\url{http://afp.sourceforge.net/entries/Myhill-Nerode.shtml}}
  \cite{myhillnerodeafp11}.\smallskip

  \noindent
  {\bf Acknowledgements:}
  We are grateful for the comments we received from Larry Paulson.  Tobias
  Nipkow made us aware of the properties in Theorem~\ref{subseqreg} and Tjark
  Weber helped us with proving them. Christian Sternagel provided us with a
  version of Higman's Lemma that applies to arbitrary but finite alphabets.

  \bibliographystyle{acmtrans}
  \bibliography{root}

  \newpage
  \begin{appendix}
  \section{Appendix$^\star$}