Journal/Paper.thy
changeset 198 b300f2c5d51d
parent 197 cf1c17431dab
child 199 11c3c302fa2e
  Cons ("_ :: _" [100, 100] 100) and
  Rev ("Rev _" [1000] 100) and
  Der ("Der _ _" [1000, 1000] 100) and
  ONE ("ONE" 999) and
  ZERO ("ZERO" 999) and
  pders_lang ("pdersl") and
  UNIV1 ("UNIV\<^isup>+") and
  Ders_lang ("Dersl")

lemma meta_eq_app:
  shows "f \<equiv> \<lambda>x. g x \<Longrightarrow> f x \<equiv> g x"
  by auto

  limitations that hurt badly if one attempts a simple-minded formalisation
  of regular languages in it.

  The typical approach to regular languages is to
  introduce finite automata and then define everything in terms of them
  \cite{HopcroftUllman69,Kozen97}. For example, a regular language is
  normally defined as:

  \begin{dfntn}\label{baddef}
  A language @{text A} is \emph{regular}, provided there is a
  finite deterministic automaton that recognises all strings of @{text "A"}.
  \end{dfntn}
  \end{equation}

  \noindent
  changes the type---the disjoint union is not a set, but a set of
  pairs. Using this definition for disjoint union means we do not have a
  single type for the states of automata. As a result we will not be able to define a
  regular language as one for which there exists an automaton that recognises
  all its strings (Definition~\ref{baddef}). This is because we cannot make a
  definition in HOL that is only polymorphic in the state type and there is no type
  quantification available in HOL (unlike in Coq, for example).\footnote{Slind
  already pointed out this problem in an email to the HOL4 mailing list on
  21st April 2005.}

  An alternative, which provides us with a single type for the states of automata,
  is to give every state node an identity, for example a natural number, and
  then be careful to rename these identities apart whenever connecting two
  automata. This results in clunky proofs establishing that properties are
  invariant under renaming. Similarly, connecting two automata represented as
  matrices results in very ad hoc constructions, which are not pleasant to
  reason about.

  Functions are much better supported in Isabelle/HOL, but they still lead to similar
  problems as with graphs.  Composing, for example, two non-deterministic automata in parallel
  also requires the formalisation of disjoint unions. Nipkow \cite{Nipkow98} 
  dismisses for this the option of using identities, because it leads according to 
  "\<lbrakk>x\<rbrakk>\<^isub>\<approx> = \<lbrakk>y\<rbrakk>\<^isub>\<approx>"}.


  Central to our proof will be the solution of equational systems
  involving equivalence classes of languages. For this we will use Arden's Lemma
  (see for example \cite[Page 100]{Sakarovitch09}),
  which solves equations of the form @{term "X = A \<cdot> X \<union> B"} provided
  @{term "[] \<notin> A"}. However, we will need the following `reverse'
  version of Arden's Lemma (`reverse' in the sense of changing the order of @{term "A \<cdot> X"} to
  \mbox{@{term "X \<cdot> A"}}).

  @{thm (rhs) arden}.
  \end{lmm}

  \begin{proof}
  For the right-to-left direction we assume @{thm (rhs) arden} and show
  that @{thm (lhs) arden} holds. From Property~\ref{langprops}@{text "(i)"}
  we have @{term "A\<star> = A \<cdot> A\<star> \<union> {[]}"},
  which is equal to @{term "A\<star> = A\<star> \<cdot> A \<union> {[]}"}. Adding @{text B} to both
  sides gives @{term "B \<cdot> A\<star> = B \<cdot> (A\<star> \<cdot> A \<union> {[]})"}, whose right-hand side
  is equal to @{term "(B \<cdot> A\<star>) \<cdot> A \<union> B"}. This completes this direction.

  Using this property we can show that @{term "B \<cdot> (A \<up> n) \<subseteq> X"} holds for
  all @{text n}. From this we can infer @{term "B \<cdot> A\<star> \<subseteq> X"} using the definition
  of @{text "\<star>"}.
  For the inclusion in the other direction we assume that a string @{text s}
  of length @{text k} is an element of @{text X}. Since @{thm (prem 1) arden}
  we know by Property~\ref{langprops}@{text "(ii)"} that
  @{term "s \<notin> X \<cdot> (A \<up> Suc k)"} since its length is only @{text k}
  (the strings in @{term "X \<cdot> (A \<up> Suc k)"} are all longer).
  From @{text "(*)"} it then follows that
  @{term s} must be an element in @{term "(\<Union>m\<in>{0..k}. B \<cdot> (A \<up> m))"}. This in turn
  implies that @{term s} is in @{term "(\<Union>n. B \<cdot> (A \<up> n))"}. Using Property~\ref{langprops}@{text "(iii)"}
  this is equal to @{term "B \<cdot> A\<star>"}, as we needed to show.
  \end{proof}
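
  \noindent
  As a small illustration (this example is ours and not part of the
  formalisation): for @{text "A = {[a]}"} and @{text "B = {[b]}"} the only
  language satisfying @{text "X = X \<cdot> A \<union> B"} is @{text "B \<cdot> A\<star>"}, i.e.~the set
  of all strings consisting of one @{text b} followed by arbitrarily many
  @{text a}s. The side-condition @{text "[] \<notin> A"} is essential: for
  @{text "A = {[]}"} the equation collapses to @{text "X = X \<union> B"}, which is
  satisfied by every language containing @{text B}, so uniqueness is lost.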

  \noindent
  Regular expressions are defined as the inductive datatype

  equivalence class in \mbox{@{term "finals A"}} (which by assumption must be
  a finite set), then we can use @{text "\<bigplus>"} to obtain a regular expression
  that matches every string in @{text A}.


  Our proof of Theorem~\ref{myhillnerodeone} relies on a method that can calculate a
  regular expression for \emph{every} equivalence class, not just the ones
  in @{term "finals A"}. We
  first define the notion of \emph{one-character-transition} between
  two equivalence classes
  %
  \begin{lmm}\label{inv}
  If @{thm (prem 1) test} then @{text "X = \<Union> \<calL> ` rhs"}.
  \end{lmm}

  \noindent
  Our proof of Theorem~\ref{myhillnerodeone} will proceed by transforming the
  initial equational system into one in \emph{solved form} maintaining the invariant
  in Lemma~\ref{inv}. From the solved form we will be able to read
  off the regular expressions.
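
  As a small informal illustration (ours, and glossing over the precise
  syntactic form of the equations in the formalisation): for the language
  @{text "A = {a}\<star>"} over the alphabet @{text "{a, b}"} there are two
  equivalence classes, @{text "X\<^isub>1 = {a}\<star>"}, which contains @{text "[]"}, and
  @{text "X\<^isub>2"}, which contains all strings with at least one @{text "b"}. The
  initial equational system then consists, roughly, of the two equations

  \begin{center}
  \begin{tabular}{rcl}
  @{text "X\<^isub>1"} & @{text "="} & @{text "(X\<^isub>1, ATOM a) + \<lambda>(ONE)"}\\
  @{text "X\<^isub>2"} & @{text "="} & @{text "(X\<^isub>1, ATOM b) + (X\<^isub>2, ATOM a) + (X\<^isub>2, ATOM b)"}\\
  \end{tabular}
  \end{center}

  \noindent
  and applying the Arden operation to the first equation turns it into the
  solved form @{text "X\<^isub>1 = \<lambda>(TIMES ONE (STAR (ATOM a)))"}, from which a
  regular expression for @{text "X\<^isub>1"}, matching exactly @{text "{a}\<star>"}, can be
  read off.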

  In order to transform an equational system into solved form, we have two
  operations: one that takes an equation of the form @{text "X = rhs"} and removes
  any recursive occurrences of @{text X} in the @{text rhs} using our variant of Arden's

  \begin{center}
  @{thm Solve_def}
  \end{center}

  \noindent
  We are not concerned here with the definition of this operator (see
  Berghofer and Nipkow \cite{BerghoferNipkow00} for example), but note that we
  eliminate in each @{const Iter}-step a single equation, and therefore have a
  well-founded termination order by taking the cardinality of the equational
  system @{text ES}. This enables us to prove properties about our definition
  of @{const Solve} when we `call' it with the equivalence class @{text X} and
  the initial equational system @{term "Init (UNIV // \<approx>A)"} from
  \eqref{initcs} using the principle:

  \begin{equation}\label{whileprinciple}
  \mbox{\begin{tabular}{l}
  @{term "invariant (Init (UNIV // \<approx>A))"} \\
  @{term "\<forall>ES. invariant ES \<and> Cond ES \<longrightarrow> invariant (Iter X ES)"}\\
  @{term "\<forall>ES. invariant ES \<and> Cond ES \<longrightarrow> card (Iter X ES) < card ES"}\\

  @{thm[mode=IfThen] Init_ES_satisfies_invariant}
  \end{lmm}

  \begin{proof}
  Finiteness is given by the assumption and the way we set up the
  initial equational system. Soundness is proved in Lemma~\ref{inv}. Distinctness
  follows from the fact that the equivalence classes are disjoint. The @{text ardenable}
  property also follows from the setup of the initial equational system, as does
  validity.
  \end{proof}

  \end{center}

  \noindent
  Finiteness is straightforward, as the @{const Subst} and @{const Arden} operations
  keep the equational system finite. These operations also preserve soundness
  and distinctness (we proved soundness for @{const Arden} in Lemma~\ref{ardenable}).
  The property @{text ardenable} is clearly preserved because the append-operation
  cannot make a regular expression match the empty string. Validity is
  given because @{const Arden} removes an equivalence class from @{text yrhs}
  and then @{const Subst_all} removes @{text Y} from the equational system.
  Having proved the implication above, we can instantiate @{text "ES"} with @{text "ES - {(Y, yrhs)}"}

  and @{term "invariant {(X, rhs)}"}.
  \end{lmm}

  \begin{proof}
  In order to prove this lemma using \eqref{whileprinciple}, we have to use a slightly
  stronger invariant since Lemmas~\ref{iterone} and \ref{itertwo} have the precondition
  that @{term "(X, rhs) \<in> ES"} for some @{text rhs}. This precondition is needed
  in order to choose in the @{const Iter}-step an equation that is not \mbox{@{term "X = rhs"}}.
  Therefore our invariant cannot be just @{term "invariant ES"}, but must be
  @{term "invariant ES \<and> (\<exists>rhs. (X, rhs) \<in> ES)"}. By assumption
  @{thm (prem 2) Solve} and Lemma~\ref{invzero}, the more general invariant holds for
  the initial equational system. This is premise 1 of~\eqref{whileprinciple}.
  Premise 2 is given by Lemma~\ref{iterone} and the fact that @{const Iter} might
  modify the @{text rhs} in the equation @{term "X = rhs"}, but does not remove it.
  Premise 3 of~\eqref{whileprinciple} is by Lemma~\ref{itertwo}. Now in premise 4
  we would like to show that there exists a @{text rhs} such that @{term "ES = {(X, rhs)}"}
  and that @{text "invariant {(X, rhs)}"} holds, provided the condition @{text "Cond"}
  does not hold. By the stronger invariant we know there exists such a @{text "rhs"}
  with @{term "(X, rhs) \<in> ES"}. Because @{text Cond} is not true, we know the cardinality
  of @{text ES} is @{text 1}. This means @{text "ES"} must actually be the set @{text "{(X, rhs)}"},

  By the preceding lemma, we know that there exists a @{text "rhs"} such
  that @{term "Solve X (Init (UNIV // \<approx>A))"} returns the equation @{text "X = rhs"},
  and that the invariant holds for this equation. That means we
  know @{text "X = \<Union>\<calL> ` rhs"}. We further know that
  this is equal to \mbox{@{text "\<Union>\<calL> ` (Arden X rhs)"}} using the properties of the
  invariant and Lemma~\ref{ardenable}. Using the validity property for the equation @{text "X = rhs"},
  we can infer that @{term "rhss rhs \<subseteq> {X}"} and because the @{text Arden} operation
  removes @{text X} from @{text rhs}, that @{term "rhss (Arden X rhs) = {}"}.
  This means the right-hand side @{term "Arden X rhs"} can only consist of terms of the form @{term "Lam r"}.
  So we can collect those (finitely many) regular expressions @{text rs} and have @{term "X = lang (\<Uplus>rs)"}.
  With this we can conclude the proof.
  \end{proof}

  \noindent
  Lemma~\ref{every_eqcl_has_reg} allows us to finally give a proof for the first direction
  of the Myhill-Nerode theorem.

  \begin{proof}[Proof of Theorem~\ref{myhillnerodeone}]
  By Lemma~\ref{every_eqcl_has_reg} we know that there exists a regular expression for
  every equivalence class in @{term "UNIV // \<approx>A"}. Since @{text "finals A"} is
  a subset of @{term "UNIV // \<approx>A"}, we also know that for every equivalence class
  in @{term "finals A"} there exists a regular expression. Moreover by assumption
  we know that @{term "finals A"} must be finite, and therefore there must be a finite
  set of regular expressions @{text "rs"} such that

  A relation @{text "R\<^isub>1"} is said to \emph{refine} @{text "R\<^isub>2"}
  provided @{text "R\<^isub>1 \<subseteq> R\<^isub>2"}.
  \end{dfntn}

  \noindent
  For constructing @{text R}, we will rely on some \emph{tagging-functions}
  defined over strings. Given the inductive hypothesis, it will be easy to
  prove that the \emph{range} of these tagging-functions is finite. The range
  of a function @{text f} is defined as

  \begin{center}
  @{text "range f \<equiv> f ` UNIV"}
  \end{center}

  In \eqref{finiteimageD} we set @{text f} to be @{text "X \<mapsto> tag ` X"}. Then
  @{text "range f"} is a subset of @{term "Pow (range tag)"}, which we know must be
  finite by assumption. Now @{term "f (UNIV // =tag=)"} is a subset of @{text "range f"},
  and so also finite. Injectivity amounts to showing that @{text "X = Y"} under the
  assumptions that @{text "X, Y \<in> "}~@{term "UNIV // =tag="} and @{text "f X = f Y"}.
  From the assumptions we obtain \mbox{@{text "x \<in> X"}} and @{text "y \<in> Y"} with
  @{text "tag x = tag y"}. Since @{text x} and @{text y} are tag-related, this in
  turn means that the equivalence classes @{text X}
  and @{text Y} must be equal. Therefore \eqref{finiteimageD} allows us to conclude
  with @{thm (concl) finite_eq_tag_rel}.
  \end{proof}
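
  \noindent
  To give a concrete impression of how such tagging-relations can be set up,
  here is a minimal sketch in Isabelle/HOL (the name @{text "tag_eq_rel"} and
  the exact formulation are ours and need not coincide with the formalisation):

    (* two strings are related iff the tagging-function gives them the same tag *)
    definition
      tag_eq_rel :: "('a list \<Rightarrow> 'b) \<Rightarrow> ('a list \<times> 'a list) set"
    where
      "tag_eq_rel tag \<equiv> {(x, y). tag x = tag y}"

  \noindent
  Lemma~\ref{finone} then states that whenever @{text "range tag"} is finite,
  the quotient @{text "UNIV // tag_eq_rel tag"} is also finite.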

  \begin{lmm}\label{fintwo}
  Given two equivalence relations @{text "R\<^isub>1"} and @{text "R\<^isub>2"}, where
  @{text "R\<^isub>1"} refines @{text "R\<^isub>2"}.

  are @{text "R\<^isub>1"}-related. Since by assumption @{text "R\<^isub>1"} refines @{text "R\<^isub>2"},
  they must also be @{text "R\<^isub>2"}-related, as we need to show.
  \end{proof}

  \noindent
  Chaining Lemmas~\ref{finone} and \ref{fintwo} together means that in order to show
  that @{term "UNIV // \<approx>(lang r)"} is finite, we have to construct a tagging-function whose
  range can be shown to be finite and whose tagging-relation refines @{term "\<approx>(lang r)"}.
  Let us attempt the @{const PLUS}-case first. We take as tagging-function

  \begin{center}

  If we know that @{text "(x\<^isub>p, x\<^isub>s) \<in> Partitions x"}, we will
  refer to @{text "x\<^isub>p"} as the \emph{prefix} of the string @{text x},
  and respectively to @{text "x\<^isub>s"} as the \emph{suffix}.
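
  \noindent
  A minimal way to define this auxiliary notion in Isabelle/HOL is the
  following sketch (the formalisation may phrase it slightly differently):

    (* all ways of splitting a string x into a prefix p and a suffix s *)
    definition
      Partitions :: "'a list \<Rightarrow> ('a list \<times> 'a list) set"
    where
      "Partitions x \<equiv> {(p, s). p @ s = x}"

  \noindent
  For example, @{text "Partitions [a, b]"} consists of the three pairs
  @{text "([], [a, b])"}, @{text "([a], [b])"} and @{text "([a, b], [])"}.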

  Now assuming @{term "x @ z \<in> A \<cdot> B"}, there are only two possible ways to `split'
  this string to be in @{term "A \<cdot> B"}:
  %
  \begin{center}
  \begin{tabular}{c}
  \scalebox{1}{

  @{thm (lhs) tag_Times_def[where ?A="A" and ?B="B"]}~@{text "\<equiv>"}~
  @{text "(\<lbrakk>x\<rbrakk>\<^bsub>\<approx>A\<^esub>, {\<lbrakk>x\<^isub>s\<rbrakk>\<^bsub>\<approx>B\<^esub> | x\<^isub>p \<in> A \<and> (x\<^isub>p, x\<^isub>s) \<in> Partitions x})"}
  \end{center}

  \noindent
  Note that we have to make the assumption for all suffixes @{text "x\<^isub>s"}, since we do
  not know anything about how the string @{term x} is partitioned.
  With this definition in place, let us prove the @{const "Times"}-case.


  \begin{proof}[@{const TIMES}-Case]

  @{text "x\<^isub>p < x"} and the rest @{term "x\<^isub>s @ z \<in> A\<star>"}. For example the empty string
  @{text "[]"} would do (recall @{term "x \<noteq> []"}).
  There are potentially many such prefixes, but there can only be finitely many of them (the
  string @{text x} is finite). Let us therefore choose the longest one and call it
  @{text "x\<^bsub>pmax\<^esub>"}. Now for the rest of the string @{text "x\<^isub>s @ z"} we
  know it is in @{term "A\<star>"} and cannot be the empty string. By Property~\ref{langprops}@{text "(iv)"},
  we can separate
  this string into two parts, say @{text "a"} and @{text "b"}, such that @{text "a \<noteq> []"}, @{text "a \<in> A"}
  and @{term "b \<in> A\<star>"}. Now @{text a} must be strictly longer than @{text "x\<^isub>s"},
  otherwise @{text "x\<^bsub>pmax\<^esub>"} is not the longest prefix. That means @{text a}
  `overlaps' with @{text z}, splitting it into two components @{text "z\<^isub>a"} and

  %   & @{thm (rhs) Ders_simps(3)[where ?s1.0="s\<^isub>1" and ?s2.0="s\<^isub>2"]}\\
  \end{tabular}}
  \end{equation}

  \noindent
  where @{text "\<Delta>"} in the fifth line is a function that tests whether the
  empty string is in the language and returns @{term "{[]}"} or @{term "{}"},
  accordingly.  In the last equation we use the list-cons operator written
  \mbox{@{text "_ :: _"}}.  The only interesting case is the @{text "A\<star>"}-case
  where we use Property~\ref{langprops}@{text "(i)"} in order to infer that
  @{term "Der c (A\<star>) = Der c (A \<cdot> A\<star>)"}. We can then complete the proof by
  using the fifth equation and noting that @{term "Delta A \<cdot> Der c (A\<star>) \<subseteq> (Der c A) \<cdot> A\<star>"}.
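
  \noindent
  For instance, for the single-string language @{text "A = {ab}"} the clause
  for @{text "A\<star>"} gives @{text "Der a (A\<star>) = (Der a A) \<cdot> A\<star> = {b} \<cdot> {ab}\<star>"} and
  @{text "Der b (A\<star>) = {} \<cdot> {ab}\<star> = {}"}, since @{text "\<Delta> A = {}"} (a small
  sanity check added here; it is not part of the formalisation).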

  Brzozowski observed that the left-quotients for languages of
  regular expressions can be calculated directly using the notion of
  \emph{derivatives of a regular expression} \cite{Brzozowski64}. We define
  this notion in Isabelle/HOL as follows:

  \begin{center}
  \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}}
  @{thm (lhs) der.simps(1)}  & @{text "\<equiv>"} & @{thm (rhs) der.simps(1)}\\
  @{thm (lhs) der.simps(2)}  & @{text "\<equiv>"} & @{thm (rhs) der.simps(2)}\\
  @{thm (lhs) ders.simps(2)}  & @{text "\<equiv>"} & @{thm (rhs) ders.simps(2)}\\
  \end{tabular}
  \end{center}

  \noindent
  The last two clauses extend derivatives from characters to strings. The
  boolean function @{term "nullable r"} needed in the @{const Times}-case tests
  whether a regular expression can recognise the empty string. It can be defined as
  follows.

  \begin{center}

  \end{center}
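
  \noindent
  For readers who want to experiment with these notions, the definitions above
  can be rendered as the following self-contained, executable sketch (a
  stand-alone version: in the actual theory the corresponding constants already
  exist and the constructor names might differ):

    (* a stand-alone sketch of regular expressions and Brzozowski derivatives *)
    datatype rexp = Zero | One | Atom char
                  | Plus rexp rexp | Times rexp rexp | Star rexp

    (* does a regular expression match the empty string? *)
    fun nullable :: "rexp \<Rightarrow> bool" where
      "nullable Zero = False"
    | "nullable One = True"
    | "nullable (Atom c) = False"
    | "nullable (Plus r1 r2) = (nullable r1 \<or> nullable r2)"
    | "nullable (Times r1 r2) = (nullable r1 \<and> nullable r2)"
    | "nullable (Star r) = True"

    (* derivative of a regular expression w.r.t. a single character *)
    fun der :: "char \<Rightarrow> rexp \<Rightarrow> rexp" where
      "der c Zero = Zero"
    | "der c One = Zero"
    | "der c (Atom d) = (if c = d then One else Zero)"
    | "der c (Plus r1 r2) = Plus (der c r1) (der c r2)"
    | "der c (Times r1 r2) =
         (if nullable r1 then Plus (Times (der c r1) r2) (der c r2)
          else Times (der c r1) r2)"
    | "der c (Star r) = Times (der c r) (Star r)"

    (* derivative w.r.t. a string, i.e. a list of characters *)
    fun ders :: "string \<Rightarrow> rexp \<Rightarrow> rexp" where
      "ders [] r = r"
    | "ders (c # s) r = ders s (der c r)"

  \noindent
  A string @{text s} is then matched by @{text r} if and only if
  @{text "nullable (ders s r)"} holds; for example
  @{text "nullable (ders ''ab'' (Times (Atom (CHR ''a'')) (Star (Atom (CHR ''b'')))))"}
  evaluates to @{text True}.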

  \noindent
  By induction on the regular expression @{text r}, respectively on the string @{text s},
  one can easily show that left-quotients and derivatives of regular expressions
  relate as follows (see for example~\cite{Sakarovitch09}):

  \begin{equation}\label{Dersders}
  \mbox{\begin{tabular}{c}
  @{thm Der_der}\\
  @{thm Ders_ders}

  corresponding language there are only finitely many derivatives---thus
  ensuring that there are only finitely many equivalence
  classes. Unfortunately, this is not true in general. Sakarovitch gives an
  example where a regular expression has infinitely many derivatives
  w.r.t.~the language \mbox{@{term "({a} \<cdot> {b})\<star> \<union> ({a} \<cdot> {b})\<star> \<cdot> {a}"}}
  (see \cite[Page~141]{Sakarovitch09}).

  What Brzozowski \cite{Brzozowski64} established is that for every language there
  \emph{are} only finitely many `dissimilar' derivatives of a regular
  expression. Two regular expressions are said to be \emph{similar} provided

  Partial derivatives can be seen as having the @{text "ACI"}-identities already built in:
  taking the partial derivatives of the
  regular expressions in \eqref{ACI} gives us in each case
  equal sets.  Antimirov \cite{Antimirov95} showed a similar result to
  \eqref{Dersders} for partial derivatives, namely

  \begin{equation}\label{Derspders}
  \mbox{\begin{tabular}{lc}
  @{text "(i)"}  & @{thm Der_pder}\\
  @{text "(ii)"} & @{thm Ders_pders}

  \noindent
  Antimirov's argument first shows this theorem for the language @{term UNIV1},
  which is the set of all non-empty strings. For this he proves:

  \begin{equation}\label{pdersunivone}
  \mbox{\begin{tabular}{l}
  @{thm pders_lang_Zero}\\
  @{thm pders_lang_One}\\
  @{thm pders_lang_Atom}\\
  @{thm pders_lang_Plus[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\

  \end{center}

  \noindent
  and for all languages @{text "A"}, @{thm pders_lang_subset[where B="UNIV",
  simplified]} holds.  Since we follow Antimirov's proof quite closely in our
  formalisation (only the last two cases of \eqref{pdersunivone} involve some
  non-routine induction argument), we omit the details.

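  \noindent
  Continuing the executable sketch from above (again the exact names are not
  necessarily the ones of the formalisation), Antimirov's partial derivatives
  and their extension to strings can be written as follows:

    (* partial derivative w.r.t. a character: a finite set of regular
       expressions instead of a single regular expression *)
    fun pder :: "char \<Rightarrow> rexp \<Rightarrow> rexp set" where
      "pder c Zero = {}"
    | "pder c One = {}"
    | "pder c (Atom d) = (if c = d then {One} else {})"
    | "pder c (Plus r1 r2) = pder c r1 \<union> pder c r2"
    | "pder c (Times r1 r2) =
         (\<lambda>r. Times r r2) ` pder c r1 \<union> (if nullable r1 then pder c r2 else {})"
    | "pder c (Star r) = (\<lambda>r'. Times r' (Star r)) ` pder c r"

    (* partial derivatives w.r.t. a string *)
    fun pders :: "string \<Rightarrow> rexp \<Rightarrow> rexp set" where
      "pders [] r = {r}"
    | "pders (c # s) r = (\<Union>r' \<in> pder c r. pders s r')"

  \noindent
  In contrast to @{text ders}, which can produce ever larger regular
  expressions, the partial derivatives of a regular expression w.r.t.~all
  strings of a language form, by Theorem~\ref{antimirov}, only a finite set.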

  Let us now return to our proof of the second direction of the Myhill-Nerode
  theorem. The point of the above calculations is to use
  @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"} as tagging-relation.

  \begin{proof}[Proof of Theorem~\ref{myhillnerodetwo} (second version)]

  \noindent
  Now the range of @{term "\<lambda>x. pders x r"} is a subset of @{term "Pow (pders_lang UNIV r)"},
  which we know is finite by Theorem~\ref{antimirov}. This means there
  are only finitely many equivalence classes of @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"},
  which refines @{term "\<approx>(lang r)"}, and consequently we can again conclude the
  second part of the Myhill-Nerode theorem.
  \end{proof}
*}

section {* Closure Properties of Regular Languages *}

  operations. Closure under union, concatenation and Kleene-star is trivial
  to establish given our definition of regularity (recall Definition~\ref{regular}).
  More interesting is the closure under complement, because it seems difficult
  to construct a regular expression for the complement language by direct
  means. However, the existence of such a regular expression can now be easily
  proved using both parts of the Myhill-Nerode theorem, since

  \begin{center}
  @{term "s\<^isub>1 \<approx>A s\<^isub>2"} if and only if @{term "s\<^isub>1 \<approx>(-A) s\<^isub>2"}
  \end{center}

  \noindent
  holds for any strings @{text "s\<^isub>1"} and @{text
  "s\<^isub>2"}. Therefore @{text A} and the complement language @{term "-A"}
  give rise to the same partitions. So if one is finite, the other is too, and
  vice versa. Proving the existence of such a regular expression via
  automata using the standard method would be quite involved. It includes the
  steps: regular expression @{text "\<Rightarrow>"} non-deterministic automaton @{text
  "\<Rightarrow>"} deterministic automaton @{text "\<Rightarrow>"} complement automaton @{text "\<Rightarrow>"}
  regular expression. Clearly not something you want to formalise in a theorem
  prover in which it is cumbersome to reason about automata.
  "Ders_lang B A"} is regular. To see this, consider the following argument
  using partial derivatives: From @{text A} being regular we know there exists
  a regular expression @{text r} such that @{term "A = lang r"}. We also know
  that @{term "pders_lang B r"} is finite for every language @{text B} and
  regular expression @{text r} (recall Theorem~\ref{antimirov}). By definition
  and \eqref{Derspders} therefore

  \begin{equation}\label{eqq}
  @{term "Ders_lang B (lang r) = (\<Union> lang ` (pders_lang B r))"}
  \end{equation}

  \noindent
  Since there are only finitely many regular expressions in @{term "pders_lang
  B r"}, we know by \eqref{uplus} that there exists a regular expression such that
  the right-hand side of \eqref{eqq} is equal to the language \mbox{@{term "lang (\<Uplus>(pders_lang B
  r))"}}. Thus the regular expression @{term "\<Uplus>(pders_lang B r)"} verifies that
  @{term "Ders_lang B A"} is regular.
*}

  2038   \noindent
  2041   \noindent
  2039   Having formalised this theorem means we pushed our point of view quite
  2042   Having formalised this theorem means we pushed our point of view quite
  2040   far. Using this theorem we can obviously prove when a language is \emph{not}
  2043   far. Using this theorem we can obviously prove when a language is \emph{not}
  2041   regular---by establishing that it has infinitely many equivalence classes
  2044   regular---by establishing that it has infinitely many equivalence classes
  2042   generated by the Myhill-Nerode relation (this is usually the purpose of the
  2045   generated by the Myhill-Nerode relation (this is usually the purpose of the
  2043   pumping lemma \cite{Kozen97}).  We can also use it to establish the standard
  2046   Pumping Lemma \cite{Kozen97}).  We can also use it to establish the standard
  2044   textbook results about closure properties of regular languages. Interesting
  2047   textbook results about closure properties of regular languages. Interesting
  2045   is the case of closure under complement, because it seems difficult to
  2048   is the case of closure under complement, because it seems difficult to
  2046   construct a regular expression for the complement language by direct
  2049   construct a regular expression for the complement language by direct
  2047   means. However the existence of such a regular expression can be easily
  2050   means. However the existence of such a regular expression can be easily
  2048   proved using the Myhill-Nerode theorem.  
  2051   proved using the Myhill-Nerode theorem.  
  2049 
  2052 
  2050   Our insistence on regular expressions for proving the Myhill-Nerode theorem
  2053   Our insistence on regular expressions for proving the Myhill-Nerode theorem
  2051   arose from the limitations of HOL on which the popular theorem provers HOL4,
  2054   arose from the limitations of HOL, on which the popular theorem provers HOL4,
  2052   HOLlight and Isabelle/HOL are based. In order to guarantee consistency,
  2055   HOLlight and Isabelle/HOL are based. In order to guarantee consistency,
  2053   formalisations can only extend HOL by definitions that introduce a notion in
  2056   formalisations can only extend HOL by definitions that introduce a new concept in
  2054   terms of already existing concepts. A convenient definition for automata
  2057   terms of already existing notions. A convenient definition for automata
  2055   (based on graphs) uses a polymorphic type for the state nodes. This allows
  2058   (based on graphs) uses a polymorphic type for the state nodes. This allows
  2056   us to use the standard operation of disjoint union in order to compose two
  2059   us to use the standard operation of disjoint union in order to compose two
  2057   automata. Unfortunately, we cannot use such a polymorphic definition of
  2060   automata. Unfortunately, we cannot use such a polymorphic definition
  2058   in HOL as part of the definition for regularity of a language (a
  2061   in HOL as part of the definition for regularity of a language (a
  2059   set of strings).  Consider the following attempt
  2062   set of strings).  Consider the following attempt
  2060 
  2063 
  2061   \begin{center}
  2064   \begin{center}
  2062   @{text "is_regular A \<equiv> \<exists>M(\<alpha>). is_finite_automata (M) \<and> \<calL>(M) = A"}
  2065   @{text "is_regular A \<equiv> \<exists>M(\<alpha>). is_finite_automata (M) \<and> \<calL>(M) = A"}
  2063   \end{center}
  2066   \end{center}
  2064 
  2067 
  2065   \noindent
  2068   \noindent
  2066   which means the definiens is polymorphic in the type of the automata @{text
  2069   which means the definiens is polymorphic in the type of the automata @{text
  2067   "M"}, but the definiendum @{text "is_regular"} is not. Such definitions are
  2070   "M"}, but the definiendum @{text "is_regular"} is not. Such definitions are
  2068   excluded in HOL, because they lead easily to inconsistencies (see
  2071   excluded in HOL, because they can lead easily to inconsistencies (see
  2069   \cite{PittsHOL4} for a simple example). Also HOL does not contain
  2072   \cite{PittsHOL4} for a simple example). Also HOL does not contain
  2070   type-quantifiers which would allow us to get rid of the polymorphism by
  2073   type-quantifiers which would allow us to get rid of the polymorphism by
  2071   quantifying over the type-variable @{text "\<alpha>"}. Therefore when defining
  2074   quantifying over the type-variable @{text "\<alpha>"}. Therefore when defining
  2072   regularity in terms of automata, the only natural way out in HOL is to use state
  2075   regularity in terms of automata, the only natural way out in HOL is to use
  2073   nodes with an identity, for example a natural number. Unfortunatly, the
  2076   state nodes with an identity, for example a natural number. Unfortunatly,
  2074   consequence is that we have to be careful when combining two automata so
  2077   the consequence is that we have to be careful when combining two automata so
  2075   that there is no clash between two such states. This makes formalisations
  2078   that there is no clash between two such states. This makes formalisations
  2076   quite fiddly and unpleasant. Regular expressions proved much more convenient
  2079   quite fiddly and rather unpleasant. Regular expressions proved much more
  2077   for reasoning in HOL and we showed they can be used for establishing the
  2080   convenient for reasoning in HOL and we showed they can be used for
  2078   Myhill-Nerode theorem.
  2081   establishing the central result in regular language theory: the Myhill-Nerode 
       
  2082   theorem.
  2079 
  2083 
  2080   While regular expressions are convenient, they have some limitations. One is
  2084   While regular expressions are convenient, they have some limitations. One is
  2081   that there seems to be no method of calculating a minimal regular expression
  2085   that there seems to be no method of calculating a minimal regular expression
  2082   (for example in terms of length) for a regular language, like there is for
  2086   (for example in terms of length) for a regular language, like there is for
  2083   automata. On the other hand, efficient regular expression matching, without
  2087   automata. On the other hand, efficient regular expression matching, without
  2084   using automata, poses no problem \cite{OwensReppyTuron09}.  For an
  2088   using automata, poses no problem \cite{OwensReppyTuron09}.  For an
  2085   implementation of a simple regular expression matcher, whose correctness has
  2089   implementation of a simple regular expression matcher, whose correctness has
  2086   been formally established, we refer the reader to Owens and Slind
  2090   been formally established, we refer the reader to Owens and Slind
  2087   \cite{OwensSlind08}.
  2091   \cite{OwensSlind08}.
  2088 
       
  2089   Our formalisation consists of 780 lines of Isabelle/Isar code for the first
       
  2090   direction and 460 for the second, plus around 300 lines of standard material
       
  2091   about regular languages. The formalisation about derivatives and partial
       
  2092   derivatives shown in Section~\ref{derivatives} consists of 390 lines of
       
  2093   code.  The algorithm for solving equational systems, which we used in the
       
  2094   first direction, is conceptually not that complicated. Still the use of sets
       
  2095   over which the algorithm operates, means it is not as easy to formalise as
       
  2096   one might wish. It seems sets cannot be avoided since the `input' of the
       
  2097   algorithm consists of equivalence classes and we cannot see how to
       
  2098   reformulate the theory so that we can use lists, which are usually easier to
       
  2099   reason about in a theorem prover.
       
  2100 
       
  2101   While our formalisation might seem large, it should be seen

  2102   in the context of the work done by Constable et al \cite{Constable00} who
       
  2103   formalised the Myhill-Nerode theorem in Nuprl using automata. They write
       
  2104   that their four-member team needed something on the order of 18 months
       
  2105   for their formalisation. The estimate for our formalisation is that we
       
  2106   needed approximately 3 months and this included the time to find our proof
       
  2107   arguments. Unlike Constable et al, who were able to follow the proofs from
       
  2108   \cite{HopcroftUllman69}, we had to find our own arguments.  So for us the
       
  2109   formalisation was not the bottleneck. It is hard to gauge the size of a
       
  2110   formalisation in Nuprl, but from what is shown in the Nuprl Math Library
       
  2111   about their development it seems substantially larger than ours. Our code

  2112   can be found in the Mercurial Repository at
       
  2113   \mbox{\url{http://www4.in.tum.de/~urbanc/regexp.html}}.
       
  2114 
  2092 
  2115   Our proof of the first direction is very much inspired by \emph{Brzozowski's
  2093   Our proof of the first direction is very much inspired by \emph{Brzozowski's
  2116   algebraic method} used to convert a finite automaton to a regular expression
  2094   algebraic method} used to convert a finite automaton to a regular expression
  2117   \cite{Brzozowski64}. The close connection can be seen by considering the
  2095   \cite{Brzozowski64}. The close connection can be seen by considering the
  2118   equivalence classes as the states of the minimal automaton for the regular
  2096   equivalence classes as the states of the minimal automaton for the regular
  2128   in the literature about this way of proving the first direction of the
  2106   in the literature about this way of proving the first direction of the
  2129   Myhill-Nerode theorem, but it appears to be folklore.
  2107   Myhill-Nerode theorem, but it appears to be folklore.
  2130 
  2108 
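  As a purely illustrative, textbook-style instance of the equational systems
  used in the first direction (this example is not taken from our
  formalisation), consider the language of strings over $a$ and $b$ containing
  an even number of $a$s, whose two Myhill-Nerode equivalence classes we write
  as $X_1$ (even) and $X_2$ (odd):

  \[
  \begin{array}{rcl}
  X_1 & = & X_1 \cdot \{b\} \,\cup\, X_2 \cdot \{a\} \,\cup\, \{[]\}\\
  X_2 & = & X_1 \cdot \{a\} \,\cup\, X_2 \cdot \{b\}
  \end{array}
  \]

  \noindent
  With Arden's lemma in the form that $X = X \cdot A \,\cup\, B$ and
  $[] \notin A$ imply $X = B \cdot A^\star$, the second equation gives
  $X_2 = X_1 \cdot \{a\} \cdot \{b\}^\star$; substituting this into the first
  equation and applying Arden's lemma once more yields the regular expression
  $(b + a\,b^\star a)^\star$ for $X_1$.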
  2131   We presented two proofs for the second direction of the Myhill-Nerode
  2109   We presented two proofs for the second direction of the Myhill-Nerode
  2132   theorem. One direct proof using tagging-functions and another using partial
  2110   theorem. One direct proof using tagging-functions and another using partial
  2133   derivatives. These proofs are where our method shines, because we can
  2111   derivatives. This part of our work is where our method using regular
  2134   completely side-step the standard argument \cite{Kozen97} where automata
  2112   expressions shines, because we can completely side-step the standard
  2135   need to be composed. However, it is also the direction where we had to spend
  2113   argument \cite{Kozen97} where automata need to be composed. However, it is
  2136   most of the `conceptual' time, as our first proof based on
  2114   also the direction where we had to spend most of the `conceptual' time, as
  2137   tagging-functions is new for establishing the Myhill-Nerode theorem. All
  2115   our first proof based on tagging-functions is new for establishing the
  2138   standard proofs of this direction proceed by arguments over automata.
  2116   Myhill-Nerode theorem. All standard proofs of this direction proceed by
  2139   
  2117   arguments over automata.
  2140   Our indirect proof for the second direction arose from the interest in
  2118   
       
  2119   The indirect proof for the second direction arose from our interest in
  2141   Brzozowski's derivatives for regular expression matching. A corresponding
  2120   Brzozowski's derivatives for regular expression matching. A corresponding
  2142   regular expression matcher has been formalised in HOL4 in
  2121   regular expression matcher has been formalised by Owens and Slind in HOL4
  2143   \cite{OwensSlind08}. In our opinion, this formalisation is considerably
  2122   \cite{OwensSlind08}. In our opinion, their formalisation is considerably
  2144   slicker than for example the approach to regular expression matching taken
  2123   slicker than for example the approach to regular expression matching taken
  2145   in \cite{Harper99} and \cite{Yi06}. While Brzozowski's derivatives lead to
  2124   in \cite{Harper99} and \cite{Yi06}. While Brzozowski's derivatives lead to a
  2146   simple regular expression matchers and he proved that there are only
  2125   simple regular expression matcher and he established that there are only
  2147   finitely many dissimilar derivatives for every regular expression, this
  2126   finitely many dissimilar derivatives for every regular expression, this
  2148   result is not as straightforward to formalise in a theorem prover. The
  2127   result is not as straightforward to formalise in a theorem prover. The
  2149   reason is that the set of dissimilar derivatives is not defined inductively,
  2128   reason is that the set of dissimilar derivatives is not defined inductively,
  2150   but in terms of an ACI-equivalence relation.
  2129   but in terms of an ACI-equivalence relation. This difficulty prevented, for
  2151 
  2130   example, Krauss and Nipkow from proving termination of their equivalence checker
  2152   
  2131   for regular expressions \cite{KraussNipkow11}. Their checker is based on
  2153 
  2132   derivatives and for their argument the lack of a formal proof of termination
  2154   \medskip
  2133   is not crucial (it merely lets them ``sleep better'' \cite{KraussNipkow11}).
  2155 
  2134   We expect that their development simplifies by using partial derivatives,
  2156   We expect that the development of Krauss \& Nipkow gets easier by
  2135   instead of derivatives, and that termination of the algorithm can be
  2157   using partial derivatives.\medskip
  2136   formally established. However, since partial derivatives use sets of regular
       
  2137   expressions, one needs to carefully analyse whether the resulting algorithm
       
  2138   is still executable. Given the existing infrastructure for executable sets
       
  2139   in Isabelle/HOL, it should.
       
  2140 
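  To make the discussion concrete, the following sketch gives the usual
  definitions of derivatives, the corresponding matcher and partial derivatives
  (the names are hypothetical and the code is only a sketch; it is neither the
  matcher of Owens and Slind nor a verbatim part of our theories):

    datatype rexp = ZERO | ONE | ATOM char
                  | PLUS rexp rexp | TIMES rexp rexp | STAR rexp

    (* does a regular expression match the empty string? *)
    fun nullable :: "rexp \<Rightarrow> bool" where
      "nullable ZERO = False"
    | "nullable ONE = True"
    | "nullable (ATOM c) = False"
    | "nullable (PLUS r1 r2) = (nullable r1 \<or> nullable r2)"
    | "nullable (TIMES r1 r2) = (nullable r1 \<and> nullable r2)"
    | "nullable (STAR r) = True"

    (* Brzozowski derivative of a regular expression w.r.t. a character *)
    fun der :: "char \<Rightarrow> rexp \<Rightarrow> rexp" where
      "der c ZERO = ZERO"
    | "der c ONE = ZERO"
    | "der c (ATOM d) = (if c = d then ONE else ZERO)"
    | "der c (PLUS r1 r2) = PLUS (der c r1) (der c r2)"
    | "der c (TIMES r1 r2) = (if nullable r1
         then PLUS (TIMES (der c r1) r2) (der c r2)
         else TIMES (der c r1) r2)"
    | "der c (STAR r) = TIMES (der c r) (STAR r)"

    (* the matcher repeatedly takes derivatives and finally tests nullability *)
    fun matcher :: "rexp \<Rightarrow> char list \<Rightarrow> bool" where
      "matcher r [] = nullable r"
    | "matcher r (c # s) = matcher (der c r) s"

    (* partial derivatives return a set of regular expressions *)
    fun pder :: "char \<Rightarrow> rexp \<Rightarrow> rexp set" where
      "pder c ZERO = {}"
    | "pder c ONE = {}"
    | "pder c (ATOM d) = (if c = d then {ONE} else {})"
    | "pder c (PLUS r1 r2) = pder c r1 \<union> pder c r2"
    | "pder c (TIMES r1 r2) = (\<lambda>r'. TIMES r' r2) ` pder c r1
         \<union> (if nullable r1 then pder c r2 else {})"
    | "pder c (STAR r) = (\<lambda>r'. TIMES r' (STAR r)) ` pder c r"

  The fact that pder returns a \emph{set} of regular expressions, rather than a
  single one, is precisely why the executability of a partial-derivative-based
  checker needs the set-infrastructure mentioned above.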
       
  2141   Our formalisation of the Myhill-Nerode theorem consists of 780 lines of
       
  2142   Isabelle/Isar code for the first direction and 460 for the second (the one
       
  2143   based on tagging functions), plus around 300 lines of standard material
       
  2144   about regular languages. The formalisation of derivatives and partial
       
  2145   derivatives shown in Section~\ref{derivatives} consists of 390 lines of
       
  2146   code.  The algorithm for solving equational systems, which we used in the
       
  2147   first direction, is conceptually relatively simple. Still, the use of sets
       
  2148   over which the algorithm operates means it is not as easy to formalise as
       
  2149   one might hope. However, it seems sets cannot be avoided since the `input'
       
  2150   of the algorithm consists of equivalence classes and we cannot see how to
       
  2151   reformulate the theory so that we can use lists. Lists would be much easier
       
  2152   to reason about, since we can define functions over them by recursion. For
       
  2153   sets we have to use set-comprehensions.
       
  2154 
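  The difference can be seen in miniature below (again only an illustration with
  hypothetical names): the list version is defined by structural recursion and
  immediately comes with equations for rewriting and an induction principle,
  whereas the set version has to be written as a set-comprehension:

    (* illustration only: doubling every element, once over lists by
       recursion, once over sets by a set-comprehension *)
    fun doubles :: "nat list \<Rightarrow> nat list" where
      "doubles [] = []"
    | "doubles (x # xs) = 2 * x # doubles xs"

    definition Doubles :: "nat set \<Rightarrow> nat set" where
      "Doubles A = {2 * x | x. x \<in> A}"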
       
  2155   While our formalisation might seem large, it should be seen

  2156   in the context of the work done by Constable et al \cite{Constable00} who
       
  2157   formalised the Myhill-Nerode theorem in Nuprl using automata. They write
       
  2158   that their four-member team needed something on the order of 18 months
       
  2159   for their formalisation. The estimate for our formalisation is that we
       
  2160   needed approximately 3 months and this included the time to find our proof
       
  2161   arguments. Unlike Constable et al, who were able to follow the proofs from
       
  2162   \cite{HopcroftUllman69}, we had to find our own arguments.  So for us the
       
  2163   formalisation was not the bottleneck. It is hard to gauge the size of a
       
  2164   formalisation in Nuprl, but from what is shown in the Nuprl Math Library
       
  2165   about their development it seems substantially larger than ours. Our code

  2166   can be found in the Mercurial Repository at
       
  2167   \mbox{\url{http://www4.in.tum.de/~urbanc/regexp.html}}.\medskip
  2158   
  2168   
  2159   \noindent
  2169   \noindent
  2160   {\bf Acknowledgements:}
  2170   {\bf Acknowledgements:}
  2161   We are grateful for the comments we received from Larry
  2171   We are grateful for the comments we received from Larry
  2162   Paulson.
  2172   Paulson.