Journal/Paper.thy
changeset 383 c8cb6914f4c8
parent 381 99161cd17c0f
child 384 60bcf13adb77
equal deleted inserted replaced
382:29915abff9c2 383:c8cb6914f4c8
   440   that matches all strings of @{text "A"}.
   440   that matches all strings of @{text "A"}.
   441   \end{definition}
   441   \end{definition}
   442   
   442   
   443   \noindent
   443   \noindent
   444   We then want to `forget' automata completely and see how far we come
   444   We then want to `forget' automata completely and see how far we come
   445   with formalising results from regular language theory.  The reason 
   445   with formalising results from regular language theory by only using regular 
       
   446   expressions.  The reason 
   446   is that regular expressions, unlike graphs, matrices and
   447   is that regular expressions, unlike graphs, matrices and
   447   functions, can be defined as an inductive datatype and a reasoning
   448   functions, can be defined as an inductive datatype and a reasoning
   448   infrastructure for them (like induction and recursion) comes for
   449   infrastructure for them (like induction and recursion) comes for
   449   free in HOL. But this question whether formal language theory can be
   450   free in HOL. 
   450   done without automata crops up also in non-theorem prover
   451 
   451   contexts. For example, Gasarch asked in the Computational Complexity
   452   While our choice of using regular expressions is motivated by
   452   blog by \citeN{GasarchBlog} whether the complementation of
   453   shortcomings in theorem provers, the question whether formal
   453   regular-expression languages can be proved without using
   454   language theory can be done without automata crops up also in
   454   automata. He concluded
   455   non-theorem prover contexts. For example, Gasarch asked in the
       
   456   Computational Complexity blog by \citeN{GasarchBlog} whether the
       
   457   complementation of regular-expression languages can be proved
       
   458   without using automata. He concluded
   455 
   459 
   456   \begin{quote}
   460   \begin{quote}
   457   ``{\it \ldots it can't be done}''
   461   ``{\it \ldots it can't be done}''
   458   \end{quote} 
   462   \end{quote} 
   459 
   463 
   464   ``{\it \ldots [b]ut is there a rigorous way to even state that?}'' 
   468   ``{\it \ldots [b]ut is there a rigorous way to even state that?}'' 
   465   \end{quote} 
   469   \end{quote} 
   466 
   470 
   467   %Moreover, no side-conditions will be needed for regular expressions,
   471   %Moreover, no side-conditions will be needed for regular expressions,
   468   %like we need for automata. 
   472   %like we need for automata. 
   469   \noindent
   473  
   470   We will give an answer to these questions in this paper.
       
   471 
   474 
   472   The convenience of regular expressions has
   475   The convenience of regular expressions has
   473   recently been exploited in HOL4 with a formalisation of regular expression
   476   recently been exploited in HOL4 with a formalisation of regular expression
   474   matching based on derivatives by \citeN{OwensSlind08}, and with an equivalence
   477   matching based on derivatives by \citeN{OwensSlind08}, and with an equivalence
   475   checker for regular expressions in Isabelle/HOL by \citeN{KraussNipkow11}
   478   checker for regular expressions in Isabelle/HOL by \citeN{KraussNipkow11}
  1075   \begin{proof}
  1078   \begin{proof}
  1076   Finiteness is given by the assumption and the way how we set up the 
  1079   Finiteness is given by the assumption and the way how we set up the 
  1077   initial equational system. Soundness is proved in Lemma~\ref{inv}. Distinctness
  1080   initial equational system. Soundness is proved in Lemma~\ref{inv}. Distinctness
  1078   follows from the fact that the equivalence classes are disjoint. The @{text ardenable}
  1081   follows from the fact that the equivalence classes are disjoint. The @{text ardenable}
  1079   property also follows from the setup of the initial equational system, as does 
  1082   property also follows from the setup of the initial equational system, as does 
  1080   validity.
  1083   validity.\qed
  1081   \end{proof}
  1084   \end{proof}
  1082 
  1085 
  1083   \noindent
  1086   \noindent
  1084   Next we show that @{text Iter} preserves the invariant.
  1087   Next we show that @{text Iter} preserves the invariant.
  1085 
  1088 
  1112   given because @{const Arden} removes an equivalence class from @{text yrhs}
  1115   given because @{const Arden} removes an equivalence class from @{text yrhs}
  1113   and then @{const Subst_all} removes @{text Y} from the equational system.
  1116   and then @{const Subst_all} removes @{text Y} from the equational system.
  1114   Having proved the implication above, we can instantiate @{text "ES"} with @{text "ES - {(Y, yrhs)}"}
  1117   Having proved the implication above, we can instantiate @{text "ES"} with @{text "ES - {(Y, yrhs)}"}
  1115   which matches with our proof-obligation of @{const "Subst_all"}. Since
  1118   which matches with our proof-obligation of @{const "Subst_all"}. Since
  1116   \mbox{@{term "ES = ES - {(Y, yrhs)} \<union> {(Y, yrhs)}"}}, we can use the assumption 
  1119   \mbox{@{term "ES = ES - {(Y, yrhs)} \<union> {(Y, yrhs)}"}}, we can use the assumption 
  1117   to complete the proof.
  1120   to complete the proof.\qed
  1118   \end{proof}
  1121   \end{proof}
  1119 
  1122 
  1120   \noindent
  1123   \noindent
  1121   We also need the fact that @{text Iter} decreases the termination measure.
  1124   We also need the fact that @{text Iter} decreases the termination measure.
  1122 
  1125 
  1131   By assumption we know that @{text "ES"} is finite and has more than one element.
  1134   By assumption we know that @{text "ES"} is finite and has more than one element.
  1132   Therefore there must be an element @{term "(Y, yrhs) \<in> ES"} with 
  1135   Therefore there must be an element @{term "(Y, yrhs) \<in> ES"} with 
  1133   @{term "(Y, yrhs) \<noteq> (X, rhs)"}. Using the distinctness property we can infer
  1136   @{term "(Y, yrhs) \<noteq> (X, rhs)"}. Using the distinctness property we can infer
  1134   that @{term "Y \<noteq> X"}. We further know that @{text "Remove ES Y yrhs"}
  1137   that @{term "Y \<noteq> X"}. We further know that @{text "Remove ES Y yrhs"}
  1135   removes the equation @{text "Y = yrhs"} from the system, and therefore 
  1138   removes the equation @{text "Y = yrhs"} from the system, and therefore 
  1136   the cardinality of @{const Iter} strictly decreases.
  1139   the cardinality of @{const Iter} strictly decreases.\qed
  1137   \end{proof}
  1140   \end{proof}
  1138 
  1141 
  1139   \noindent
  1142   \noindent
  1140   This brings us to our property we want to establish for @{text Solve}.
  1143   This brings us to our property we want to establish for @{text Solve}.
  1141 
  1144 
  1163   does not hold. By the stronger invariant we know there exists such a @{text "rhs"}
  1166   does not hold. By the stronger invariant we know there exists such a @{text "rhs"}
  1164   with @{term "(X, rhs) \<in> ES"}. Because @{text Cond} is not true, we know the cardinality
  1167   with @{term "(X, rhs) \<in> ES"}. Because @{text Cond} is not true, we know the cardinality
  1165   of @{text ES} is @{text 1}. This means @{text "ES"} must actually be the set @{text "{(X, rhs)}"},
  1168   of @{text ES} is @{text 1}. This means @{text "ES"} must actually be the set @{text "{(X, rhs)}"},
  1166   for which the invariant holds. This allows us to conclude that 
  1169   for which the invariant holds. This allows us to conclude that 
  1167   @{term "Solve X (Init (UNIV // \<approx>A)) = {(X, rhs)}"} and @{term "invariant {(X, rhs)}"} hold,
  1170   @{term "Solve X (Init (UNIV // \<approx>A)) = {(X, rhs)}"} and @{term "invariant {(X, rhs)}"} hold,
  1168   as needed.
  1171   as needed.\qed
  1169   \end{proof}
  1172   \end{proof}
  1170 
  1173 
  1171   \noindent
  1174   \noindent
  1172   With this lemma in place we can show that for every equivalence class in @{term "UNIV // \<approx>A"}
  1175   With this lemma in place we can show that for every equivalence class in @{term "UNIV // \<approx>A"}
  1173   there exists a regular expression.
  1176   there exists a regular expression.
  1185   invariant and Lemma~\ref{ardenable}. Using the validity property for the equation @{text "X = rhs"},
  1188   invariant and Lemma~\ref{ardenable}. Using the validity property for the equation @{text "X = rhs"},
  1186   we can infer that @{term "rhss rhs \<subseteq> {X}"} and because the @{text Arden} operation
  1189   we can infer that @{term "rhss rhs \<subseteq> {X}"} and because the @{text Arden} operation
  1187   removes that @{text X} from @{text rhs}, that @{term "rhss (Arden X rhs) = {}"}.
  1190   removes that @{text X} from @{text rhs}, that @{term "rhss (Arden X rhs) = {}"}.
  1188   This means the right-hand side @{term "Arden X rhs"} can only consist of terms of the form @{term "Lam r"}.
  1191   This means the right-hand side @{term "Arden X rhs"} can only consist of terms of the form @{term "Lam r"}.
  1189   So we can collect those (finitely many) regular expressions @{text rs} and have @{term "X = lang (\<Uplus>rs)"}.
  1192   So we can collect those (finitely many) regular expressions @{text rs} and have @{term "X = lang (\<Uplus>rs)"}.
  1190   With this we can conclude the proof.
  1193   With this we can conclude the proof.\qed
  1191   \end{proof}
  1194   \end{proof}
  1192 
  1195 
  1193   \noindent
  1196   \noindent
  1194   Lemma~\ref{every_eqcl_has_reg} allows us to finally give a proof for the first direction
  1197   Lemma~\ref{every_eqcl_has_reg} allows us to finally give a proof for the first direction
  1195   of the Myhill-Nerode Theorem.
  1198   of the Myhill-Nerode Theorem.
  1209   in @{term "finals A"} there exists a regular expression. Moreover by assumption 
  1212   in @{term "finals A"} there exists a regular expression. Moreover by assumption 
  1210   we know that @{term "finals A"} must be finite, and therefore there must be a finite
  1213   we know that @{term "finals A"} must be finite, and therefore there must be a finite
  1211   set of regular expressions @{text "rs"} such that
  1214   set of regular expressions @{text "rs"} such that
  1212   @{term "\<Union>(finals A) = lang (\<Uplus>rs)"}.
  1215   @{term "\<Union>(finals A) = lang (\<Uplus>rs)"}.
  1213   Since the left-hand side is equal to @{text A}, we can use @{term "\<Uplus>rs"} 
  1216   Since the left-hand side is equal to @{text A}, we can use @{term "\<Uplus>rs"} 
  1214   as the regular expression that is needed in the theorem.
  1217   as the regular expression that is needed in the theorem.\qed
  1215   \end{proof}
  1218   \end{proof}
  1216 
  1219 
  1217   \noindent
  1220   \noindent
  1218   Solving the equational system of our running example gives as solution for the 
  1221   Solving the equational system of our running example gives as solution for the 
  1219   two final equivalence classes:
  1222   two final equivalence classes:
  1265   @{thm quot_atom_subset}
  1268   @{thm quot_atom_subset}
  1266   \end{tabular}
  1269   \end{tabular}
  1267   \end{center}
  1270   \end{center}
  1268 
  1271 
  1269   \noindent
  1272   \noindent
  1270   hold, which shows that @{term "UNIV // \<approx>(lang r)"} must be finite.
  1273   hold, which shows that @{term "UNIV // \<approx>(lang r)"} must be finite.\qed
  1271   \end{proof}
  1274   \end{proof}
  1272 
  1275 
  1273   \noindent
  1276   \noindent
  1274   Much more interesting, however, are the inductive cases. They seem hard to be solved 
  1277   Much more interesting, however, are the inductive cases. They seem hard to be solved 
  1275   directly. The reader is invited to try.\footnote{The induction hypothesis is not strong enough 
  1278   directly. The reader is invited to try.\footnote{The induction hypothesis is not strong enough 
  1319   provided @{text "R\<^isub>1 \<subseteq> R\<^isub>2"}. 
  1322   provided @{text "R\<^isub>1 \<subseteq> R\<^isub>2"}. 
  1320   \end{definition}
  1323   \end{definition}
  1321 
  1324 
  1322   \noindent
  1325   \noindent
  1323   For constructing @{text R}, we will rely on some \emph{tagging-functions}
  1326   For constructing @{text R}, we will rely on some \emph{tagging-functions}
  1324   defined over strings, see Fig.~\ref{tagfig}. They are parameterised by sets
  1327   defined over strings, see Fig.~\ref{tagfig}. These functions are parameterised by sets
  1325   of strings @{text A} and @{text B} standing for the induction hypotheses.
  1328   of strings @{text A} and @{text B} standing for arbitrary languages and will
       
  1329   by instantiated with the induction hypotheses.
  1326   Given the inductive hypotheses, it will be easy to
  1330   Given the inductive hypotheses, it will be easy to
  1327   prove that the \emph{range} of these tagging-functions is finite. The range
  1331   prove that the \emph{range} of these tagging-functions is finite. The range
  1328   of a function @{text f} is defined as
  1332   of a function @{text f} is defined as
  1329 
  1333 
  1330  
  1334  
  1342   as follows.
  1346   as follows.
  1343 
  1347 
  1344 
  1348 
  1345   \begin{definition}[Tagging-Relation] Given a tagging-function @{text tag}, then two strings @{text x}
  1349   \begin{definition}[Tagging-Relation] Given a tagging-function @{text tag}, then two strings @{text x}
  1346   and @{text y} are \emph{tag-related} provided
  1350   and @{text y} are \emph{tag-related} provided
  1347   \begin{center}
  1351   \[
  1348   @{text "x \<^raw:$\threesim$>\<^bsub>tag\<^esub> y \<equiv> tag x = tag y"}\;.
  1352   @{text "x \<^raw:$\threesim$>\<^bsub>tag\<^esub> y \<equiv> tag x = tag y"}\;.
  1349   \end{center}
  1353   \]
  1350   \end{definition}
  1354   \end{definition}
  1351 
  1355 
  1352 
  1356 
  1353   In order to establish finiteness of a set @{text A}, we shall use the following powerful
  1357   In order to establish finiteness of a set @{text A}, we shall use the following powerful
  1354   principle from Isabelle/HOL's library.
  1358   principle from Isabelle/HOL's library.
  1392   assumptions that @{text "X, Y \<in> "}~@{term "UNIV // =tag="} and @{text "f X = f Y"}.
  1396   assumptions that @{text "X, Y \<in> "}~@{term "UNIV // =tag="} and @{text "f X = f Y"}.
  1393   From the assumptions we obtain \mbox{@{text "x \<in> X"}} and @{text "y \<in> Y"} with 
  1397   From the assumptions we obtain \mbox{@{text "x \<in> X"}} and @{text "y \<in> Y"} with 
  1394   @{text "tag x = tag y"}. Since @{text x} and @{text y} are tag-related, this in 
  1398   @{text "tag x = tag y"}. Since @{text x} and @{text y} are tag-related, this in 
  1395   turn means that the equivalence classes @{text X}
  1399   turn means that the equivalence classes @{text X}
  1396   and @{text Y} must be equal. Therefore \eqref{finiteimageD} allows us to conclude
  1400   and @{text Y} must be equal. Therefore \eqref{finiteimageD} allows us to conclude
  1397   with @{thm (concl) finite_eq_tag_rel}.
  1401   with @{thm (concl) finite_eq_tag_rel}.\qed
  1398   \end{proof}
  1402   \end{proof}
  1399 
  1403 
  1400   \begin{lemma}\label{fintwo} 
  1404   \begin{lemma}\label{fintwo} 
  1401   Given two equivalence relations @{text "R\<^isub>1"} and @{text "R\<^isub>2"}, whereby
  1405   Given two equivalence relations @{text "R\<^isub>1"} and @{text "R\<^isub>2"}, whereby
  1402   @{text "R\<^isub>1"} refines @{text "R\<^isub>2"}.
  1406   @{text "R\<^isub>1"} refines @{text "R\<^isub>2"}.
  1416   We know there exists a @{text "x \<in> X"} with \mbox{@{term "X = R\<^isub>2 `` {x}"}}. 
  1420   We know there exists a @{text "x \<in> X"} with \mbox{@{term "X = R\<^isub>2 `` {x}"}}. 
  1417   From the latter fact we can infer that @{term "R\<^isub>1 ``{x} \<in> f X"}
  1421   From the latter fact we can infer that @{term "R\<^isub>1 ``{x} \<in> f X"}
  1418   and further @{term "R\<^isub>1 ``{x} \<in> f Y"}. This means we can obtain a @{text y}
  1422   and further @{term "R\<^isub>1 ``{x} \<in> f Y"}. This means we can obtain a @{text y}
  1419   such that @{term "R\<^isub>1 `` {x} = R\<^isub>1 `` {y}"} holds. Consequently @{text x} and @{text y}
  1423   such that @{term "R\<^isub>1 `` {x} = R\<^isub>1 `` {y}"} holds. Consequently @{text x} and @{text y}
  1420   are @{text "R\<^isub>1"}-related. Since by assumption @{text "R\<^isub>1"} refines @{text "R\<^isub>2"},
  1424   are @{text "R\<^isub>1"}-related. Since by assumption @{text "R\<^isub>1"} refines @{text "R\<^isub>2"},
  1421   they must also be @{text "R\<^isub>2"}-related, as we need to show.
  1425   they must also be @{text "R\<^isub>2"}-related, as we need to show.\qed
  1422   \end{proof}
  1426   \end{proof}
  1423 
  1427 
  1424   \noindent
  1428   \noindent
  1425   Chaining Lemma~\ref{finone} and \ref{fintwo} together, means in order to show
  1429   Chaining Lemma~\ref{finone} and \ref{fintwo} together, means in order to show
  1426   that @{term "UNIV // \<approx>(lang r)"} is finite, we have to construct a tagging-function whose
  1430   that @{term "UNIV // \<approx>(lang r)"} is finite, we have to construct a tagging-function whose
  1427   range can be shown to be finite and whose tagging-relation refines @{term "\<approx>(lang r)"}.
  1431   range can be shown to be finite and whose tagging-relation refines @{term "\<approx>(lang r)"}.
  1428   Let us attempt the @{const PLUS}-case first. We take as tagging-function from Fig.~\ref{tagfig}
  1432   Let us attempt the @{const PLUS}-case first. We take from Fig.~\ref{tagfig}
  1429  
  1433  
  1430   \begin{center}
  1434   \begin{center}
  1431   @{thm tag_Plus_def[where A="A" and B="B", THEN meta_eq_app]}
  1435   @{thm tag_Plus_def[where A="A" and B="B", THEN meta_eq_app]}
  1432   \end{center}
  1436   \end{center}
  1433 
  1437 
  1444   holds. The range of @{term "tag_Plus A B"} is a subset of this product
  1448   holds. The range of @{term "tag_Plus A B"} is a subset of this product
  1445   set---so finite. For the refinement proof-obligation, we know that @{term
  1449   set---so finite. For the refinement proof-obligation, we know that @{term
  1446   "(\<approx>A `` {x}, \<approx>B `` {x}) = (\<approx>A `` {y}, \<approx>B `` {y})"} holds by assumption. Then
  1450   "(\<approx>A `` {x}, \<approx>B `` {x}) = (\<approx>A `` {y}, \<approx>B `` {y})"} holds by assumption. Then
  1447   clearly either @{term "x \<approx>A y"} or @{term "x \<approx>B y"}, as we needed to
  1451   clearly either @{term "x \<approx>A y"} or @{term "x \<approx>B y"}, as we needed to
  1448   show. Finally we can discharge this case by setting @{text A} to @{term
  1452   show. Finally we can discharge this case by setting @{text A} to @{term
  1449   "lang r\<^isub>1"} and @{text B} to @{term "lang r\<^isub>2"}.
  1453   "lang r\<^isub>1"} and @{text B} to @{term "lang r\<^isub>2"}.\qed
  1450   \end{proof}
  1454   \end{proof}
  1451 
  1455 
  1452   \noindent
  1456   \noindent
  1453   The @{const TIMES}-case is slightly more complicated. We first prove the
  1457   The @{const TIMES}-case is slightly more complicated. We first prove the
  1454   following lemma, which will aid the proof about refinement.
  1458   following lemma, which will aid the proof about refinement.
  1551   of the string @{text x}. We have to know that both @{text "x\<^isub>p"} and the
  1555   of the string @{text x}. We have to know that both @{text "x\<^isub>p"} and the
  1552   corresponding partition @{text "y\<^isub>p"} are in @{text "A"}, and that @{text "x\<^isub>s"} is `@{text B}-related' 
  1556   corresponding partition @{text "y\<^isub>p"} are in @{text "A"}, and that @{text "x\<^isub>s"} is `@{text B}-related' 
  1553   to @{text "y\<^isub>s"} @{text "("}@{text "**"}@{text ")"}. From the latter fact we can infer that @{text "y\<^isub>s @ z \<in> B"}.
  1557   to @{text "y\<^isub>s"} @{text "("}@{text "**"}@{text ")"}. From the latter fact we can infer that @{text "y\<^isub>s @ z \<in> B"}.
  1554   This will solve the second case.
  1558   This will solve the second case.
  1555   Taking the two requirements, @{text "("}@{text "*"}@{text ")"} and @{text "(**)"}, together we define the
  1559   Taking the two requirements, @{text "("}@{text "*"}@{text ")"} and @{text "(**)"}, together we define the
  1556   tagging-function in the @{const Times}-case as:
  1560   tagging-function in the @{const Times}-case as (see Fig.~\ref{tagfig}):
  1557 
  1561 
  1558   \begin{center}
  1562   \begin{center}
  1559   @{thm (lhs) tag_Times_def[where ?A="A" and ?B="B"]}\;@{text "x"}~@{text "\<equiv>"}~
  1563   @{thm (lhs) tag_Times_def[where ?A="A" and ?B="B"]}\;@{text "x"}~@{text "\<equiv>"}~
  1560   @{text "(\<lbrakk>x\<rbrakk>\<^bsub>\<approx>A\<^esub>, {\<lbrakk>x\<^isub>s\<rbrakk>\<^bsub>\<approx>B\<^esub> | x\<^isub>p \<in> A \<and> (x\<^isub>p, x\<^isub>s) \<in> Partitions x})"}
  1564   @{text "(\<lbrakk>x\<rbrakk>\<^bsub>\<approx>A\<^esub>, {\<lbrakk>x\<^isub>s\<rbrakk>\<^bsub>\<approx>B\<^esub> | x\<^isub>p \<in> A \<and> (x\<^isub>p, x\<^isub>s) \<in> Partitions x})"}
  1561   \end{center}
  1565   \end{center}
  1606   such that @{term "y\<^isub>p \<in> A"} and @{term "\<approx>B `` {x\<^isub>s} = \<approx>B ``
  1610   such that @{term "y\<^isub>p \<in> A"} and @{term "\<approx>B `` {x\<^isub>s} = \<approx>B ``
  1607   {y\<^isub>s}"}. Unfolding the Myhill-Nerode Relation and together with the
  1611   {y\<^isub>s}"}. Unfolding the Myhill-Nerode Relation and together with the
  1608   facts that @{text "x\<^isub>p \<in> A"} and \mbox{@{text "x\<^isub>s @ z \<in> B"}}, we
  1612   facts that @{text "x\<^isub>p \<in> A"} and \mbox{@{text "x\<^isub>s @ z \<in> B"}}, we
  1609   obtain @{term "y\<^isub>p \<in> A"} and @{text "y\<^isub>s @ z \<in> B"}, as needed in
  1613   obtain @{term "y\<^isub>p \<in> A"} and @{text "y\<^isub>s @ z \<in> B"}, as needed in
  1610   this case.  We again can complete the @{const TIMES}-case by setting @{text
  1614   this case.  We again can complete the @{const TIMES}-case by setting @{text
  1611   A} to @{term "lang r\<^isub>1"} and @{text B} to @{term "lang r\<^isub>2"}.
  1615   A} to @{term "lang r\<^isub>1"} and @{text B} to @{term "lang r\<^isub>2"}.\qed
  1612   \end{proof}
  1616   \end{proof}
  1613 
  1617 
  1614   \noindent
  1618   \noindent
  1615   The case for @{const Star} is similar to @{const TIMES}, but poses a few
  1619   The case for @{const Star} is similar to @{const TIMES}, but poses a few
  1616   extra challenges.  To deal with them, we define first the notion of a \emph{string
  1620   extra challenges.  To deal with them, we define first the notion of a \emph{string
  1721   "y\<^isub>s"} with @{term "y\<^isub>p \<in> A\<star>"} and also @{term "x\<^isub>s \<approx>A
  1725   "y\<^isub>s"} with @{term "y\<^isub>p \<in> A\<star>"} and also @{term "x\<^isub>s \<approx>A
  1722   y\<^isub>s"}. Unfolding the Myhill-Nerode Relation we know @{term
  1726   y\<^isub>s"}. Unfolding the Myhill-Nerode Relation we know @{term
  1723   "y\<^isub>s @ z\<^isub>a \<in> A"}. We also know that @{term "z\<^isub>b \<in> A\<star>"}.
  1727   "y\<^isub>s @ z\<^isub>a \<in> A"}. We also know that @{term "z\<^isub>b \<in> A\<star>"}.
  1724   Therefore @{term "y\<^isub>p @ (y\<^isub>s @ z\<^isub>a) @ z\<^isub>b \<in>
  1728   Therefore @{term "y\<^isub>p @ (y\<^isub>s @ z\<^isub>a) @ z\<^isub>b \<in>
  1725   A\<star>"}, which means @{term "y @ z \<in> A\<star>"}. The last step is to set
  1729   A\<star>"}, which means @{term "y @ z \<in> A\<star>"}. The last step is to set
  1726   @{text "A"} to @{term "lang r"} and thus complete the proof.
  1730   @{text "A"} to @{term "lang r"} and thus complete the proof.\qed
  1727   \end{proof}
  1731   \end{proof}
  1728 
  1732 
  1729   \begin{rmk}
  1733   \begin{rmk}
  1730   While our proof using tagging functions might seem like a rabbit pulled 
  1734   While our proof using tagging functions might seem like a rabbit pulled 
  1731   out of a hat, the intuition behind can be rationalised taking the 
  1735   out of a hat, the intuition behind can be somewhat rationalised taking the 
  1732   view that the equivalence classes @{term "UNIV // \<approx>(lang r)"} stand for the 
  1736   view that the equivalence classes @{term "UNIV // \<approx>(lang r)"} stand for the 
  1733   states of the minimal automaton for the language @{term "lang r"}. The theorem 
  1737   states of the minimal automaton for the language @{term "lang r"}. The theorem 
  1734   amounts to showing that this minimal automaton has finitely many states. 
  1738   amounts to showing that this minimal automaton has finitely many states. 
  1735   However, by showing that our @{text "\<^raw:$\threesim$>\<^bsub>tag\<^esub>"} relation 
  1739   However, by showing that our @{text "\<^raw:$\threesim$>\<^bsub>tag\<^esub>"} relation 
  1736   refines @{term "\<approx>A"} we do not actually have to show that the minimal automata
  1740   refines @{term "\<approx>A"} we do not actually have to show that the minimal automata
  1769   @{text "\<times>tag"} function (see second clause in Fig.~\ref{tagfig}).
  1773   @{text "\<times>tag"} function (see second clause in Fig.~\ref{tagfig}).
  1770   A state of this sequentially composed automaton is accepting, if the first 
  1774   A state of this sequentially composed automaton is accepting, if the first 
  1771   component is accepting and at least one state in the set is also accepting.
  1775   component is accepting and at least one state in the set is also accepting.
  1772 
  1776 
  1773   The idea behind the @{text "STAR"}-case is similar to the @{text "TIMES"}-case.
  1777   The idea behind the @{text "STAR"}-case is similar to the @{text "TIMES"}-case.
  1774   We assume some automaton has consumed some strictly smaller part of the input;
  1778   We assume some automaton has consumed some strictly smaller part of the input in @{text "A\<^isup>\<star>"};
  1775   we need to check that from the state we ended up in a terminal state in the 
  1779   we need to check that from the state we ended up in a terminal state in the 
  1776   automaton @{text "\<lbrakk>_\<rbrakk>\<^bsub>\<approx>A\<^esub>"}. Since we do not know from which state this will 
  1780   automaton @{text "\<lbrakk>_\<rbrakk>\<^bsub>\<approx>A\<^esub>"} can be reached. Since we do not know from which state this will 
  1777   succeed, we need to run the automaton from all possible states we could have 
  1781   succeed, we need to run the automaton from all possible states we could have 
  1778   ended up in. Therefore the @{text "\<star>tag"} function generates a set of states.
  1782   ended up in. Therefore the @{text "\<star>tag"} function generates again a set of states.
       
  1783 
       
  1784   However, note that while the automata view sheds some light behind the idea of the 
       
  1785   tagging functions, our proof only works because we can perform a structural 
       
  1786   induction on the regular expression @{text r}.
  1779   \end{rmk}
  1787   \end{rmk}
  1780 *}
  1788 *}
  1781 
  1789 
  1782 section {* Second Part proved using Partial Derivatives *}
  1790 section {* Second Part proved using Partial Derivatives *}
  1783 
  1791 
  2042   \end{center}
  2050   \end{center}
  2043   
  2051   
  2044   \noindent
  2052   \noindent
  2045   Note that in order to apply the induction hypothesis in the fourth equation, we
  2053   Note that in order to apply the induction hypothesis in the fourth equation, we
  2046   need the generalisation over all regular expressions @{text r}. The case for
  2054   need the generalisation over all regular expressions @{text r}. The case for
  2047   the empty string is routine and omitted.
  2055   the empty string is routine and omitted.\qed
  2048   \end{proof}
  2056   \end{proof}
  2049 
  2057 
  2050   \noindent
  2058   \noindent
  2051   Taking \eqref{Dersders} and \eqref{Derspders} together gives the relationship 
  2059   Taking \eqref{Dersders} and \eqref{Derspders} together gives the relationship 
  2052   between languages of derivatives and partial derivatives
  2060   between languages of derivatives and partial derivatives
  2140   \noindent
  2148   \noindent
  2141   Now the range of @{term "\<lambda>x. pderivs x r"} is a subset of @{term "Pow (pderivs_lang UNIV r)"},
  2149   Now the range of @{term "\<lambda>x. pderivs x r"} is a subset of @{term "Pow (pderivs_lang UNIV r)"},
  2142   which we know is finite by Theorem~\ref{antimirov}. Consequently there 
  2150   which we know is finite by Theorem~\ref{antimirov}. Consequently there 
  2143   are only finitely many equivalence classes of @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. pders x r)\<^esub>"}.
  2151   are only finitely many equivalence classes of @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. pders x r)\<^esub>"}.
  2144   This relation refines @{term "\<approx>(lang r)"}, and therefore we can again conclude the 
  2152   This relation refines @{term "\<approx>(lang r)"}, and therefore we can again conclude the 
  2145   second part of the Myhill-Nerode Theorem.
  2153   second part of the Myhill-Nerode Theorem.\qed
  2146   \end{proof}
  2154   \end{proof}
  2147 *}
  2155 *}
  2148 
  2156 
  2149 section {* Closure Properties of Regular Languages *}
  2157 section {* Closure Properties of Regular Languages *}
  2150 
  2158 
  2344   \end{tabular}
  2352   \end{tabular}
  2345   \end{center}
  2353   \end{center}
  2346 
  2354 
  2347   \noindent
  2355   \noindent
  2348   and use \eqref{supseqprops} to establish that @{thm lang_UP} holds. This shows
  2356   and use \eqref{supseqprops} to establish that @{thm lang_UP} holds. This shows
  2349   that @{term "SUPSEQ A"} is regular, provided @{text A} is.
  2357   that @{term "SUPSEQ A"} is regular, provided @{text A} is.\qed
  2350   \end{proof}
  2358   \end{proof}
  2351 
  2359 
  2352   \noindent
  2360   \noindent
  2353   Now we can prove the main lemma w.r.t.~@{const "SUPSEQ"}, namely
  2361   Now we can prove the main lemma w.r.t.~@{const "SUPSEQ"}, namely
  2354 
  2362 
  2379   and hence by transitivity also \mbox{@{term "z \<preceq> x"}} (here we deviate from the argument 
  2387   and hence by transitivity also \mbox{@{term "z \<preceq> x"}} (here we deviate from the argument 
  2380   given by \citeN{Shallit08}, because Isabelle/HOL provides already an extensive infrastructure
  2388   given by \citeN{Shallit08}, because Isabelle/HOL provides already an extensive infrastructure
  2381   for reasoning about well-foundedness). Since @{term "z"} is
  2389   for reasoning about well-foundedness). Since @{term "z"} is
  2382   minimal and an element in @{term A}, we also know that @{term z} is in @{term M}.
  2390   minimal and an element in @{term A}, we also know that @{term z} is in @{term M}.
  2383   From this together with \mbox{@{term "z \<preceq> x"}}, we can infer that @{term x} is in 
  2391   From this together with \mbox{@{term "z \<preceq> x"}}, we can infer that @{term x} is in 
  2384   @{term "SUPSEQ M"}, as required.
  2392   @{term "SUPSEQ M"}, as required.\qed
  2385   \end{proof}
  2393   \end{proof}
  2386 
  2394 
  2387   \noindent
  2395   \noindent
  2388   This lemma allows us to establish the second part of Theorem~\ref{subseqreg}.
  2396   This lemma allows us to establish the second part of Theorem~\ref{subseqreg}.
  2389 
  2397 
  2390   \begin{proof}[of the Second Part of Theorem~\ref{subseqreg}]
  2398   \begin{proof}[of the Second Part of Theorem~\ref{subseqreg}]
  2391   Given any language @{text A}, by Lemma~\ref{mset} we know there exists
  2399   Given any language @{text A}, by Lemma~\ref{mset} we know there exists
  2392   a finite, and thus regular, language @{text M}. We further have @{term "SUPSEQ M = SUPSEQ A"},
  2400   a finite, and thus regular, language @{text M}. We further have @{term "SUPSEQ M = SUPSEQ A"},
  2393   which establishes the second part.    
  2401   which establishes the second part.\qed    
  2394   \end{proof}
  2402   \end{proof}
  2395 
  2403 
  2396   \noindent
  2404   \noindent
  2397   In order to establish the first part of this theorem, we use the
  2405   In order to establish the first part of this theorem, we use the
  2398   property proved by \citeN{Shallit08}, namely that
  2406   property proved by \citeN{Shallit08}, namely that
  2406 
  2414 
  2407   \begin{proof}[of the First Part of Theorem~\ref{subseqreg}]
  2415   \begin{proof}[of the First Part of Theorem~\ref{subseqreg}]
  2408   By the second part, we know the right-hand side of \eqref{compl}
  2416   By the second part, we know the right-hand side of \eqref{compl}
  2409   is regular, which means @{term "- SUBSEQ A"} is regular. But since
  2417   is regular, which means @{term "- SUBSEQ A"} is regular. But since
  2410   we established already that regularity is preserved under complement (using Myhill-Nerode), 
  2418   we established already that regularity is preserved under complement (using Myhill-Nerode), 
  2411   also @{term "SUBSEQ A"} must be regular. 
  2419   also @{term "SUBSEQ A"} must be regular.\qed 
  2412   \end{proof}
  2420   \end{proof}
  2413 
  2421 
  2414   Finally we like to show that the Myhill-Nerode Theorem is also convenient for establishing 
  2422   Finally we like to show that the Myhill-Nerode Theorem is also convenient for establishing 
  2415   the non-regularity of languages. For this we use the following version of the Continuation
  2423   the non-regularity of languages. For this we use the following version of the Continuation
  2416   Lemma (see for example~\cite{Rosenberg06}).
  2424   Lemma (see for example~\cite{Rosenberg06}).
  2441   After unfolding the definition of @{text "B"}, we need to establish that given @{term "i \<noteq> j"},
  2449   After unfolding the definition of @{text "B"}, we need to establish that given @{term "i \<noteq> j"},
  2442   the strings @{text "a\<^sup>i"} and @{text "a\<^sup>j"} are not Myhill-Nerode related by @{text "A"}.
  2450   the strings @{text "a\<^sup>i"} and @{text "a\<^sup>j"} are not Myhill-Nerode related by @{text "A"}.
  2443   That means we have to show that @{text "\<forall>z. a\<^sup>i @ z \<in> A = a\<^sup>j @ z \<in> A"} leads to 
  2451   That means we have to show that @{text "\<forall>z. a\<^sup>i @ z \<in> A = a\<^sup>j @ z \<in> A"} leads to 
  2444   a contradiction. Let us take @{text "b\<^sup>i"} for @{text "z"}. Then we know @{text "a\<^sup>i @ b\<^sup>i \<in> A"}.
  2452   a contradiction. Let us take @{text "b\<^sup>i"} for @{text "z"}. Then we know @{text "a\<^sup>i @ b\<^sup>i \<in> A"}.
  2445   But since @{term "i \<noteq> j"}, @{text "a\<^sup>j @ b\<^sup>i \<notin> A"}. Therefore  @{text "a\<^sup>i"} and @{text "a\<^sup>j"}
  2453   But since @{term "i \<noteq> j"}, @{text "a\<^sup>j @ b\<^sup>i \<notin> A"}. Therefore  @{text "a\<^sup>i"} and @{text "a\<^sup>j"}
  2446   cannot be Myhill-Nerode related by @{text "A"}, and we are done.
  2454   cannot be Myhill-Nerode related by @{text "A"}, and we are done.\qed
  2447   \end{proof}
  2455   \end{proof}
  2448 
  2456 
  2449   \noindent
  2457   \noindent
  2450   To conclude the proof of non-regularity for the language @{text A}, the
  2458   To conclude the proof of non-regularity for the language @{text A}, the
  2451   Continuation Lemma and the lemma above lead to a contradiction assuming
  2459   Continuation Lemma and the lemma above lead to a contradiction assuming
  2459 
  2467 
  2460 text {*
  2468 text {*
  2461   \noindent
  2469   \noindent
  2462   In this paper we took the view that a regular language is one where there
  2470   In this paper we took the view that a regular language is one where there
  2463   exists a regular expression that matches all of its strings. Regular
  2471   exists a regular expression that matches all of its strings. Regular
  2464   expressions can conveniently be defined as a datatype in theorem
  2472   expressions can be conveniently defined as a datatype in theorem
  2465   provers. For us it was therefore interesting to find out how far we can push
  2473   provers. For us it was therefore interesting to find out how far we can push
  2466   this point of view. But this question whether regular language theory can
  2474   this point of view. But this question whether regular language theory can
  2467   be done without automata crops up also in non-theorem prover contexts. Recall 
  2475   be done without automata crops up also in non-theorem prover contexts. Recall 
  2468   Gasarch's speculation cited in the Introduction. 
  2476   Gasarch's comment cited in the Introduction. 
  2469   We have established in Isabelle/HOL both directions 
  2477   We have established in Isabelle/HOL both directions 
  2470   of the Myhill-Nerode Theorem.
  2478   of the Myhill-Nerode Theorem.
  2471   %
  2479   %
  2472   \begin{theorem}[Myhill-Nerode Theorem]\mbox{}\\
  2480   \begin{theorem}[Myhill-Nerode Theorem]\mbox{}\\
  2473   A language @{text A} is regular if and only if @{thm (rhs) Myhill_Nerode}.
  2481   A language @{text A} is regular if and only if @{thm (rhs) Myhill_Nerode}.
  2479   regular---by establishing that it has infinitely many equivalence classes
  2487   regular---by establishing that it has infinitely many equivalence classes
  2480   generated by the Myhill-Nerode Relation (this is usually the purpose of the
  2488   generated by the Myhill-Nerode Relation (this is usually the purpose of the
  2481   Pumping Lemma).  We can also use it to establish the standard
  2489   Pumping Lemma).  We can also use it to establish the standard
  2482   textbook results about closure properties of regular languages. The case of 
  2490   textbook results about closure properties of regular languages. The case of 
  2483   closure under complement follows easily from the Myhill-Nerode Theorem.
  2491   closure under complement follows easily from the Myhill-Nerode Theorem.
  2484   So our answer to Gasarch is ``{\it yes we can!''}  
  2492   So our answer to Gasarch is ``{\it yes we can''}!  
  2485 
  2493 
  2486   %Our insistence on regular expressions for proving the Myhill-Nerode Theorem
  2494   %Our insistence on regular expressions for proving the Myhill-Nerode Theorem
  2487   %arose from the problem of defining formally the regularity of a language.
  2495   %arose from the problem of defining formally the regularity of a language.
  2488   %In order to guarantee consistency,
  2496   %In order to guarantee consistency,
  2489   %formalisations in HOL can only extend the logic with definitions that introduce a new concept in
  2497   %formalisations in HOL can only extend the logic with definitions that introduce a new concept in