Paper/Paper.thy
changeset 135 604518f0127f
parent 134 08afbed1c8c7
child 136 13b0f3dac9a2
equal deleted inserted replaced
134:08afbed1c8c7 135:604518f0127f
   141   @{term "UPLUS A\<^isub>1 A\<^isub>2 \<equiv> {(1, x) | x. x \<in> A\<^isub>1} \<union> {(2, y) | y. y \<in> A\<^isub>2}"}
   141   @{term "UPLUS A\<^isub>1 A\<^isub>2 \<equiv> {(1, x) | x. x \<in> A\<^isub>1} \<union> {(2, y) | y. y \<in> A\<^isub>2}"}
   142   \end{equation}
   142   \end{equation}
   143 
   143 
   144   \noindent
   144   \noindent
   145   changes the type---the disjoint union is not a set, but a set of pairs. 
   145   changes the type---the disjoint union is not a set, but a set of pairs. 
   146   Using this definition for disjoint unions means we do not have a single type for automata
   146   Using this definition for disjoint union means we do not have a single type for automata
   147   and hence will not be able to state certain properties about \emph{all}
   147   and hence will not be able to state certain properties about \emph{all}
   148   automata, since there is no type quantification available in HOL. An
   148   automata, since there is no type quantification available in HOL. An
   149   alternative, which provides us with a single type for automata, is to give every 
   149   alternative, which provides us with a single type for automata, is to give every 
   150   state node an identity, for example a natural
   150   state node an identity, for example a natural
   151   number, and then be careful to rename these identities apart whenever
   151   number, and then be careful to rename these identities apart whenever
   226   {\bf Contributions:} 
   226   {\bf Contributions:} 
   227   There is an extensive literature on regular languages.
   227   There is an extensive literature on regular languages.
   228   To our knowledge, our proof of the Myhill-Nerode theorem is the
   228   To our knowledge, our proof of the Myhill-Nerode theorem is the
   229   first that is based on regular expressions, only. We prove the part of this theorem 
   229   first that is based on regular expressions, only. We prove the part of this theorem 
   230   stating that a regular expression has only finitely many partitions using certain 
   230   stating that a regular expression has only finitely many partitions using certain 
   231   tagging-functions. Again to our best knowledge, these tagging functions have
   231   tagging-functions. Again to our best knowledge, these tagging-functions have
   232   not been used before to establish the Myhill-Nerode theorem.
   232   not been used before to establish the Myhill-Nerode theorem.
   233 *}
   233 *}
   234 
   234 
   235 section {* Preliminaries *}
   235 section {* Preliminaries *}
   236 
   236 
   585   \begin{lemma}\label{ardenable}
   585   \begin{lemma}\label{ardenable}
   586   Given an equation @{text "X = rhs"}.
   586   Given an equation @{text "X = rhs"}.
   587   If @{text "X = \<Union>\<calL> ` rhs"},
   587   If @{text "X = \<Union>\<calL> ` rhs"},
   588   @{thm (prem 2) Arden_keeps_eq}, and
   588   @{thm (prem 2) Arden_keeps_eq}, and
   589   @{thm (prem 3) Arden_keeps_eq}, then
   589   @{thm (prem 3) Arden_keeps_eq}, then
   590   @{text "X = \<Union>\<calL> ` (Arden X rhs)"}
   590   @{text "X = \<Union>\<calL> ` (Arden X rhs)"}.
   591   \end{lemma}
   591   \end{lemma}
   592   
   592   
   593   \noindent
   593   \noindent
   594   The \emph{substitution-operation} takes an equation
   594   The \emph{substitution-operation} takes an equation
   595   of the form @{text "X = xrhs"} and substitutes it into the right-hand side @{text rhs}.
   595   of the form @{text "X = xrhs"} and substitutes it into the right-hand side @{text rhs}.
   866 text {*
   866 text {*
   867   We will prove in this section the second part of the Myhill-Nerode
   867   We will prove in this section the second part of the Myhill-Nerode
   868   theorem. It can be formulated in our setting as follows.
   868   theorem. It can be formulated in our setting as follows.
   869 
   869 
   870   \begin{theorem}
   870   \begin{theorem}
   871   Given @{text "r"} is a regular expressions, then @{thm Myhill_Nerode2}.
   871   Given @{text "r"} is a regular expression, then @{thm Myhill_Nerode2}.
   872   \end{theorem}  
   872   \end{theorem}  
   873 
   873 
   874   \noindent
   874   \noindent
   875   The proof will be by induction on the structure of @{text r}. It turns out
   875   The proof will be by induction on the structure of @{text r}. It turns out
   876   the base cases are straightforward.
   876   the base cases are straightforward.
   894 
   894 
   895   \noindent
   895   \noindent
   896   Much more interesting, however, are the inductive cases. They seem hard to be solved 
   896   Much more interesting, however, are the inductive cases. They seem hard to be solved 
   897   directly. The reader is invited to try. 
   897   directly. The reader is invited to try. 
   898 
   898 
   899   Our method will rely on some
   899   Our proof will rely on some
   900   \emph{tagging functions} defined over strings. Given the inductive hypothesis, it will 
   900   \emph{taggingfunctions} defined over strings. Given the inductive hypothesis, it will 
   901   be easy to prove that the range of these tagging functions is finite
   901   be easy to prove that the \emph{range} of these tagging-functions is finite
   902   (the range of a function @{text f} is defined as @{text "range f \<equiv> f ` UNIV"}).
   902   (the range of a function @{text f} is defined as @{text "range f \<equiv> f ` UNIV"}).
   903   With this we will be able to infer that the tagging functions, seen as relations,
   903   With this we will be able to infer that the tagging-functions, seen as relations,
   904   give rise to finitely many equivalence classes of @{const UNIV}. Finally we 
   904   give rise to finitely many equivalence classes of @{const UNIV}. Finally we 
   905   will show that the tagging relation is more refined than @{term "\<approx>(L r)"}, which
   905   will show that the tagging-relations are more refined than @{term "\<approx>(L r)"}, which
   906   implies that @{term "UNIV // \<approx>(L r)"} must also be finite (a relation @{text "R\<^isub>1"} 
   906   implies that @{term "UNIV // \<approx>(L r)"} must also be finite (a relation @{text "R\<^isub>1"} 
   907   is said to \emph{refine} @{text "R\<^isub>2"} provided @{text "R\<^isub>1 \<subseteq> R\<^isub>2"}).
   907   is said to \emph{refine} @{text "R\<^isub>2"} provided @{text "R\<^isub>1 \<subseteq> R\<^isub>2"}).
   908   We formally define the notion of a \emph{tagging-relation} as follows.
   908   We formally define the notion of a \emph{tagging-relation} as follows.
   909 
   909 
   910   \begin{definition}[Tagging-Relation] Given a tagging-function @{text tag}, then two strings @{text x}
   910   \begin{definition}[Tagging-Relation] Given a tagging-function @{text tag}, then two strings @{text x}
   951   \end{lemma}
   951   \end{lemma}
   952 
   952 
   953   \begin{proof}
   953   \begin{proof}
   954   We prove this lemma again using \eqref{finiteimageD}. This time we set @{text f} to
   954   We prove this lemma again using \eqref{finiteimageD}. This time we set @{text f} to
   955   be @{text "X \<mapsto>"}~@{term "{R\<^isub>1 `` {x} | x. x \<in> X}"}. It is easy to see that 
   955   be @{text "X \<mapsto>"}~@{term "{R\<^isub>1 `` {x} | x. x \<in> X}"}. It is easy to see that 
   956   @{text "finite (f ` (UNIV // R\<^isub>2))"} because it is a subset of @{text "Pow (UNIV // R\<^isub>1)"},
   956   @{term "finite (f ` (UNIV // R\<^isub>2))"} because it is a subset of @{term "Pow (UNIV // R\<^isub>1)"},
   957   which is finite by assumption. What remains to be shown is that @{text f} is injective
   957   which is finite by assumption. What remains to be shown is that @{text f} is injective
   958   on @{term "UNIV // R\<^isub>2"}. This is equivalent to showing that two equivalence 
   958   on @{term "UNIV // R\<^isub>2"}. This is equivalent to showing that two equivalence 
   959   classes, say @{text "X"} and @{text Y}, in @{term "UNIV // R\<^isub>2"} are equal, provided
   959   classes, say @{text "X"} and @{text Y}, in @{term "UNIV // R\<^isub>2"} are equal, provided
   960   @{text "f X = f Y"}. For @{text "X = Y"} to be equal, we have to find two elements
   960   @{text "f X = f Y"}. For @{text "X = Y"} to be equal, we have to find two elements
   961   @{text "x \<in> X"} and @{text "y \<in> Y"} such that they are @{text R\<^isub>2} related.
   961   @{text "x \<in> X"} and @{text "y \<in> Y"} such that they are @{text R\<^isub>2} related.
   962   We know there exists a @{text x} with @{term "X = R\<^isub>2 `` {x}"}
   962   We know there exists a @{text "x \<in> X"} with \mbox{@{term "X = R\<^isub>2 `` {x}"}}. 
   963   and @{text "x \<in> X"}. From the latter fact we can infer that @{term "R\<^isub>1 ``{x} \<in> f X"}
   963   From the latter fact we can infer that @{term "R\<^isub>1 ``{x} \<in> f X"}
   964   and further @{term "R\<^isub>1 ``{x} \<in> f Y"}. This means we can obtain a @{text y}
   964   and further @{term "R\<^isub>1 ``{x} \<in> f Y"}. This means we can obtain a @{text y}
   965   such that @{term "R\<^isub>1 `` {x} = R\<^isub>1 `` {y}"} holds. Consequently @{text x} and @{text y}
   965   such that @{term "R\<^isub>1 `` {x} = R\<^isub>1 `` {y}"} holds. Consequently @{text x} and @{text y}
   966   are @{text "R\<^isub>1"}-related. Since by assumption @{text "R\<^isub>1"} refines @{text "R\<^isub>2"},
   966   are @{text "R\<^isub>1"}-related. Since by assumption @{text "R\<^isub>1"} refines @{text "R\<^isub>2"},
   967   they must also be @{text "R\<^isub>2"}-related, as we need to show.\qed
   967   they must also be @{text "R\<^isub>2"}-related, as we need to show.\qed
   968   \end{proof}
   968   \end{proof}
   969 
   969 
   970   \noindent
   970   \noindent
   971   Chaining Lem.~\ref{finone} and \ref{fintwo} together, means in order to show
   971   Chaining Lem.~\ref{finone} and \ref{fintwo} together, means in order to show
   972   that @{term "UNIV // \<approx>(L r)"} is finite, we have to find a tagging function whose
   972   that @{term "UNIV // \<approx>(L r)"} is finite, we have to find a tagging-function whose
   973   range can be shown to be finite and whose tagging-relation refines @{term "\<approx>(L r)"}.
   973   range can be shown to be finite and whose tagging-relation refines @{term "\<approx>(L r)"}.
   974   Let us attempt the @{const ALT}-case first.
   974   Let us attempt the @{const ALT}-case first.
   975 
   975 
   976   \begin{proof}[@{const "ALT"}-Case] 
   976   \begin{proof}[@{const "ALT"}-Case] 
   977   We take as tagging function 
   977   We take as tagging-function 
   978   %
   978   %
   979   \begin{center}
   979   \begin{center}
   980   @{thm tag_str_ALT_def[where A="A" and B="B", THEN meta_eq_app]}
   980   @{thm tag_str_ALT_def[where A="A" and B="B", THEN meta_eq_app]}
   981   \end{center}
   981   \end{center}
   982 
   982 
  1014   The pattern in \eqref{pattern} is repeated for the other two cases. Unfortunately,
  1014   The pattern in \eqref{pattern} is repeated for the other two cases. Unfortunately,
  1015   they are slightly more complicated. In the @{const SEQ}-case we essentially have
  1015   they are slightly more complicated. In the @{const SEQ}-case we essentially have
  1016   to be able to infer that 
  1016   to be able to infer that 
  1017   %
  1017   %
  1018   \begin{center}
  1018   \begin{center}
  1019   @{term "x @ z \<in> A ;; B \<longrightarrow> y @ z \<in> A ;; B"}
  1019   @{text "\<dots>"}@{term "x @ z \<in> A ;; B \<longrightarrow> y @ z \<in> A ;; B"}
  1020   \end{center}
  1020   \end{center}
  1021   %
  1021   %
  1022   \noindent
  1022   \noindent
  1023   using the information given by the appropriate tagging function. The complication 
  1023   using the information given by the appropriate tagging-function. The complication 
  1024   is to find out what the possible splits of @{text "x @ z"} are to be in @{term "A ;; B"}
  1024   is to find out what the possible splits of @{text "x @ z"} are to be in @{term "A ;; B"}
  1025   (this was easy in case of @{term "A \<union> B"}). For this we define the
  1025   (this was easy in case of @{term "A \<union> B"}). To deal with this complication we define the
  1026   notions of \emph{string prefixes} 
  1026   notions of \emph{string prefixes} 
  1027   %
  1027   %
  1028   \begin{center}
  1028   \begin{center}
  1029   @{text "x \<le> y \<equiv> \<exists>z. y = x @ z"}\hspace{10mm}
  1029   @{text "x \<le> y \<equiv> \<exists>z. y = x @ z"}\hspace{10mm}
  1030   @{text "x < y \<equiv> x \<le> y \<and> x \<noteq> y"}
  1030   @{text "x < y \<equiv> x \<le> y \<and> x \<noteq> y"}
  1158   with @{term "y @ z \<in> A ;; B"}. We again can complete the @{const SEQ}-case
  1158   with @{term "y @ z \<in> A ;; B"}. We again can complete the @{const SEQ}-case
  1159   by setting @{text A} to @{term "L r\<^isub>1"} and @{text B} to @{term "L r\<^isub>2"}.\qed 
  1159   by setting @{text A} to @{term "L r\<^isub>1"} and @{text B} to @{term "L r\<^isub>2"}.\qed 
  1160   \end{proof}
  1160   \end{proof}
  1161   
  1161   
  1162   \noindent
  1162   \noindent
  1163   The case for @{const STAR} is similar as @{const SEQ}, but poses a few extra challenges. When
  1163   The case for @{const STAR} is similar to @{const SEQ}, but poses a few extra challenges. When
  1164   we analyse the case that @{text "x @ z"} is an element in @{text "A\<star>"} and @{text x} is not the 
  1164   we analyse the case that @{text "x @ z"} is an element in @{text "A\<star>"} and @{text x} is not the 
  1165   empty string, we 
  1165   empty string, we 
  1166   have the following picture:
  1166   have the following picture:
  1167   %
  1167   %
  1168   \begin{center}
  1168   \begin{center}
  1202                node[midway, below=0.5em]{@{text "(x - x'\<^isub>m\<^isub>a\<^isub>x) @ z \<in> A\<star>"}};
  1202                node[midway, below=0.5em]{@{text "(x - x'\<^isub>m\<^isub>a\<^isub>x) @ z \<in> A\<star>"}};
  1203   \end{tikzpicture}}
  1203   \end{tikzpicture}}
  1204   \end{center}
  1204   \end{center}
  1205   %
  1205   %
  1206   \noindent
  1206   \noindent
  1207   We can find a strict prefix @{text "x'"} of @{text x} such that @{text "x' \<in> A\<star>"},
  1207   We can find a strict prefix @{text "x'"} of @{text x} such that @{term "x' \<in> A\<star>"},
  1208   @{text "x' < x"} and the rest @{text "(x - x') @ z \<in> A\<star>"}. For example the empty string 
  1208   @{text "x' < x"} and the rest @{term "(x - x') @ z \<in> A\<star>"}. For example the empty string 
  1209   @{text "[]"} would do.
  1209   @{text "[]"} would do.
  1210   There are many such prefixes, but there can only be finitely many of them (the 
  1210   There are potentially many such prefixes, but there can only be finitely many of them (the 
  1211   string @{text x} is finite). Let us therefore choose the longest one and call it 
  1211   string @{text x} is finite). Let us therefore choose the longest one and call it 
  1212   @{text "x'\<^isub>m\<^isub>a\<^isub>x"}. Now for the rest of the string @{text "(x - x'\<^isub>m\<^isub>a\<^isub>x) @ z"} we
  1212   @{text "x'\<^isub>m\<^isub>a\<^isub>x"}. Now for the rest of the string @{text "(x - x'\<^isub>m\<^isub>a\<^isub>x) @ z"} we
  1213   know it is in @{text "A\<star>"}. By definition of @{text "A\<star>"}, we can separate
  1213   know it is in @{term "A\<star>"}. By definition of @{term "A\<star>"}, we can separate
  1214   this string into two parts, say @{text "a"} and @{text "b"}, such @{text "a \<in> A"}
  1214   this string into two parts, say @{text "a"} and @{text "b"}, such that @{text "a \<in> A"}
  1215   and @{text "b \<in> A\<star>"}. Now @{text a} must be strictly longer than @{text "x - x'\<^isub>m\<^isub>a\<^isub>x"},
  1215   and @{term "b \<in> A\<star>"}. Now @{text a} must be strictly longer than @{text "x - x'\<^isub>m\<^isub>a\<^isub>x"},
  1216   otherwise @{text "x'\<^isub>m\<^isub>a\<^isub>x"} is not the longest prefix. That means @{text a}
  1216   otherwise @{text "x'\<^isub>m\<^isub>a\<^isub>x"} is not the longest prefix. That means @{text a}
  1217   `overlaps' with @{text z}, splitting it into two components @{text "z\<^isub>a"} and
  1217   `overlaps' with @{text z}, splitting it into two components @{text "z\<^isub>a"} and
  1218    @{text "z\<^isub>b"}. For this we know that @{text "(x - x'\<^isub>m\<^isub>a\<^isub>x) @ z\<^isub>a \<in> A"} and
  1218    @{text "z\<^isub>b"}. For this we know that @{text "(x - x'\<^isub>m\<^isub>a\<^isub>x) @ z\<^isub>a \<in> A"} and
  1219   @{text "z\<^isub>b \<in> A\<star>"}. To cut a story short, we have divided @{text "x @ z \<in> A\<star>"}
  1219   @{term "z\<^isub>b \<in> A\<star>"}. To cut a story short, we have divided @{term "x @ z \<in> A\<star>"}
  1220   such that we have a string @{text a} with @{text "a \<in> A"} that lies just on the
  1220   such that we have a string @{text a} with @{text "a \<in> A"} that lies just on the
  1221   `border' of @{text x} and @{text z}. This string is @{text "(x - x') @ z\<^isub>a"}.
  1221   `border' of @{text x} and @{text z}. This string is @{text "(x - x') @ z\<^isub>a"}.
  1222 
  1222 
  1223   In order to show that @{text "x @ z \<in> A\<star>"} implies @{text "y @ z \<in> A\<star>"}, we use
  1223   In order to show that @{term "x @ z \<in> A\<star>"} implies @{term "y @ z \<in> A\<star>"}, we use
  1224   the following tagging-function:
  1224   the following tagging-function:
  1225   %
  1225   %
  1226   \begin{center}
  1226   \begin{center}
  1227   @{thm tag_str_STAR_def[where ?L1.0="A", THEN meta_eq_app]}\smallskip
  1227   @{thm tag_str_STAR_def[where ?L1.0="A", THEN meta_eq_app]}\smallskip
  1228   \end{center}
  1228   \end{center}
  1238   \end{center}  
  1238   \end{center}  
  1239   %
  1239   %
  1240   \noindent
  1240   \noindent
  1241   We first need to consider the case that @{text x} is the empty string.
  1241   We first need to consider the case that @{text x} is the empty string.
  1242   From the assumption we can infer @{text y} is the empty string and
  1242   From the assumption we can infer @{text y} is the empty string and
  1243   clearly have @{text "y @ z \<in> A\<star>"}. In case @{text x} is not the empty
  1243   clearly have @{term "y @ z \<in> A\<star>"}. In case @{text x} is not the empty
  1244   string, we can divide the string @{text "x @ z"} as shown in the picture 
  1244   string, we can divide the string @{text "x @ z"} as shown in the picture 
  1245   above. By the tagging function we have
  1245   above. By the tagging-function we have
  1246   %
  1246   %
  1247   \begin{center}
  1247   \begin{center}
  1248   @{term "\<approx>A `` {(x - x'\<^isub>m\<^isub>a\<^isub>x)} \<in> ({\<approx>A `` {x - x'} |x'. x' < x \<and> x' \<in> A\<star>})"}
  1248   @{term "\<approx>A `` {(x - x'\<^isub>m\<^isub>a\<^isub>x)} \<in> ({\<approx>A `` {x - x'} |x'. x' < x \<and> x' \<in> A\<star>})"}
  1249   \end{center}
  1249   \end{center}
  1250   %
  1250   %
  1254   \begin{center}
  1254   \begin{center}
  1255   @{term "\<approx>A `` {(x - x'\<^isub>m\<^isub>a\<^isub>x)} \<in> ({\<approx>A `` {y - y'} |y'. y' < y \<and> y' \<in> A\<star>})"}
  1255   @{term "\<approx>A `` {(x - x'\<^isub>m\<^isub>a\<^isub>x)} \<in> ({\<approx>A `` {y - y'} |y'. y' < y \<and> y' \<in> A\<star>})"}
  1256   \end{center}
  1256   \end{center}
  1257   %
  1257   %
  1258   \noindent 
  1258   \noindent 
  1259   and we know that we have a @{text "y' \<in> A\<star>"} and @{text "y' < y"}
  1259   and we know that we have a @{term "y' \<in> A\<star>"} and @{text "y' < y"}
  1260   and also know @{term "(x - x'\<^isub>m\<^isub>a\<^isub>x) \<approx>A (y - y')"}. Unfolding the Myhill-Nerode
  1260   and also know @{term "(x - x'\<^isub>m\<^isub>a\<^isub>x) \<approx>A (y - y')"}. Unfolding the Myhill-Nerode
  1261   relation we know @{term "(y - y') @ z\<^isub>a \<in> A"}. We also know that @{text "z\<^isub>b \<in> A\<star>"}.
  1261   relation we know @{term "(y - y') @ z\<^isub>a \<in> A"}. We also know that @{term "z\<^isub>b \<in> A\<star>"}.
  1262   Therefore, finally, @{text "y' @ ((y - y') @ z\<^isub>a) @ z\<^isub>b \<in> A\<star>"}, which means
  1262   Therefore @{term "y' @ ((y - y') @ z\<^isub>a) @ z\<^isub>b \<in> A\<star>"}, which means
  1263   @{term "y @ z \<in> A\<star>"}. As the last step we have to set @{text "A"} to @{text "L r"} and
  1263   @{term "y @ z \<in> A\<star>"}. As the last step we have to set @{text "A"} to @{term "L r"} and
  1264   complete the proof.\qed
  1264   complete the proof.\qed
  1265   \end{proof}
  1265   \end{proof}
  1266 *}
  1266 *}
  1267 
  1267 
  1268 
  1268