141 @{term "UPLUS A\<^isub>1 A\<^isub>2 \<equiv> {(1, x) | x. x \<in> A\<^isub>1} \<union> {(2, y) | y. y \<in> A\<^isub>2}"} |
141 @{term "UPLUS A\<^isub>1 A\<^isub>2 \<equiv> {(1, x) | x. x \<in> A\<^isub>1} \<union> {(2, y) | y. y \<in> A\<^isub>2}"} |
142 \end{equation} |
142 \end{equation} |
143 |
143 |
144 \noindent |
144 \noindent |
145 changes the type---the disjoint union is not a set, but a set of pairs. |
145 changes the type---the disjoint union is not a set, but a set of pairs. |
146 Using this definition for disjoint unions means we do not have a single type for automata |
146 Using this definition for disjoint union means we do not have a single type for automata |
147 and hence will not be able to state certain properties about \emph{all} |
147 and hence will not be able to state certain properties about \emph{all} |
148 automata, since there is no type quantification available in HOL. An |
148 automata, since there is no type quantification available in HOL. An |
149 alternative, which provides us with a single type for automata, is to give every |
149 alternative, which provides us with a single type for automata, is to give every |
150 state node an identity, for example a natural |
150 state node an identity, for example a natural |
151 number, and then be careful to rename these identities apart whenever |
151 number, and then be careful to rename these identities apart whenever |
226 {\bf Contributions:} |
226 {\bf Contributions:} |
227 There is an extensive literature on regular languages. |
227 There is an extensive literature on regular languages. |
228 To our knowledge, our proof of the Myhill-Nerode theorem is the |
228 To our knowledge, our proof of the Myhill-Nerode theorem is the |
229 first that is based on regular expressions, only. We prove the part of this theorem |
229 first that is based on regular expressions, only. We prove the part of this theorem |
230 stating that a regular expression has only finitely many partitions using certain |
230 stating that a regular expression has only finitely many partitions using certain |
231 tagging-functions. Again to our best knowledge, these tagging functions have |
231 tagging-functions. Again to our best knowledge, these tagging-functions have |
232 not been used before to establish the Myhill-Nerode theorem. |
232 not been used before to establish the Myhill-Nerode theorem. |
233 *} |
233 *} |
234 |
234 |
235 section {* Preliminaries *} |
235 section {* Preliminaries *} |
236 |
236 |
585 \begin{lemma}\label{ardenable} |
585 \begin{lemma}\label{ardenable} |
586 Given an equation @{text "X = rhs"}. |
586 Given an equation @{text "X = rhs"}. |
587 If @{text "X = \<Union>\<calL> ` rhs"}, |
587 If @{text "X = \<Union>\<calL> ` rhs"}, |
588 @{thm (prem 2) Arden_keeps_eq}, and |
588 @{thm (prem 2) Arden_keeps_eq}, and |
589 @{thm (prem 3) Arden_keeps_eq}, then |
589 @{thm (prem 3) Arden_keeps_eq}, then |
590 @{text "X = \<Union>\<calL> ` (Arden X rhs)"} |
590 @{text "X = \<Union>\<calL> ` (Arden X rhs)"}. |
591 \end{lemma} |
591 \end{lemma} |
592 |
592 |
593 \noindent |
593 \noindent |
594 The \emph{substitution-operation} takes an equation |
594 The \emph{substitution-operation} takes an equation |
595 of the form @{text "X = xrhs"} and substitutes it into the right-hand side @{text rhs}. |
595 of the form @{text "X = xrhs"} and substitutes it into the right-hand side @{text rhs}. |
894 |
894 |
895 \noindent |
895 \noindent |
896 Much more interesting, however, are the inductive cases. They seem hard to be solved |
896 Much more interesting, however, are the inductive cases. They seem hard to be solved |
897 directly. The reader is invited to try. |
897 directly. The reader is invited to try. |
898 |
898 |
899 Our method will rely on some |
899 Our proof will rely on some |
900 \emph{tagging functions} defined over strings. Given the inductive hypothesis, it will |
900 \emph{taggingfunctions} defined over strings. Given the inductive hypothesis, it will |
901 be easy to prove that the range of these tagging functions is finite |
901 be easy to prove that the \emph{range} of these tagging-functions is finite |
902 (the range of a function @{text f} is defined as @{text "range f \<equiv> f ` UNIV"}). |
902 (the range of a function @{text f} is defined as @{text "range f \<equiv> f ` UNIV"}). |
903 With this we will be able to infer that the tagging functions, seen as relations, |
903 With this we will be able to infer that the tagging-functions, seen as relations, |
904 give rise to finitely many equivalence classes of @{const UNIV}. Finally we |
904 give rise to finitely many equivalence classes of @{const UNIV}. Finally we |
905 will show that the tagging relation is more refined than @{term "\<approx>(L r)"}, which |
905 will show that the tagging-relations are more refined than @{term "\<approx>(L r)"}, which |
906 implies that @{term "UNIV // \<approx>(L r)"} must also be finite (a relation @{text "R\<^isub>1"} |
906 implies that @{term "UNIV // \<approx>(L r)"} must also be finite (a relation @{text "R\<^isub>1"} |
907 is said to \emph{refine} @{text "R\<^isub>2"} provided @{text "R\<^isub>1 \<subseteq> R\<^isub>2"}). |
907 is said to \emph{refine} @{text "R\<^isub>2"} provided @{text "R\<^isub>1 \<subseteq> R\<^isub>2"}). |
908 We formally define the notion of a \emph{tagging-relation} as follows. |
908 We formally define the notion of a \emph{tagging-relation} as follows. |
909 |
909 |
910 \begin{definition}[Tagging-Relation] Given a tagging-function @{text tag}, then two strings @{text x} |
910 \begin{definition}[Tagging-Relation] Given a tagging-function @{text tag}, then two strings @{text x} |
951 \end{lemma} |
951 \end{lemma} |
952 |
952 |
953 \begin{proof} |
953 \begin{proof} |
954 We prove this lemma again using \eqref{finiteimageD}. This time we set @{text f} to |
954 We prove this lemma again using \eqref{finiteimageD}. This time we set @{text f} to |
955 be @{text "X \<mapsto>"}~@{term "{R\<^isub>1 `` {x} | x. x \<in> X}"}. It is easy to see that |
955 be @{text "X \<mapsto>"}~@{term "{R\<^isub>1 `` {x} | x. x \<in> X}"}. It is easy to see that |
956 @{text "finite (f ` (UNIV // R\<^isub>2))"} because it is a subset of @{text "Pow (UNIV // R\<^isub>1)"}, |
956 @{term "finite (f ` (UNIV // R\<^isub>2))"} because it is a subset of @{term "Pow (UNIV // R\<^isub>1)"}, |
957 which is finite by assumption. What remains to be shown is that @{text f} is injective |
957 which is finite by assumption. What remains to be shown is that @{text f} is injective |
958 on @{term "UNIV // R\<^isub>2"}. This is equivalent to showing that two equivalence |
958 on @{term "UNIV // R\<^isub>2"}. This is equivalent to showing that two equivalence |
959 classes, say @{text "X"} and @{text Y}, in @{term "UNIV // R\<^isub>2"} are equal, provided |
959 classes, say @{text "X"} and @{text Y}, in @{term "UNIV // R\<^isub>2"} are equal, provided |
960 @{text "f X = f Y"}. For @{text "X = Y"} to be equal, we have to find two elements |
960 @{text "f X = f Y"}. For @{text "X = Y"} to be equal, we have to find two elements |
961 @{text "x \<in> X"} and @{text "y \<in> Y"} such that they are @{text R\<^isub>2} related. |
961 @{text "x \<in> X"} and @{text "y \<in> Y"} such that they are @{text R\<^isub>2} related. |
962 We know there exists a @{text x} with @{term "X = R\<^isub>2 `` {x}"} |
962 We know there exists a @{text "x \<in> X"} with \mbox{@{term "X = R\<^isub>2 `` {x}"}}. |
963 and @{text "x \<in> X"}. From the latter fact we can infer that @{term "R\<^isub>1 ``{x} \<in> f X"} |
963 From the latter fact we can infer that @{term "R\<^isub>1 ``{x} \<in> f X"} |
964 and further @{term "R\<^isub>1 ``{x} \<in> f Y"}. This means we can obtain a @{text y} |
964 and further @{term "R\<^isub>1 ``{x} \<in> f Y"}. This means we can obtain a @{text y} |
965 such that @{term "R\<^isub>1 `` {x} = R\<^isub>1 `` {y}"} holds. Consequently @{text x} and @{text y} |
965 such that @{term "R\<^isub>1 `` {x} = R\<^isub>1 `` {y}"} holds. Consequently @{text x} and @{text y} |
966 are @{text "R\<^isub>1"}-related. Since by assumption @{text "R\<^isub>1"} refines @{text "R\<^isub>2"}, |
966 are @{text "R\<^isub>1"}-related. Since by assumption @{text "R\<^isub>1"} refines @{text "R\<^isub>2"}, |
967 they must also be @{text "R\<^isub>2"}-related, as we need to show.\qed |
967 they must also be @{text "R\<^isub>2"}-related, as we need to show.\qed |
968 \end{proof} |
968 \end{proof} |
969 |
969 |
970 \noindent |
970 \noindent |
971 Chaining Lem.~\ref{finone} and \ref{fintwo} together, means in order to show |
971 Chaining Lem.~\ref{finone} and \ref{fintwo} together, means in order to show |
972 that @{term "UNIV // \<approx>(L r)"} is finite, we have to find a tagging function whose |
972 that @{term "UNIV // \<approx>(L r)"} is finite, we have to find a tagging-function whose |
973 range can be shown to be finite and whose tagging-relation refines @{term "\<approx>(L r)"}. |
973 range can be shown to be finite and whose tagging-relation refines @{term "\<approx>(L r)"}. |
974 Let us attempt the @{const ALT}-case first. |
974 Let us attempt the @{const ALT}-case first. |
975 |
975 |
976 \begin{proof}[@{const "ALT"}-Case] |
976 \begin{proof}[@{const "ALT"}-Case] |
977 We take as tagging function |
977 We take as tagging-function |
978 % |
978 % |
979 \begin{center} |
979 \begin{center} |
980 @{thm tag_str_ALT_def[where A="A" and B="B", THEN meta_eq_app]} |
980 @{thm tag_str_ALT_def[where A="A" and B="B", THEN meta_eq_app]} |
981 \end{center} |
981 \end{center} |
982 |
982 |
1014 The pattern in \eqref{pattern} is repeated for the other two cases. Unfortunately, |
1014 The pattern in \eqref{pattern} is repeated for the other two cases. Unfortunately, |
1015 they are slightly more complicated. In the @{const SEQ}-case we essentially have |
1015 they are slightly more complicated. In the @{const SEQ}-case we essentially have |
1016 to be able to infer that |
1016 to be able to infer that |
1017 % |
1017 % |
1018 \begin{center} |
1018 \begin{center} |
1019 @{term "x @ z \<in> A ;; B \<longrightarrow> y @ z \<in> A ;; B"} |
1019 @{text "\<dots>"}@{term "x @ z \<in> A ;; B \<longrightarrow> y @ z \<in> A ;; B"} |
1020 \end{center} |
1020 \end{center} |
1021 % |
1021 % |
1022 \noindent |
1022 \noindent |
1023 using the information given by the appropriate tagging function. The complication |
1023 using the information given by the appropriate tagging-function. The complication |
1024 is to find out what the possible splits of @{text "x @ z"} are to be in @{term "A ;; B"} |
1024 is to find out what the possible splits of @{text "x @ z"} are to be in @{term "A ;; B"} |
1025 (this was easy in case of @{term "A \<union> B"}). For this we define the |
1025 (this was easy in case of @{term "A \<union> B"}). To deal with this complication we define the |
1026 notions of \emph{string prefixes} |
1026 notions of \emph{string prefixes} |
1027 % |
1027 % |
1028 \begin{center} |
1028 \begin{center} |
1029 @{text "x \<le> y \<equiv> \<exists>z. y = x @ z"}\hspace{10mm} |
1029 @{text "x \<le> y \<equiv> \<exists>z. y = x @ z"}\hspace{10mm} |
1030 @{text "x < y \<equiv> x \<le> y \<and> x \<noteq> y"} |
1030 @{text "x < y \<equiv> x \<le> y \<and> x \<noteq> y"} |
1158 with @{term "y @ z \<in> A ;; B"}. We again can complete the @{const SEQ}-case |
1158 with @{term "y @ z \<in> A ;; B"}. We again can complete the @{const SEQ}-case |
1159 by setting @{text A} to @{term "L r\<^isub>1"} and @{text B} to @{term "L r\<^isub>2"}.\qed |
1159 by setting @{text A} to @{term "L r\<^isub>1"} and @{text B} to @{term "L r\<^isub>2"}.\qed |
1160 \end{proof} |
1160 \end{proof} |
1161 |
1161 |
1162 \noindent |
1162 \noindent |
1163 The case for @{const STAR} is similar as @{const SEQ}, but poses a few extra challenges. When |
1163 The case for @{const STAR} is similar to @{const SEQ}, but poses a few extra challenges. When |
1164 we analyse the case that @{text "x @ z"} is an element in @{text "A\<star>"} and @{text x} is not the |
1164 we analyse the case that @{text "x @ z"} is an element in @{text "A\<star>"} and @{text x} is not the |
1165 empty string, we |
1165 empty string, we |
1166 have the following picture: |
1166 have the following picture: |
1167 % |
1167 % |
1168 \begin{center} |
1168 \begin{center} |
1202 node[midway, below=0.5em]{@{text "(x - x'\<^isub>m\<^isub>a\<^isub>x) @ z \<in> A\<star>"}}; |
1202 node[midway, below=0.5em]{@{text "(x - x'\<^isub>m\<^isub>a\<^isub>x) @ z \<in> A\<star>"}}; |
1203 \end{tikzpicture}} |
1203 \end{tikzpicture}} |
1204 \end{center} |
1204 \end{center} |
1205 % |
1205 % |
1206 \noindent |
1206 \noindent |
1207 We can find a strict prefix @{text "x'"} of @{text x} such that @{text "x' \<in> A\<star>"}, |
1207 We can find a strict prefix @{text "x'"} of @{text x} such that @{term "x' \<in> A\<star>"}, |
1208 @{text "x' < x"} and the rest @{text "(x - x') @ z \<in> A\<star>"}. For example the empty string |
1208 @{text "x' < x"} and the rest @{term "(x - x') @ z \<in> A\<star>"}. For example the empty string |
1209 @{text "[]"} would do. |
1209 @{text "[]"} would do. |
1210 There are many such prefixes, but there can only be finitely many of them (the |
1210 There are potentially many such prefixes, but there can only be finitely many of them (the |
1211 string @{text x} is finite). Let us therefore choose the longest one and call it |
1211 string @{text x} is finite). Let us therefore choose the longest one and call it |
1212 @{text "x'\<^isub>m\<^isub>a\<^isub>x"}. Now for the rest of the string @{text "(x - x'\<^isub>m\<^isub>a\<^isub>x) @ z"} we |
1212 @{text "x'\<^isub>m\<^isub>a\<^isub>x"}. Now for the rest of the string @{text "(x - x'\<^isub>m\<^isub>a\<^isub>x) @ z"} we |
1213 know it is in @{text "A\<star>"}. By definition of @{text "A\<star>"}, we can separate |
1213 know it is in @{term "A\<star>"}. By definition of @{term "A\<star>"}, we can separate |
1214 this string into two parts, say @{text "a"} and @{text "b"}, such @{text "a \<in> A"} |
1214 this string into two parts, say @{text "a"} and @{text "b"}, such that @{text "a \<in> A"} |
1215 and @{text "b \<in> A\<star>"}. Now @{text a} must be strictly longer than @{text "x - x'\<^isub>m\<^isub>a\<^isub>x"}, |
1215 and @{term "b \<in> A\<star>"}. Now @{text a} must be strictly longer than @{text "x - x'\<^isub>m\<^isub>a\<^isub>x"}, |
1216 otherwise @{text "x'\<^isub>m\<^isub>a\<^isub>x"} is not the longest prefix. That means @{text a} |
1216 otherwise @{text "x'\<^isub>m\<^isub>a\<^isub>x"} is not the longest prefix. That means @{text a} |
1217 `overlaps' with @{text z}, splitting it into two components @{text "z\<^isub>a"} and |
1217 `overlaps' with @{text z}, splitting it into two components @{text "z\<^isub>a"} and |
1218 @{text "z\<^isub>b"}. For this we know that @{text "(x - x'\<^isub>m\<^isub>a\<^isub>x) @ z\<^isub>a \<in> A"} and |
1218 @{text "z\<^isub>b"}. For this we know that @{text "(x - x'\<^isub>m\<^isub>a\<^isub>x) @ z\<^isub>a \<in> A"} and |
1219 @{text "z\<^isub>b \<in> A\<star>"}. To cut a story short, we have divided @{text "x @ z \<in> A\<star>"} |
1219 @{term "z\<^isub>b \<in> A\<star>"}. To cut a story short, we have divided @{term "x @ z \<in> A\<star>"} |
1220 such that we have a string @{text a} with @{text "a \<in> A"} that lies just on the |
1220 such that we have a string @{text a} with @{text "a \<in> A"} that lies just on the |
1221 `border' of @{text x} and @{text z}. This string is @{text "(x - x') @ z\<^isub>a"}. |
1221 `border' of @{text x} and @{text z}. This string is @{text "(x - x') @ z\<^isub>a"}. |
1222 |
1222 |
1223 In order to show that @{text "x @ z \<in> A\<star>"} implies @{text "y @ z \<in> A\<star>"}, we use |
1223 In order to show that @{term "x @ z \<in> A\<star>"} implies @{term "y @ z \<in> A\<star>"}, we use |
1224 the following tagging-function: |
1224 the following tagging-function: |
1225 % |
1225 % |
1226 \begin{center} |
1226 \begin{center} |
1227 @{thm tag_str_STAR_def[where ?L1.0="A", THEN meta_eq_app]}\smallskip |
1227 @{thm tag_str_STAR_def[where ?L1.0="A", THEN meta_eq_app]}\smallskip |
1228 \end{center} |
1228 \end{center} |
1238 \end{center} |
1238 \end{center} |
1239 % |
1239 % |
1240 \noindent |
1240 \noindent |
1241 We first need to consider the case that @{text x} is the empty string. |
1241 We first need to consider the case that @{text x} is the empty string. |
1242 From the assumption we can infer @{text y} is the empty string and |
1242 From the assumption we can infer @{text y} is the empty string and |
1243 clearly have @{text "y @ z \<in> A\<star>"}. In case @{text x} is not the empty |
1243 clearly have @{term "y @ z \<in> A\<star>"}. In case @{text x} is not the empty |
1244 string, we can divide the string @{text "x @ z"} as shown in the picture |
1244 string, we can divide the string @{text "x @ z"} as shown in the picture |
1245 above. By the tagging function we have |
1245 above. By the tagging-function we have |
1246 % |
1246 % |
1247 \begin{center} |
1247 \begin{center} |
1248 @{term "\<approx>A `` {(x - x'\<^isub>m\<^isub>a\<^isub>x)} \<in> ({\<approx>A `` {x - x'} |x'. x' < x \<and> x' \<in> A\<star>})"} |
1248 @{term "\<approx>A `` {(x - x'\<^isub>m\<^isub>a\<^isub>x)} \<in> ({\<approx>A `` {x - x'} |x'. x' < x \<and> x' \<in> A\<star>})"} |
1249 \end{center} |
1249 \end{center} |
1250 % |
1250 % |
1254 \begin{center} |
1254 \begin{center} |
1255 @{term "\<approx>A `` {(x - x'\<^isub>m\<^isub>a\<^isub>x)} \<in> ({\<approx>A `` {y - y'} |y'. y' < y \<and> y' \<in> A\<star>})"} |
1255 @{term "\<approx>A `` {(x - x'\<^isub>m\<^isub>a\<^isub>x)} \<in> ({\<approx>A `` {y - y'} |y'. y' < y \<and> y' \<in> A\<star>})"} |
1256 \end{center} |
1256 \end{center} |
1257 % |
1257 % |
1258 \noindent |
1258 \noindent |
1259 and we know that we have a @{text "y' \<in> A\<star>"} and @{text "y' < y"} |
1259 and we know that we have a @{term "y' \<in> A\<star>"} and @{text "y' < y"} |
1260 and also know @{term "(x - x'\<^isub>m\<^isub>a\<^isub>x) \<approx>A (y - y')"}. Unfolding the Myhill-Nerode |
1260 and also know @{term "(x - x'\<^isub>m\<^isub>a\<^isub>x) \<approx>A (y - y')"}. Unfolding the Myhill-Nerode |
1261 relation we know @{term "(y - y') @ z\<^isub>a \<in> A"}. We also know that @{text "z\<^isub>b \<in> A\<star>"}. |
1261 relation we know @{term "(y - y') @ z\<^isub>a \<in> A"}. We also know that @{term "z\<^isub>b \<in> A\<star>"}. |
1262 Therefore, finally, @{text "y' @ ((y - y') @ z\<^isub>a) @ z\<^isub>b \<in> A\<star>"}, which means |
1262 Therefore @{term "y' @ ((y - y') @ z\<^isub>a) @ z\<^isub>b \<in> A\<star>"}, which means |
1263 @{term "y @ z \<in> A\<star>"}. As the last step we have to set @{text "A"} to @{text "L r"} and |
1263 @{term "y @ z \<in> A\<star>"}. As the last step we have to set @{text "A"} to @{term "L r"} and |
1264 complete the proof.\qed |
1264 complete the proof.\qed |
1265 \end{proof} |
1265 \end{proof} |
1266 *} |
1266 *} |
1267 |
1267 |
1268 |
1268 |