74 In case of graphs and matrices, this means we have to build our own |
74 In case of graphs and matrices, this means we have to build our own |
75 reasoning infrastructure for them, as neither Isabelle/HOL nor HOL4 nor |
75 reasoning infrastructure for them, as neither Isabelle/HOL nor HOL4 nor |
76 HOLlight support them with libraries. Even worse, reasoning about graphs and |
76 HOLlight support them with libraries. Even worse, reasoning about graphs and |
77 matrices can be a real hassle in HOL-based theorem provers. Consider for |
77 matrices can be a real hassle in HOL-based theorem provers. Consider for |
78 example the operation of sequencing two automata, say $A_1$ and $A_2$, by |
78 example the operation of sequencing two automata, say $A_1$ and $A_2$, by |
79 connecting the accepting states of $A_1$ to the initial state of $A_2$: |
79 connecting the accepting states of $A_1$ to the initial state of $A_2$:\\[-5.5mm] |
80 |
80 % |
81 \begin{center} |
81 \begin{center} |
82 \begin{tabular}{ccc} |
82 \begin{tabular}{ccc} |
83 \begin{tikzpicture}[scale=0.8] |
83 \begin{tikzpicture}[scale=0.8] |
84 %\draw[step=2mm] (-1,-1) grid (1,1); |
84 %\draw[step=2mm] (-1,-1) grid (1,1); |
85 |
85 |
197 bit strings in the context of Presburger arithmetic. |
197 bit strings in the context of Presburger arithmetic. |
198 The only larger formalisations of automata theory |
198 The only larger formalisations of automata theory |
199 are carried out in Nuprl \cite{Constable00} and in Coq \cite{Filliatre97}. |
199 are carried out in Nuprl \cite{Constable00} and in Coq \cite{Filliatre97}. |
200 |
200 |
201 In this paper, we will not attempt to formalise automata theory in |
201 In this paper, we will not attempt to formalise automata theory in |
202 Isabelle/HOL, but take a completely different approach to regular |
202 Isabelle/HOL, but take a different approach to regular |
203 languages. Instead of defining a regular language as one where there exists |
203 languages. Instead of defining a regular language as one where there exists |
204 an automaton that recognises all strings of the language, we define a |
204 an automaton that recognises all strings of the language, we define a |
205 regular language as: |
205 regular language as: |
206 |
206 |
207 \begin{definition} |
207 \begin{definition} |
223 complementation, for regular languages.\smallskip |
223 complementation, for regular languages.\smallskip |
224 |
224 |
225 \noindent |
225 \noindent |
226 {\bf Contributions:} |
226 {\bf Contributions:} |
227 There is an extensive literature on regular languages. |
227 There is an extensive literature on regular languages. |
228 To our knowledge, our proof of the Myhill-Nerode theorem is the |
228 To our best knowledge, our proof of the Myhill-Nerode theorem is the |
229 first that is based on regular expressions, only. We prove the part of this theorem |
229 first that is based on regular expressions, only. We prove the part of this theorem |
230 stating that a regular expression has only finitely many partitions using certain |
230 stating that a regular expression has only finitely many partitions using certain |
231 tagging-functions. Again to our best knowledge, these tagging-functions have |
231 tagging-functions. Again to our best knowledge, these tagging-functions have |
232 not been used before to establish the Myhill-Nerode theorem. |
232 not been used before to establish the Myhill-Nerode theorem. |
233 *} |
233 *} |
463 \end{center} |
463 \end{center} |
464 |
464 |
465 \noindent |
465 \noindent |
466 where the terms @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"} |
466 where the terms @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"} |
467 stand for all transitions @{term "Y\<^isub>i\<^isub>j \<Turnstile>c\<^isub>i\<^isub>j\<Rightarrow> |
467 stand for all transitions @{term "Y\<^isub>i\<^isub>j \<Turnstile>c\<^isub>i\<^isub>j\<Rightarrow> |
468 X\<^isub>i"}. If viewed as an automaton, then every equation @{text "X\<^isub>i = rhs\<^isub>i"} in this system |
468 X\<^isub>i"}. |
469 corresponds roughly to a state whose name is @{text X\<^isub>i} and its predecessor states |
469 %The intuition behind the equational system is that every |
470 are the @{text "Y\<^isub>i\<^isub>j"}; the @{text "c\<^isub>i\<^isub>j"} are the labels of the transitions from these |
470 %equation @{text "X\<^isub>i = rhs\<^isub>i"} in this system |
471 predecessor states to @{text X\<^isub>i}. In our initial equation system there can only be |
471 %corresponds roughly to a state of an automaton whose name is @{text X\<^isub>i} and its predecessor states |
|
472 %are the @{text "Y\<^isub>i\<^isub>j"}; the @{text "c\<^isub>i\<^isub>j"} are the labels of the transitions from these |
|
473 %predecessor states to @{text X\<^isub>i}. |
|
474 There can only be |
472 finitely many terms of the form @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"} in a right-hand side |
475 finitely many terms of the form @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"} in a right-hand side |
473 since by assumption there are only finitely many |
476 since by assumption there are only finitely many |
474 equivalence classes and only finitely many characters. The term @{text |
477 equivalence classes and only finitely many characters. |
475 "\<lambda>(EMPTY)"} in the first equation acts as a marker for the equivalence class |
478 The term @{text "\<lambda>(EMPTY)"} in the first equation acts as a marker for the initial state, that |
|
479 is the equivalence class |
476 containing @{text "[]"}.\footnote{Note that we mark, roughly speaking, the |
480 containing @{text "[]"}.\footnote{Note that we mark, roughly speaking, the |
477 single `initial' state in the equational system, which is different from |
481 single `initial' state in the equational system, which is different from |
478 the method by Brzozowski \cite{Brzozowski64}, where he marks the |
482 the method by Brzozowski \cite{Brzozowski64}, where he marks the |
479 `terminal' states. We are forced to set up the equational system in our |
483 `terminal' states. We are forced to set up the equational system in our |
480 way, because the Myhill-Nerode relation determines the `direction' of the |
484 way, because the Myhill-Nerode relation determines the `direction' of the |
481 transitions---the successor `state' of an equivalence class @{text Y} can |
485 transitions---the successor `state' of an equivalence class @{text Y} can |
482 be reached by adding a character to the end of @{text Y}. This is also the |
486 be reached by adding a character to the end of @{text Y}. This is also the |
483 reason why we have to use our reverse version of Arden's Lemma.} |
487 reason why we have to use our reverse version of Arden's Lemma.} |
|
488 %In our initial equation system there can only be |
|
489 %finitely many terms of the form @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"} in a right-hand side |
|
490 %since by assumption there are only finitely many |
|
491 %equivalence classes and only finitely many characters. |
484 Overloading the function @{text \<calL>} for the two kinds of terms in the |
492 Overloading the function @{text \<calL>} for the two kinds of terms in the |
485 equational system, we have |
493 equational system, we have |
486 |
494 |
487 \begin{center} |
495 \begin{center} |
488 @{text "\<calL>(Y, r) \<equiv>"} % |
496 @{text "\<calL>(Y, r) \<equiv>"} % |
499 |
507 |
500 \noindent |
508 \noindent |
501 hold. Similarly for @{text "X\<^isub>1"} we can show the following equation |
509 hold. Similarly for @{text "X\<^isub>1"} we can show the following equation |
502 % |
510 % |
503 \begin{equation}\label{inv2} |
511 \begin{equation}\label{inv2} |
504 @{text "X\<^isub>1 = \<calL>(Y\<^isub>i\<^isub>1, CHAR c\<^isub>i\<^isub>1) \<union> \<dots> \<union> \<calL>(Y\<^isub>i\<^isub>p, CHAR c\<^isub>i\<^isub>p) \<union> \<calL>(\<lambda>(EMPTY))"}. |
512 @{text "X\<^isub>1 = \<calL>(Y\<^isub>1\<^isub>1, CHAR c\<^isub>1\<^isub>1) \<union> \<dots> \<union> \<calL>(Y\<^isub>1\<^isub>p, CHAR c\<^isub>1\<^isub>p) \<union> \<calL>(\<lambda>(EMPTY))"}. |
505 \end{equation} |
513 \end{equation} |
506 |
514 |
507 \noindent |
515 \noindent |
508 The reason for adding the @{text \<lambda>}-marker to our initial equational system is |
516 The reason for adding the @{text \<lambda>}-marker to our initial equational system is |
509 to obtain this equation: it only holds with the marker, since none of |
517 to obtain this equation: it only holds with the marker, since none of |
852 every equivalence class in @{term "UNIV // \<approx>A"}. Since @{text "finals A"} is |
860 every equivalence class in @{term "UNIV // \<approx>A"}. Since @{text "finals A"} is |
853 a subset of @{term "UNIV // \<approx>A"}, we also know that for every equivalence class |
861 a subset of @{term "UNIV // \<approx>A"}, we also know that for every equivalence class |
854 in @{term "finals A"} there exists a regular expression. Moreover by assumption |
862 in @{term "finals A"} there exists a regular expression. Moreover by assumption |
855 we know that @{term "finals A"} must be finite, and therefore there must be a finite |
863 we know that @{term "finals A"} must be finite, and therefore there must be a finite |
856 set of regular expressions @{text "rs"} such that |
864 set of regular expressions @{text "rs"} such that |
857 |
865 @{term "\<Union>(finals A) = L (\<Uplus>rs)"}. |
858 \begin{center} |
|
859 @{term "\<Union>(finals A) = L (\<Uplus>rs)"} |
|
860 \end{center} |
|
861 |
|
862 \noindent |
|
863 Since the left-hand side is equal to @{text A}, we can use @{term "\<Uplus>rs"} |
866 Since the left-hand side is equal to @{text A}, we can use @{term "\<Uplus>rs"} |
864 as the regular expression that is needed in the theorem.\qed |
867 as the regular expression that is needed in the theorem.\qed |
865 \end{proof} |
868 \end{proof} |
866 *} |
869 *} |
867 |
870 |
1039 % |
1042 % |
1040 \noindent |
1043 \noindent |
1041 and \emph{string subtraction}: |
1044 and \emph{string subtraction}: |
1042 % |
1045 % |
1043 \begin{center} |
1046 \begin{center} |
1044 \begin{tabular}{r@ {\hspace{1mm}}c@ {\hspace{1mm}}l} |
1047 @{text "[] - y \<equiv> []"}\hspace{10mm} |
1045 @{text "[] - y"} & @{text "\<equiv>"} & @{text "[]"}\\ |
1048 @{text "x - [] \<equiv> x"}\hspace{10mm} |
1046 @{text "x - []"} & @{text "\<equiv>"} & @{text x}\\ |
1049 @{text "cx - dy \<equiv> if c = d then x - y else cx"} |
1047 @{text "cx - dy"} & @{text "\<equiv>"} & @{text "if c = d then x - y else cx"}\\ |
|
1048 \end{tabular} |
|
1049 \end{center} |
1050 \end{center} |
1050 % |
1051 % |
1051 \noindent |
1052 \noindent |
1052 where @{text c} and @{text d} are characters, and @{text x} and @{text y} are strings. |
1053 where @{text c} and @{text d} are characters, and @{text x} and @{text y} are strings. |
1053 |
1054 |
1054 Now assuming @{term "x @ z \<in> A ;; B"} there are only two possible ways of how to `split' |
1055 Now assuming @{term "x @ z \<in> A ;; B"} there are only two possible ways of how to `split' |
1055 this string to be in @{term "A ;; B"}: |
1056 this string to be in @{term "A ;; B"}: |
1056 % |
1057 % |
1057 \begin{center} |
1058 \begin{center} |
|
1059 \begin{tabular}{@ {}c@ {\hspace{10mm}}c@ {}} |
1058 \scalebox{0.7}{ |
1060 \scalebox{0.7}{ |
1059 \begin{tikzpicture} |
1061 \begin{tikzpicture} |
1060 \node[draw,minimum height=3.8ex] (xa) { $\hspace{4em}@{text "x'"}\hspace{4em}$ }; |
1062 \node[draw,minimum height=3.8ex] (xa) { $\hspace{3em}@{text "x'"}\hspace{3em}$ }; |
1061 \node[draw,minimum height=3.8ex, right=-0.03em of xa] (xxa) { $\hspace{0.5em}@{text "x - x'"}\hspace{0.5em}$ }; |
1063 \node[draw,minimum height=3.8ex, right=-0.03em of xa] (xxa) { $\hspace{0.2em}@{text "x - x'"}\hspace{0.2em}$ }; |
1062 \node[draw,minimum height=3.8ex, right=-0.03em of xxa] (z) { $\hspace{10.1em}@{text z}\hspace{10.1em}$ }; |
1064 \node[draw,minimum height=3.8ex, right=-0.03em of xxa] (z) { $\hspace{5em}@{text z}\hspace{5em}$ }; |
1063 |
1065 |
1064 \draw[decoration={brace,transform={yscale=3}},decorate] |
1066 \draw[decoration={brace,transform={yscale=3}},decorate] |
1065 (xa.north west) -- ($(xxa.north east)+(0em,0em)$) |
1067 (xa.north west) -- ($(xxa.north east)+(0em,0em)$) |
1066 node[midway, above=0.5em]{@{text x}}; |
1068 node[midway, above=0.5em]{@{text x}}; |
1067 |
1069 |
1079 |
1081 |
1080 \draw[decoration={brace,transform={yscale=3}},decorate] |
1082 \draw[decoration={brace,transform={yscale=3}},decorate] |
1081 ($(xa.south east)+(0em,0ex)$) -- ($(xa.south west)+(0em,0ex)$) |
1083 ($(xa.south east)+(0em,0ex)$) -- ($(xa.south west)+(0em,0ex)$) |
1082 node[midway, below=0.5em]{@{term "x' \<in> A"}}; |
1084 node[midway, below=0.5em]{@{term "x' \<in> A"}}; |
1083 \end{tikzpicture}} |
1085 \end{tikzpicture}} |
1084 |
1086 & |
1085 \scalebox{0.7}{ |
1087 \scalebox{0.7}{ |
1086 \begin{tikzpicture} |
1088 \begin{tikzpicture} |
1087 \node[draw,minimum height=3.8ex] (x) { $\hspace{6.5em}@{text x}\hspace{6.5em}$ }; |
1089 \node[draw,minimum height=3.8ex] (x) { $\hspace{4.8em}@{text x}\hspace{4.8em}$ }; |
1088 \node[draw,minimum height=3.8ex, right=-0.03em of x] (za) { $\hspace{2em}@{text "z'"}\hspace{2em}$ }; |
1090 \node[draw,minimum height=3.8ex, right=-0.03em of x] (za) { $\hspace{0.6em}@{text "z'"}\hspace{0.6em}$ }; |
1089 \node[draw,minimum height=3.8ex, right=-0.03em of za] (zza) { $\hspace{6.1em}@{text "z - z'"}\hspace{6.1em}$ }; |
1091 \node[draw,minimum height=3.8ex, right=-0.03em of za] (zza) { $\hspace{2.6em}@{text "z - z'"}\hspace{2.6em}$ }; |
1090 |
1092 |
1091 \draw[decoration={brace,transform={yscale=3}},decorate] |
1093 \draw[decoration={brace,transform={yscale=3}},decorate] |
1092 (x.north west) -- ($(za.north west)+(0em,0em)$) |
1094 (x.north west) -- ($(za.north west)+(0em,0em)$) |
1093 node[midway, above=0.5em]{@{text x}}; |
1095 node[midway, above=0.5em]{@{text x}}; |
1094 |
1096 |
1106 |
1108 |
1107 \draw[decoration={brace,transform={yscale=3}},decorate] |
1109 \draw[decoration={brace,transform={yscale=3}},decorate] |
1108 ($(zza.south east)+(0em,0ex)$) -- ($(za.south east)+(0em,0ex)$) |
1110 ($(zza.south east)+(0em,0ex)$) -- ($(za.south east)+(0em,0ex)$) |
1109 node[midway, below=0.5em]{@{text "(z - z') \<in> B"}}; |
1111 node[midway, below=0.5em]{@{text "(z - z') \<in> B"}}; |
1110 \end{tikzpicture}} |
1112 \end{tikzpicture}} |
|
1113 \end{tabular} |
1111 \end{center} |
1114 \end{center} |
1112 % |
1115 % |
1113 \noindent |
1116 \noindent |
1114 Either there is a prefix of @{text x} in @{text A} and the rest is in @{text B} (first picture), |
1117 Either there is a prefix of @{text x} in @{text A} and the rest is in @{text B} (first picture), |
1115 or @{text x} and a prefix of @{text "z"} is in @{text A} and the rest in @{text B} (second picture). |
1118 or @{text x} and a prefix of @{text "z"} is in @{text A} and the rest in @{text B} (second picture). |
1305 \end{center} |
1308 \end{center} |
1306 % |
1309 % |
1307 \noindent |
1310 \noindent |
1308 holds for any strings @{text "s\<^isub>1"} and @{text |
1311 holds for any strings @{text "s\<^isub>1"} and @{text |
1309 "s\<^isub>2"}. Therefore @{text A} and the complement language @{term "-A"} give rise to the same |
1312 "s\<^isub>2"}. Therefore @{text A} and the complement language @{term "-A"} give rise to the same |
1310 partitions. Proving the existence of such a regular expression via automata would |
1313 partitions. Proving the existence of such a regular expression via automata |
|
1314 using the standard method would |
1311 be quite involved. It includes the |
1315 be quite involved. It includes the |
1312 steps: regular expression @{text "\<Rightarrow>"} non-deterministic automaton @{text |
1316 steps: regular expression @{text "\<Rightarrow>"} non-deterministic automaton @{text |
1313 "\<Rightarrow>"} deterministic automaton @{text "\<Rightarrow>"} complement automaton @{text "\<Rightarrow>"} |
1317 "\<Rightarrow>"} deterministic automaton @{text "\<Rightarrow>"} complement automaton @{text "\<Rightarrow>"} |
1314 regular expression. |
1318 regular expression. |
1315 |
1319 |
1357 Lemma. |
1361 Lemma. |
1358 |
1362 |
1359 We briefly considered using the method Brzozowski presented in the Appendix |
1363 We briefly considered using the method Brzozowski presented in the Appendix |
1360 of~\cite{Brzozowski64} in order to prove the second direction of the |
1364 of~\cite{Brzozowski64} in order to prove the second direction of the |
1361 Myhill-Nerode theorem. There he calculates the derivatives for regular |
1365 Myhill-Nerode theorem. There he calculates the derivatives for regular |
1362 expressions and shows that there can be only finitely many of them with |
1366 expressions and shows that for every language there can be only |
1363 respect to a language (if regarded equal modulo ACI). We could |
1367 finitely many of them %derivations |
1364 have used as the tag of a string @{text s} the set of derivatives of a regular expression |
1368 (if regarded equal modulo ACI). We could |
1365 generated by a language. Using the fact that two strings are |
1369 have used as tagging-function the set of derivatives of a regular expression |
|
1370 with respect to a language. Using the fact that two strings are |
1366 Myhill-Nerode related whenever their derivative is the same, together with |
1371 Myhill-Nerode related whenever their derivative is the same, together with |
1367 the fact that there are only finitely such derivatives |
1372 the fact that there are only finitely such derivatives |
1368 would give us a similar argument as ours. However it seems not so easy to |
1373 would give us a similar argument as ours. However it seems not so easy to |
1369 calculate the set of derivatives modulo ACI and then to count them. Therefore we preferred our |
1374 calculate the set of derivatives modulo ACI. Therefore we preferred our |
1370 direct method of using tagging-functions. This |
1375 direct method of using tagging-functions. This |
1371 is also where our method shines, because we can completely side-step the |
1376 is also where our method shines, because we can completely side-step the |
1372 standard argument \cite{Kozen97} where automata need to be composed, which |
1377 standard argument \cite{Kozen97} where automata need to be composed, which |
1373 as stated in the Introduction is not so convenient to formalise in a |
1378 as stated in the Introduction is not so easy to formalise in a |
1374 HOL-based theorem prover. However, it is also the direction where we had to |
1379 HOL-based theorem prover. However, it is also the direction where we had to |
1375 spend most of the `conceptual' time, as our proof-argument based on tagging-functions |
1380 spend most of the `conceptual' time, as our proof-argument based on tagging-functions |
1376 is new for establishing the Myhill-Nerode theorem. All standard proofs |
1381 is new for establishing the Myhill-Nerode theorem. All standard proofs |
1377 of this direction proceed by arguments over automata. |
1382 of this direction use %proceed by |
|
1383 arguments over automata.\\[-6mm]%\medskip |
|
1384 % |
|
1385 %\noindent |
|
1386 %{\bf Acknowledgements:} We are grateful for the comments we received from Larry |
|
1387 %Paulson and the referees of the paper. |
1378 |
1388 |
1379 *} |
1389 *} |
1380 |
1390 |
1381 |
1391 |
1382 (*<*) |
1392 (*<*) |