Paper/Paper.thy
changeset 159 990c12ab1562
parent 157 10d2d0cbe381
child 160 ea2e5acbfe4a
equal deleted inserted replaced
158:3c9129f49846 159:990c12ab1562
    74   In case of graphs and matrices, this means we have to build our own
    74   In case of graphs and matrices, this means we have to build our own
    75   reasoning infrastructure for them, as neither Isabelle/HOL nor HOL4 nor
    75   reasoning infrastructure for them, as neither Isabelle/HOL nor HOL4 nor
    76   HOLlight support them with libraries. Even worse, reasoning about graphs and
    76   HOLlight support them with libraries. Even worse, reasoning about graphs and
    77   matrices can be a real hassle in HOL-based theorem provers.  Consider for
    77   matrices can be a real hassle in HOL-based theorem provers.  Consider for
    78   example the operation of sequencing two automata, say $A_1$ and $A_2$, by
    78   example the operation of sequencing two automata, say $A_1$ and $A_2$, by
    79   connecting the accepting states of $A_1$ to the initial state of $A_2$:  
    79   connecting the accepting states of $A_1$ to the initial state of $A_2$:\\[-5.5mm]  
    80   
    80   %  
    81   \begin{center}
    81   \begin{center}
    82   \begin{tabular}{ccc}
    82   \begin{tabular}{ccc}
    83   \begin{tikzpicture}[scale=0.8]
    83   \begin{tikzpicture}[scale=0.8]
    84   %\draw[step=2mm] (-1,-1) grid (1,1);
    84   %\draw[step=2mm] (-1,-1) grid (1,1);
    85   
    85   
   197   bit strings in the context of Presburger arithmetic.
   197   bit strings in the context of Presburger arithmetic.
   198   The only larger formalisations of automata theory 
   198   The only larger formalisations of automata theory 
   199   are carried out in Nuprl \cite{Constable00} and in Coq \cite{Filliatre97}.
   199   are carried out in Nuprl \cite{Constable00} and in Coq \cite{Filliatre97}.
   200   
   200   
   201   In this paper, we will not attempt to formalise automata theory in
   201   In this paper, we will not attempt to formalise automata theory in
   202   Isabelle/HOL, but take a completely different approach to regular
   202   Isabelle/HOL, but take a different approach to regular
   203   languages. Instead of defining a regular language as one where there exists
   203   languages. Instead of defining a regular language as one where there exists
   204   an automaton that recognises all strings of the language, we define a
   204   an automaton that recognises all strings of the language, we define a
   205   regular language as:
   205   regular language as:
   206 
   206 
   207   \begin{definition}
   207   \begin{definition}
   223   complementation, for regular languages.\smallskip
   223   complementation, for regular languages.\smallskip
   224   
   224   
   225   \noindent
   225   \noindent
   226   {\bf Contributions:} 
   226   {\bf Contributions:} 
   227   There is an extensive literature on regular languages.
   227   There is an extensive literature on regular languages.
   228   To our knowledge, our proof of the Myhill-Nerode theorem is the
   228   To our best knowledge, our proof of the Myhill-Nerode theorem is the
   229   first that is based on regular expressions, only. We prove the part of this theorem 
   229   first that is based on regular expressions, only. We prove the part of this theorem 
   230   stating that a regular expression has only finitely many partitions using certain 
   230   stating that a regular expression has only finitely many partitions using certain 
   231   tagging-functions. Again to our best knowledge, these tagging-functions have
   231   tagging-functions. Again to our best knowledge, these tagging-functions have
   232   not been used before to establish the Myhill-Nerode theorem.
   232   not been used before to establish the Myhill-Nerode theorem.
   233 *}
   233 *}
   463   \end{center}
   463   \end{center}
   464 
   464 
   465   \noindent
   465   \noindent
   466   where the terms @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"}
   466   where the terms @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"}
   467   stand for all transitions @{term "Y\<^isub>i\<^isub>j \<Turnstile>c\<^isub>i\<^isub>j\<Rightarrow>
   467   stand for all transitions @{term "Y\<^isub>i\<^isub>j \<Turnstile>c\<^isub>i\<^isub>j\<Rightarrow>
   468   X\<^isub>i"}. If viewed as an automaton, then every equation @{text "X\<^isub>i = rhs\<^isub>i"} in this system
   468   X\<^isub>i"}. 
   469   corresponds roughly to a state whose name is @{text X\<^isub>i} and its predecessor states 
   469   %The intuition behind the equational system is that every 
   470   are the @{text "Y\<^isub>i\<^isub>j"}; the @{text "c\<^isub>i\<^isub>j"} are the labels of the transitions from these 
   470   %equation @{text "X\<^isub>i = rhs\<^isub>i"} in this system
   471   predecessor states to @{text X\<^isub>i}. In our initial equation system there can only be
   471   %corresponds roughly to a state of an automaton whose name is @{text X\<^isub>i} and its predecessor states 
       
   472   %are the @{text "Y\<^isub>i\<^isub>j"}; the @{text "c\<^isub>i\<^isub>j"} are the labels of the transitions from these 
       
   473   %predecessor states to @{text X\<^isub>i}. 
       
   474   There can only be
   472   finitely many terms of the form @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"} in a right-hand side 
   475   finitely many terms of the form @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"} in a right-hand side 
   473   since by assumption there are only finitely many
   476   since by assumption there are only finitely many
   474   equivalence classes and only finitely many characters. The term @{text
   477   equivalence classes and only finitely many characters.
   475   "\<lambda>(EMPTY)"} in the first equation acts as a marker for the equivalence class
   478   The term @{text "\<lambda>(EMPTY)"} in the first equation acts as a marker for the initial state, that
       
   479   is the equivalence class
   476   containing @{text "[]"}.\footnote{Note that we mark, roughly speaking, the
   480   containing @{text "[]"}.\footnote{Note that we mark, roughly speaking, the
   477   single `initial' state in the equational system, which is different from
   481   single `initial' state in the equational system, which is different from
   478   the method by Brzozowski \cite{Brzozowski64}, where he marks the
   482   the method by Brzozowski \cite{Brzozowski64}, where he marks the
   479   `terminal' states. We are forced to set up the equational system in our
   483   `terminal' states. We are forced to set up the equational system in our
   480   way, because the Myhill-Nerode relation determines the `direction' of the
   484   way, because the Myhill-Nerode relation determines the `direction' of the
   481   transitions---the successor `state' of an equivalence class @{text Y} can
   485   transitions---the successor `state' of an equivalence class @{text Y} can
   482   be reached by adding a character to the end of @{text Y}. This is also the
   486   be reached by adding a character to the end of @{text Y}. This is also the
   483   reason why we have to use our reverse version of Arden's Lemma.}
   487   reason why we have to use our reverse version of Arden's Lemma.}
       
   488   %In our initial equation system there can only be
       
   489   %finitely many terms of the form @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"} in a right-hand side 
       
   490   %since by assumption there are only finitely many
       
   491   %equivalence classes and only finitely many characters. 
   484   Overloading the function @{text \<calL>} for the two kinds of terms in the
   492   Overloading the function @{text \<calL>} for the two kinds of terms in the
   485   equational system, we have
   493   equational system, we have
   486   
   494   
   487   \begin{center}
   495   \begin{center}
   488   @{text "\<calL>(Y, r) \<equiv>"} %
   496   @{text "\<calL>(Y, r) \<equiv>"} %
   499 
   507 
   500   \noindent
   508   \noindent
   501   hold. Similarly for @{text "X\<^isub>1"} we can show the following equation
   509   hold. Similarly for @{text "X\<^isub>1"} we can show the following equation
   502   %
   510   %
   503   \begin{equation}\label{inv2}
   511   \begin{equation}\label{inv2}
   504   @{text "X\<^isub>1 = \<calL>(Y\<^isub>i\<^isub>1, CHAR c\<^isub>i\<^isub>1) \<union> \<dots> \<union> \<calL>(Y\<^isub>i\<^isub>p, CHAR c\<^isub>i\<^isub>p) \<union> \<calL>(\<lambda>(EMPTY))"}.
   512   @{text "X\<^isub>1 = \<calL>(Y\<^isub>1\<^isub>1, CHAR c\<^isub>1\<^isub>1) \<union> \<dots> \<union> \<calL>(Y\<^isub>1\<^isub>p, CHAR c\<^isub>1\<^isub>p) \<union> \<calL>(\<lambda>(EMPTY))"}.
   505   \end{equation}
   513   \end{equation}
   506 
   514 
   507   \noindent
   515   \noindent
   508   The reason for adding the @{text \<lambda>}-marker to our initial equational system is 
   516   The reason for adding the @{text \<lambda>}-marker to our initial equational system is 
   509   to obtain this equation: it only holds with the marker, since none of 
   517   to obtain this equation: it only holds with the marker, since none of 
   651   \end{tabular}
   659   \end{tabular}
   652   \end{center}
   660   \end{center}
   653 
   661 
   654   \noindent
   662   \noindent
   655   The last definition we need applies @{term Iter} over and over until a condition 
   663   The last definition we need applies @{term Iter} over and over until a condition 
   656   @{text Cond} is \emph{not} satisfied anymore. The condition states that there
   664   @{text Cond} is \emph{not} satisfied anymore. This condition states that there
   657   are more than one equation left in the equational system @{text ES}. To solve
   665   are more than one equation left in the equational system @{text ES}. To solve
   658   an equational system we use Isabelle/HOL's @{text while}-operator as follows:
   666   an equational system we use Isabelle/HOL's @{text while}-operator as follows:
   659   
   667   
   660   \begin{center}
   668   \begin{center}
   661   @{thm Solve_def}
   669   @{thm Solve_def}
   852   every equivalence class in @{term "UNIV // \<approx>A"}. Since @{text "finals A"} is
   860   every equivalence class in @{term "UNIV // \<approx>A"}. Since @{text "finals A"} is
   853   a subset of  @{term "UNIV // \<approx>A"}, we also know that for every equivalence class
   861   a subset of  @{term "UNIV // \<approx>A"}, we also know that for every equivalence class
   854   in @{term "finals A"} there exists a regular expression. Moreover by assumption 
   862   in @{term "finals A"} there exists a regular expression. Moreover by assumption 
   855   we know that @{term "finals A"} must be finite, and therefore there must be a finite
   863   we know that @{term "finals A"} must be finite, and therefore there must be a finite
   856   set of regular expressions @{text "rs"} such that
   864   set of regular expressions @{text "rs"} such that
   857 
   865   @{term "\<Union>(finals A) = L (\<Uplus>rs)"}.
   858   \begin{center}
       
   859   @{term "\<Union>(finals A) = L (\<Uplus>rs)"}
       
   860   \end{center}
       
   861 
       
   862   \noindent
       
   863   Since the left-hand side is equal to @{text A}, we can use @{term "\<Uplus>rs"} 
   866   Since the left-hand side is equal to @{text A}, we can use @{term "\<Uplus>rs"} 
   864   as the regular expression that is needed in the theorem.\qed
   867   as the regular expression that is needed in the theorem.\qed
   865   \end{proof}
   868   \end{proof}
   866 *}
   869 *}
   867 
   870 
   915   We formally define the notion of a \emph{tagging-relation} as follows.
   918   We formally define the notion of a \emph{tagging-relation} as follows.
   916 
   919 
   917   \begin{definition}[Tagging-Relation] Given a tagging-function @{text tag}, then two strings @{text x}
   920   \begin{definition}[Tagging-Relation] Given a tagging-function @{text tag}, then two strings @{text x}
   918   and @{text y} are \emph{tag-related} provided
   921   and @{text y} are \emph{tag-related} provided
   919   \begin{center}
   922   \begin{center}
   920   @{text "x =tag= y \<equiv> tag x = tag y"}.
   923   @{text "x =tag= y \<equiv> tag x = tag y"}\;.
   921   \end{center}
   924   \end{center}
   922   \end{definition}
   925   \end{definition}
   923 
   926 
   924 
   927 
   925   In order to establish finiteness of a set @{text A}, we shall use the following powerful
   928   In order to establish finiteness of a set @{text A}, we shall use the following powerful
  1039   %
  1042   %
  1040   \noindent
  1043   \noindent
  1041   and \emph{string subtraction}:
  1044   and \emph{string subtraction}:
  1042   %
  1045   %
  1043   \begin{center}
  1046   \begin{center}
  1044   \begin{tabular}{r@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
  1047   @{text "[] - y \<equiv> []"}\hspace{10mm}
  1045   @{text "[] - y"} & @{text "\<equiv>"} & @{text "[]"}\\
  1048   @{text "x - [] \<equiv> x"}\hspace{10mm}
  1046   @{text "x - []"} & @{text "\<equiv>"} & @{text x}\\
  1049   @{text "cx - dy \<equiv> if c = d then x - y else cx"}
  1047   @{text "cx - dy"} & @{text "\<equiv>"} & @{text "if c = d then x - y else cx"}\\
       
  1048   \end{tabular}
       
  1049   \end{center}
  1050   \end{center}
  1050   %
  1051   %
  1051   \noindent
  1052   \noindent
  1052   where @{text c} and @{text d} are characters, and @{text x} and @{text y} are strings. 
  1053   where @{text c} and @{text d} are characters, and @{text x} and @{text y} are strings. 
  1053   
  1054   
  1054   Now assuming  @{term "x @ z \<in> A ;; B"} there are only two possible ways of how to `split' 
  1055   Now assuming  @{term "x @ z \<in> A ;; B"} there are only two possible ways of how to `split' 
  1055   this string to be in @{term "A ;; B"}:
  1056   this string to be in @{term "A ;; B"}:
  1056   %
  1057   %
  1057   \begin{center}
  1058   \begin{center}
       
  1059   \begin{tabular}{@ {}c@ {\hspace{10mm}}c@ {}}
  1058   \scalebox{0.7}{
  1060   \scalebox{0.7}{
  1059   \begin{tikzpicture}
  1061   \begin{tikzpicture}
  1060     \node[draw,minimum height=3.8ex] (xa) { $\hspace{4em}@{text "x'"}\hspace{4em}$ };
  1062     \node[draw,minimum height=3.8ex] (xa) { $\hspace{3em}@{text "x'"}\hspace{3em}$ };
  1061     \node[draw,minimum height=3.8ex, right=-0.03em of xa] (xxa) { $\hspace{0.5em}@{text "x - x'"}\hspace{0.5em}$ };
  1063     \node[draw,minimum height=3.8ex, right=-0.03em of xa] (xxa) { $\hspace{0.2em}@{text "x - x'"}\hspace{0.2em}$ };
  1062     \node[draw,minimum height=3.8ex, right=-0.03em of xxa] (z) { $\hspace{10.1em}@{text z}\hspace{10.1em}$ };
  1064     \node[draw,minimum height=3.8ex, right=-0.03em of xxa] (z) { $\hspace{5em}@{text z}\hspace{5em}$ };
  1063 
  1065 
  1064     \draw[decoration={brace,transform={yscale=3}},decorate]
  1066     \draw[decoration={brace,transform={yscale=3}},decorate]
  1065            (xa.north west) -- ($(xxa.north east)+(0em,0em)$)
  1067            (xa.north west) -- ($(xxa.north east)+(0em,0em)$)
  1066                node[midway, above=0.5em]{@{text x}};
  1068                node[midway, above=0.5em]{@{text x}};
  1067 
  1069 
  1079 
  1081 
  1080     \draw[decoration={brace,transform={yscale=3}},decorate]
  1082     \draw[decoration={brace,transform={yscale=3}},decorate]
  1081            ($(xa.south east)+(0em,0ex)$) -- ($(xa.south west)+(0em,0ex)$)
  1083            ($(xa.south east)+(0em,0ex)$) -- ($(xa.south west)+(0em,0ex)$)
  1082                node[midway, below=0.5em]{@{term "x' \<in> A"}};
  1084                node[midway, below=0.5em]{@{term "x' \<in> A"}};
  1083   \end{tikzpicture}}
  1085   \end{tikzpicture}}
  1084 
  1086   &
  1085   \scalebox{0.7}{
  1087   \scalebox{0.7}{
  1086   \begin{tikzpicture}
  1088   \begin{tikzpicture}
  1087     \node[draw,minimum height=3.8ex] (x) { $\hspace{6.5em}@{text x}\hspace{6.5em}$ };
  1089     \node[draw,minimum height=3.8ex] (x) { $\hspace{4.8em}@{text x}\hspace{4.8em}$ };
  1088     \node[draw,minimum height=3.8ex, right=-0.03em of x] (za) { $\hspace{2em}@{text "z'"}\hspace{2em}$ };
  1090     \node[draw,minimum height=3.8ex, right=-0.03em of x] (za) { $\hspace{0.6em}@{text "z'"}\hspace{0.6em}$ };
  1089     \node[draw,minimum height=3.8ex, right=-0.03em of za] (zza) { $\hspace{6.1em}@{text "z - z'"}\hspace{6.1em}$  };
  1091     \node[draw,minimum height=3.8ex, right=-0.03em of za] (zza) { $\hspace{2.6em}@{text "z - z'"}\hspace{2.6em}$  };
  1090 
  1092 
  1091     \draw[decoration={brace,transform={yscale=3}},decorate]
  1093     \draw[decoration={brace,transform={yscale=3}},decorate]
  1092            (x.north west) -- ($(za.north west)+(0em,0em)$)
  1094            (x.north west) -- ($(za.north west)+(0em,0em)$)
  1093                node[midway, above=0.5em]{@{text x}};
  1095                node[midway, above=0.5em]{@{text x}};
  1094 
  1096 
  1106 
  1108 
  1107     \draw[decoration={brace,transform={yscale=3}},decorate]
  1109     \draw[decoration={brace,transform={yscale=3}},decorate]
  1108            ($(zza.south east)+(0em,0ex)$) -- ($(za.south east)+(0em,0ex)$)
  1110            ($(zza.south east)+(0em,0ex)$) -- ($(za.south east)+(0em,0ex)$)
  1109                node[midway, below=0.5em]{@{text "(z - z') \<in> B"}};
  1111                node[midway, below=0.5em]{@{text "(z - z') \<in> B"}};
  1110   \end{tikzpicture}}
  1112   \end{tikzpicture}}
       
  1113   \end{tabular}
  1111   \end{center}
  1114   \end{center}
  1112   %
  1115   %
  1113   \noindent
  1116   \noindent
  1114   Either there is a prefix of @{text x} in @{text A} and the rest is in @{text B} (first picture),
  1117   Either there is a prefix of @{text x} in @{text A} and the rest is in @{text B} (first picture),
  1115   or @{text x} and a prefix of @{text "z"} is in @{text A} and the rest in @{text B} (second picture).
  1118   or @{text x} and a prefix of @{text "z"} is in @{text A} and the rest in @{text B} (second picture).
  1305   \end{center}
  1308   \end{center}
  1306   %
  1309   %
  1307   \noindent
  1310   \noindent
  1308   holds for any strings @{text "s\<^isub>1"} and @{text
  1311   holds for any strings @{text "s\<^isub>1"} and @{text
  1309   "s\<^isub>2"}. Therefore @{text A} and the complement language @{term "-A"} give rise to the same
  1312   "s\<^isub>2"}. Therefore @{text A} and the complement language @{term "-A"} give rise to the same
  1310   partitions.  Proving the existence of such a regular expression via automata would 
  1313   partitions.  Proving the existence of such a regular expression via automata 
       
  1314   using the standard method would 
  1311   be quite involved. It includes the
  1315   be quite involved. It includes the
  1312   steps: regular expression @{text "\<Rightarrow>"} non-deterministic automaton @{text
  1316   steps: regular expression @{text "\<Rightarrow>"} non-deterministic automaton @{text
  1313   "\<Rightarrow>"} deterministic automaton @{text "\<Rightarrow>"} complement automaton @{text "\<Rightarrow>"}
  1317   "\<Rightarrow>"} deterministic automaton @{text "\<Rightarrow>"} complement automaton @{text "\<Rightarrow>"}
  1314   regular expression.
  1318   regular expression.
  1315 
  1319 
  1357   Lemma.
  1361   Lemma.
  1358 
  1362 
  1359   We briefly considered using the method Brzozowski presented in the Appendix
  1363   We briefly considered using the method Brzozowski presented in the Appendix
  1360   of~\cite{Brzozowski64} in order to prove the second direction of the
  1364   of~\cite{Brzozowski64} in order to prove the second direction of the
  1361   Myhill-Nerode theorem. There he calculates the derivatives for regular
  1365   Myhill-Nerode theorem. There he calculates the derivatives for regular
  1362   expressions and shows that there can be only finitely many of them with 
  1366   expressions and shows that for every language there can be only 
  1363   respect to a language (if regarded equal modulo ACI). We could
  1367   finitely many of them %derivations
  1364   have used as the tag of a string @{text s} the set of derivatives of a regular expression
  1368   (if regarded equal modulo ACI). We could
  1365   generated by a language.  Using the fact that two strings are
  1369   have used as tagging-function the set of derivatives of a regular expression
       
  1370   with respect to a language.  Using the fact that two strings are
  1366   Myhill-Nerode related whenever their derivative is the same, together with
  1371   Myhill-Nerode related whenever their derivative is the same, together with
  1367   the fact that there are only finitely such derivatives
  1372   the fact that there are only finitely such derivatives
  1368   would give us a similar argument as ours. However it seems not so easy to
  1373   would give us a similar argument as ours. However it seems not so easy to
  1369   calculate the set of derivatives modulo ACI and then to count them. Therefore we preferred our
  1374   calculate the set of derivatives modulo ACI. Therefore we preferred our
  1370   direct method of using tagging-functions. This
  1375   direct method of using tagging-functions. This
  1371   is also where our method shines, because we can completely side-step the
  1376   is also where our method shines, because we can completely side-step the
  1372   standard argument \cite{Kozen97} where automata need to be composed, which
  1377   standard argument \cite{Kozen97} where automata need to be composed, which
  1373   as stated in the Introduction is not so convenient to formalise in a 
  1378   as stated in the Introduction is not so easy to formalise in a 
  1374   HOL-based theorem prover. However, it is also the direction where we had to 
  1379   HOL-based theorem prover. However, it is also the direction where we had to 
  1375   spend most of the `conceptual' time, as our proof-argument based on tagging-functions
  1380   spend most of the `conceptual' time, as our proof-argument based on tagging-functions
  1376   is new for establishing the Myhill-Nerode theorem. All standard proofs 
  1381   is new for establishing the Myhill-Nerode theorem. All standard proofs 
  1377   of this direction proceed by arguments over automata.
  1382   of this direction use %proceed by 
       
  1383   arguments over automata.\\[-6mm]%\medskip
       
  1384   %
       
  1385   %\noindent
       
  1386   %{\bf Acknowledgements:} We are grateful for the comments we received from Larry
       
  1387   %Paulson and the referees of the paper.
  1378 
  1388 
  1379 *}
  1389 *}
  1380 
  1390 
  1381 
  1391 
  1382 (*<*)
  1392 (*<*)