249 \end{equation} |
250 \end{equation} |
250 |
251 |
251 \noindent |
252 \noindent |
252 changes the type---the disjoint union is not a set, but a set of |
253 changes the type---the disjoint union is not a set, but a set of |
253 pairs. Using this definition for disjoint union means we do not have a |
254 pairs. Using this definition for disjoint union means we do not have a |
254 single type for automata. As a result we will not be able to define a |
255 single type for the states of automata. As a result we will not be able to define a |
255 regular language as one for which there exists an automaton that recognises |
256 regular language as one for which there exists an automaton that recognises |
256 all its strings (Definition~\ref{baddef}). This is because we cannot make a |
257 all its strings (Definition~\ref{baddef}). This is because we cannot make a |
257 definition in HOL that is polymorphic in the state type and there is no type |
258 definition in HOL that is only polymorphic in the state type and there is no type |
258 quantification available in HOL (unlike in Coq, for example).\footnote{Slind |
259 quantification available in HOL (unlike in Coq, for example).\footnote{Slind |
259 already pointed out this problem in an email to the HOL4 mailing list on |
260 already pointed out this problem in an email to the HOL4 mailing list on |
260 21st April 2005.} |
261 21st April 2005.} |
261 |
262 |
262 |
263 An alternative, which provides us with a single type for states of automata, |
263 An alternative, which provides us with a single type for automata, is to give every |
264 is to give every state node an identity, for example a natural number, and |
264 state node an identity, for example a natural |
265 then be careful to rename these identities apart whenever connecting two |
265 number, and then be careful to rename these identities apart whenever |
266 automata. This results in clunky proofs establishing that properties are |
266 connecting two automata. This results in clunky proofs |
267 invariant under renaming. Similarly, connecting two automata represented as |
267 establishing that properties are invariant under renaming. Similarly, |
268 matrices results in very adhoc constructions, which are not pleasant to |
268 connecting two automata represented as matrices results in very adhoc |
269 reason about. |
269 constructions, which are not pleasant to reason about. |
|
270 |
270 |
271 Functions are much better supported in Isabelle/HOL, but they still lead to similar |
271 Functions are much better supported in Isabelle/HOL, but they still lead to similar |
272 problems as with graphs. Composing, for example, two non-deterministic automata in parallel |
272 problems as with graphs. Composing, for example, two non-deterministic automata in parallel |
273 requires also the formalisation of disjoint unions. Nipkow \cite{Nipkow98} |
273 requires also the formalisation of disjoint unions. Nipkow \cite{Nipkow98} |
274 dismisses for this the option of using identities, because it leads according to |
274 dismisses for this the option of using identities, because it leads according to |
465 Using this property we can show that @{term "B \<cdot> (A \<up> n) \<subseteq> X"} holds for |
465 Using this property we can show that @{term "B \<cdot> (A \<up> n) \<subseteq> X"} holds for |
466 all @{text n}. From this we can infer @{term "B \<cdot> A\<star> \<subseteq> X"} using the definition |
466 all @{text n}. From this we can infer @{term "B \<cdot> A\<star> \<subseteq> X"} using the definition |
467 of @{text "\<star>"}. |
467 of @{text "\<star>"}. |
468 For the inclusion in the other direction we assume a string @{text s} |
468 For the inclusion in the other direction we assume a string @{text s} |
469 with length @{text k} is an element in @{text X}. Since @{thm (prem 1) arden} |
469 with length @{text k} is an element in @{text X}. Since @{thm (prem 1) arden} |
470 we know by Prop.~\ref{langprops}@{text "(ii)"} that |
470 we know by Property~\ref{langprops}@{text "(ii)"} that |
471 @{term "s \<notin> X \<cdot> (A \<up> Suc k)"} since its length is only @{text k} |
471 @{term "s \<notin> X \<cdot> (A \<up> Suc k)"} since its length is only @{text k} |
472 (the strings in @{term "X \<cdot> (A \<up> Suc k)"} are all longer). |
472 (the strings in @{term "X \<cdot> (A \<up> Suc k)"} are all longer). |
473 From @{text "(*)"} it follows then that |
473 From @{text "(*)"} it follows then that |
474 @{term s} must be an element in @{term "(\<Union>m\<in>{0..k}. B \<cdot> (A \<up> m))"}. This in turn |
474 @{term s} must be an element in @{term "(\<Union>m\<in>{0..k}. B \<cdot> (A \<up> m))"}. This in turn |
475 implies that @{term s} is in @{term "(\<Union>n. B \<cdot> (A \<up> n))"}. Using Prop.~\ref{langprops}@{text "(iii)"} |
475 implies that @{term s} is in @{term "(\<Union>n. B \<cdot> (A \<up> n))"}. Using Property~\ref{langprops}@{text "(iii)"} |
476 this is equal to @{term "B \<cdot> A\<star>"}, as we needed to show. |
476 this is equal to @{term "B \<cdot> A\<star>"}, as we needed to show. |
477 \end{proof} |
477 \end{proof} |
478 |
478 |
479 \noindent |
479 \noindent |
480 Regular expressions are defined as the inductive datatype |
480 Regular expressions are defined as the inductive datatype |
889 \begin{center} |
889 \begin{center} |
890 @{thm Solve_def} |
890 @{thm Solve_def} |
891 \end{center} |
891 \end{center} |
892 |
892 |
893 \noindent |
893 \noindent |
894 We are not concerned here with the definition of this operator |
894 We are not concerned here with the definition of this operator (see |
895 (see Berghofer and Nipkow \cite{BerghoferNipkow00}), but note that we eliminate |
895 Berghofer and Nipkow \cite{BerghoferNipkow00} for example), but note that we |
896 in each @{const Iter}-step a single equation, and therefore |
896 eliminate in each @{const Iter}-step a single equation, and therefore have a |
897 have a well-founded termination order by taking the cardinality |
897 well-founded termination order by taking the cardinality of the equational |
898 of the equational system @{text ES}. This enables us to prove |
898 system @{text ES}. This enables us to prove properties about our definition |
899 properties about our definition of @{const Solve} when we `call' it with |
899 of @{const Solve} when we `call' it with the equivalence class @{text X} and |
900 the equivalence class @{text X} and the initial equational system |
900 the initial equational system @{term "Init (UNIV // \<approx>A)"} from |
901 @{term "Init (UNIV // \<approx>A)"} from |
|
902 \eqref{initcs} using the principle: |
901 \eqref{initcs} using the principle: |
903 % |
902 |
904 \begin{equation}\label{whileprinciple} |
903 \begin{equation}\label{whileprinciple} |
905 \mbox{\begin{tabular}{l} |
904 \mbox{\begin{tabular}{l} |
906 @{term "invariant (Init (UNIV // \<approx>A))"} \\ |
905 @{term "invariant (Init (UNIV // \<approx>A))"} \\ |
907 @{term "\<forall>ES. invariant ES \<and> Cond ES \<longrightarrow> invariant (Iter X ES)"}\\ |
906 @{term "\<forall>ES. invariant ES \<and> Cond ES \<longrightarrow> invariant (Iter X ES)"}\\ |
908 @{term "\<forall>ES. invariant ES \<and> Cond ES \<longrightarrow> card (Iter X ES) < card ES"}\\ |
907 @{term "\<forall>ES. invariant ES \<and> Cond ES \<longrightarrow> card (Iter X ES) < card ES"}\\ |
1030 and @{term "invariant {(X, rhs)}"}. |
1029 and @{term "invariant {(X, rhs)}"}. |
1031 \end{lmm} |
1030 \end{lmm} |
1032 |
1031 |
1033 \begin{proof} |
1032 \begin{proof} |
1034 In order to prove this lemma using \eqref{whileprinciple}, we have to use a slightly |
1033 In order to prove this lemma using \eqref{whileprinciple}, we have to use a slightly |
1035 stronger invariant since Lem.~\ref{iterone} and \ref{itertwo} have the precondition |
1034 stronger invariant since Lemma~\ref{iterone} and \ref{itertwo} have the precondition |
1036 that @{term "(X, rhs) \<in> ES"} for some @{text rhs}. This precondition is needed |
1035 that @{term "(X, rhs) \<in> ES"} for some @{text rhs}. This precondition is needed |
1037 in order to choose in the @{const Iter}-step an equation that is not \mbox{@{term "X = rhs"}}. |
1036 in order to choose in the @{const Iter}-step an equation that is not \mbox{@{term "X = rhs"}}. |
1038 Therefore our invariant cannot be just @{term "invariant ES"}, but must be |
1037 Therefore our invariant cannot be just @{term "invariant ES"}, but must be |
1039 @{term "invariant ES \<and> (\<exists>rhs. (X, rhs) \<in> ES)"}. By assumption |
1038 @{term "invariant ES \<and> (\<exists>rhs. (X, rhs) \<in> ES)"}. By assumption |
1040 @{thm (prem 2) Solve} and Lem.~\ref{invzero}, the more general invariant holds for |
1039 @{thm (prem 2) Solve} and Lemma~\ref{invzero}, the more general invariant holds for |
1041 the initial equational system. This is premise 1 of~\eqref{whileprinciple}. |
1040 the initial equational system. This is premise 1 of~\eqref{whileprinciple}. |
1042 Premise 2 is given by Lem.~\ref{iterone} and the fact that @{const Iter} might |
1041 Premise 2 is given by Lemma~\ref{iterone} and the fact that @{const Iter} might |
1043 modify the @{text rhs} in the equation @{term "X = rhs"}, but does not remove it. |
1042 modify the @{text rhs} in the equation @{term "X = rhs"}, but does not remove it. |
1044 Premise 3 of~\eqref{whileprinciple} is by Lem.~\ref{itertwo}. Now in premise 4 |
1043 Premise 3 of~\eqref{whileprinciple} is by Lemma~\ref{itertwo}. Now in premise 4 |
1045 we like to show that there exists a @{text rhs} such that @{term "ES = {(X, rhs)}"} |
1044 we like to show that there exists a @{text rhs} such that @{term "ES = {(X, rhs)}"} |
1046 and that @{text "invariant {(X, rhs)}"} holds, provided the condition @{text "Cond"} |
1045 and that @{text "invariant {(X, rhs)}"} holds, provided the condition @{text "Cond"} |
1047 does not holds. By the stronger invariant we know there exists such a @{text "rhs"} |
1046 does not holds. By the stronger invariant we know there exists such a @{text "rhs"} |
1048 with @{term "(X, rhs) \<in> ES"}. Because @{text Cond} is not true, we know the cardinality |
1047 with @{term "(X, rhs) \<in> ES"}. Because @{text Cond} is not true, we know the cardinality |
1049 of @{text ES} is @{text 1}. This means @{text "ES"} must actually be the set @{text "{(X, rhs)}"}, |
1048 of @{text ES} is @{text 1}. This means @{text "ES"} must actually be the set @{text "{(X, rhs)}"}, |
1064 By the preceding lemma, we know that there exists a @{text "rhs"} such |
1063 By the preceding lemma, we know that there exists a @{text "rhs"} such |
1065 that @{term "Solve X (Init (UNIV // \<approx>A))"} returns the equation @{text "X = rhs"}, |
1064 that @{term "Solve X (Init (UNIV // \<approx>A))"} returns the equation @{text "X = rhs"}, |
1066 and that the invariant holds for this equation. That means we |
1065 and that the invariant holds for this equation. That means we |
1067 know @{text "X = \<Union>\<calL> ` rhs"}. We further know that |
1066 know @{text "X = \<Union>\<calL> ` rhs"}. We further know that |
1068 this is equal to \mbox{@{text "\<Union>\<calL> ` (Arden X rhs)"}} using the properties of the |
1067 this is equal to \mbox{@{text "\<Union>\<calL> ` (Arden X rhs)"}} using the properties of the |
1069 invariant and Lem.~\ref{ardenable}. Using the validity property for the equation @{text "X = rhs"}, |
1068 invariant and Lemma~\ref{ardenable}. Using the validity property for the equation @{text "X = rhs"}, |
1070 we can infer that @{term "rhss rhs \<subseteq> {X}"} and because the @{text Arden} operation |
1069 we can infer that @{term "rhss rhs \<subseteq> {X}"} and because the @{text Arden} operation |
1071 removes that @{text X} from @{text rhs}, that @{term "rhss (Arden X rhs) = {}"}. |
1070 removes that @{text X} from @{text rhs}, that @{term "rhss (Arden X rhs) = {}"}. |
1072 This means the right-hand side @{term "Arden X rhs"} can only consist of terms of the form @{term "Lam r"}. |
1071 This means the right-hand side @{term "Arden X rhs"} can only consist of terms of the form @{term "Lam r"}. |
1073 So we can collect those (finitely many) regular expressions @{text rs} and have @{term "X = lang (\<Uplus>rs)"}. |
1072 So we can collect those (finitely many) regular expressions @{text rs} and have @{term "X = lang (\<Uplus>rs)"}. |
1074 With this we can conclude the proof. |
1073 With this we can conclude the proof. |
1075 \end{proof} |
1074 \end{proof} |
1076 |
1075 |
1077 \noindent |
1076 \noindent |
1078 Lem.~\ref{every_eqcl_has_reg} allows us to finally give a proof for the first direction |
1077 Lemma~\ref{every_eqcl_has_reg} allows us to finally give a proof for the first direction |
1079 of the Myhill-Nerode theorem. |
1078 of the Myhill-Nerode theorem. |
1080 |
1079 |
1081 \begin{proof}[of Thm.~\ref{myhillnerodeone}] |
1080 \begin{proof}[Proof of Theorem~\ref{myhillnerodeone}] |
1082 By Lem.~\ref{every_eqcl_has_reg} we know that there exists a regular expression for |
1081 By Lemma~\ref{every_eqcl_has_reg} we know that there exists a regular expression for |
1083 every equivalence class in @{term "UNIV // \<approx>A"}. Since @{text "finals A"} is |
1082 every equivalence class in @{term "UNIV // \<approx>A"}. Since @{text "finals A"} is |
1084 a subset of @{term "UNIV // \<approx>A"}, we also know that for every equivalence class |
1083 a subset of @{term "UNIV // \<approx>A"}, we also know that for every equivalence class |
1085 in @{term "finals A"} there exists a regular expression. Moreover by assumption |
1084 in @{term "finals A"} there exists a regular expression. Moreover by assumption |
1086 we know that @{term "finals A"} must be finite, and therefore there must be a finite |
1085 we know that @{term "finals A"} must be finite, and therefore there must be a finite |
1087 set of regular expressions @{text "rs"} such that |
1086 set of regular expressions @{text "rs"} such that |
1223 We set in \eqref{finiteimageD}, @{text f} to be @{text "X \<mapsto> tag ` X"}. We have |
1221 We set in \eqref{finiteimageD}, @{text f} to be @{text "X \<mapsto> tag ` X"}. We have |
1224 @{text "range f"} to be a subset of @{term "Pow (range tag)"}, which we know must be |
1222 @{text "range f"} to be a subset of @{term "Pow (range tag)"}, which we know must be |
1225 finite by assumption. Now @{term "f (UNIV // =tag=)"} is a subset of @{text "range f"}, |
1223 finite by assumption. Now @{term "f (UNIV // =tag=)"} is a subset of @{text "range f"}, |
1226 and so also finite. Injectivity amounts to showing that @{text "X = Y"} under the |
1224 and so also finite. Injectivity amounts to showing that @{text "X = Y"} under the |
1227 assumptions that @{text "X, Y \<in> "}~@{term "UNIV // =tag="} and @{text "f X = f Y"}. |
1225 assumptions that @{text "X, Y \<in> "}~@{term "UNIV // =tag="} and @{text "f X = f Y"}. |
1228 From the assumptions we can obtain @{text "x \<in> X"} and @{text "y \<in> Y"} with |
1226 From the assumptions we obtain \mbox{@{text "x \<in> X"}} and @{text "y \<in> Y"} with |
1229 @{text "tag x = tag y"}. Since @{text x} and @{text y} are tag-related, this in |
1227 @{text "tag x = tag y"}. Since @{text x} and @{text y} are tag-related, this in |
1230 turn means that the equivalence classes @{text X} |
1228 turn means that the equivalence classes @{text X} |
1231 and @{text Y} must be equal. |
1229 and @{text Y} must be equal. Therefore \eqref{finiteimageD} allows us to conclude |
|
1230 with @{thm (concl) finite_eq_tag_rel}. |
1232 \end{proof} |
1231 \end{proof} |
1233 |
1232 |
1234 \begin{lmm}\label{fintwo} |
1233 \begin{lmm}\label{fintwo} |
1235 Given two equivalence relations @{text "R\<^isub>1"} and @{text "R\<^isub>2"}, whereby |
1234 Given two equivalence relations @{text "R\<^isub>1"} and @{text "R\<^isub>2"}, whereby |
1236 @{text "R\<^isub>1"} refines @{text "R\<^isub>2"}. |
1235 @{text "R\<^isub>1"} refines @{text "R\<^isub>2"}. |
1500 @{text "x\<^isub>p < x"} and the rest @{term "x\<^isub>s @ z \<in> A\<star>"}. For example the empty string |
1499 @{text "x\<^isub>p < x"} and the rest @{term "x\<^isub>s @ z \<in> A\<star>"}. For example the empty string |
1501 @{text "[]"} would do (recall @{term "x \<noteq> []"}). |
1500 @{text "[]"} would do (recall @{term "x \<noteq> []"}). |
1502 There are potentially many such prefixes, but there can only be finitely many of them (the |
1501 There are potentially many such prefixes, but there can only be finitely many of them (the |
1503 string @{text x} is finite). Let us therefore choose the longest one and call it |
1502 string @{text x} is finite). Let us therefore choose the longest one and call it |
1504 @{text "x\<^bsub>pmax\<^esub>"}. Now for the rest of the string @{text "x\<^isub>s @ z"} we |
1503 @{text "x\<^bsub>pmax\<^esub>"}. Now for the rest of the string @{text "x\<^isub>s @ z"} we |
1505 know it is in @{term "A\<star>"} and cannot be the empty string. By Prop.~\ref{langprops}@{text "(iv)"}, |
1504 know it is in @{term "A\<star>"} and cannot be the empty string. By Property~\ref{langprops}@{text "(iv)"}, |
1506 we can separate |
1505 we can separate |
1507 this string into two parts, say @{text "a"} and @{text "b"}, such that @{text "a \<noteq> []"}, @{text "a \<in> A"} |
1506 this string into two parts, say @{text "a"} and @{text "b"}, such that @{text "a \<noteq> []"}, @{text "a \<in> A"} |
1508 and @{term "b \<in> A\<star>"}. Now @{text a} must be strictly longer than @{text "x\<^isub>s"}, |
1507 and @{term "b \<in> A\<star>"}. Now @{text a} must be strictly longer than @{text "x\<^isub>s"}, |
1509 otherwise @{text "x\<^bsub>pmax\<^esub>"} is not the longest prefix. That means @{text a} |
1508 otherwise @{text "x\<^bsub>pmax\<^esub>"} is not the longest prefix. That means @{text a} |
1510 `overlaps' with @{text z}, splitting it into two components @{text "z\<^isub>a"} and |
1509 `overlaps' with @{text z}, splitting it into two components @{text "z\<^isub>a"} and |
1614 % & @{thm (rhs) Ders_simps(3)[where ?s1.0="s\<^isub>1" and ?s2.0="s\<^isub>2"]}\\ |
1613 % & @{thm (rhs) Ders_simps(3)[where ?s1.0="s\<^isub>1" and ?s2.0="s\<^isub>2"]}\\ |
1615 \end{tabular}} |
1614 \end{tabular}} |
1616 \end{equation} |
1615 \end{equation} |
1617 |
1616 |
1618 \noindent |
1617 \noindent |
1619 where @{text "\<Delta>"} in the fifth line is a function that tests whether the empty string |
1618 where @{text "\<Delta>"} in the fifth line is a function that tests whether the |
1620 is in the language and returns @{term "{[]}"} or @{term "{}"}, accordingly. |
1619 empty string is in the language and returns @{term "{[]}"} or @{term "{}"}, |
1621 The only interesting case is the last one where we use Prop.~\ref{langprops}@{text "(i)"} |
1620 accordingly. In the last equation we use the list-cons operator written |
1622 in order to infer that @{term "Der c (A\<star>) = Der c (A \<cdot> A\<star>)"}. We can |
1621 \mbox{@{text "_ :: _"}}. The only interesting case is the @{text "A\<star>"}-case |
1623 then complete the proof by noting that @{term "Delta A \<cdot> Der c (A\<star>) \<subseteq> (Der c A) \<cdot> A\<star>"}. |
1622 where we use Property~\ref{langprops}@{text "(i)"} in order to infer that |
1624 |
1623 @{term "Der c (A\<star>) = Der c (A \<cdot> A\<star>)"}. We can then complete the proof by |
1625 Brzozowski observed that the left-quotients for languages of regular |
1624 using the fifth equation and noting that @{term "Delta A \<cdot> Der c (A\<star>) \<subseteq> (Der |
1626 expressions can be calculated directly using the notion of \emph{derivatives |
1625 c A) \<cdot> A\<star>"}. |
1627 of a regular expression} \cite{Brzozowski64}. We define this notion in |
1626 |
1628 Isabelle/HOL as follows: |
1627 Brzozowski observed that the left-quotients for languages of |
|
1628 regular expressions can be calculated directly using the notion of |
|
1629 \emph{derivatives of a regular expression} \cite{Brzozowski64}. We define |
|
1630 this notion in Isabelle/HOL as follows: |
1629 |
1631 |
1630 \begin{center} |
1632 \begin{center} |
1631 \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}} |
1633 \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}} |
1632 @{thm (lhs) der.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) der.simps(1)}\\ |
1634 @{thm (lhs) der.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) der.simps(1)}\\ |
1633 @{thm (lhs) der.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) der.simps(2)}\\ |
1635 @{thm (lhs) der.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) der.simps(2)}\\ |
1927 operations. Closure under union, concatenation and Kleene-star are trivial |
1930 operations. Closure under union, concatenation and Kleene-star are trivial |
1928 to establish given our definition of regularity (recall Definition~\ref{regular}). |
1931 to establish given our definition of regularity (recall Definition~\ref{regular}). |
1929 More interesting is the closure under complement, because it seems difficult |
1932 More interesting is the closure under complement, because it seems difficult |
1930 to construct a regular expression for the complement language by direct |
1933 to construct a regular expression for the complement language by direct |
1931 means. However the existence of such a regular expression can now be easily |
1934 means. However the existence of such a regular expression can now be easily |
1932 proved using the Myhill-Nerode theorem since |
1935 proved using both parts of the Myhill-Nerode theorem, since |
1933 |
1936 |
1934 \begin{center} |
1937 \begin{center} |
1935 @{term "s\<^isub>1 \<approx>A s\<^isub>2"} if and only if @{term "s\<^isub>1 \<approx>(-A) s\<^isub>2"} |
1938 @{term "s\<^isub>1 \<approx>A s\<^isub>2"} if and only if @{term "s\<^isub>1 \<approx>(-A) s\<^isub>2"} |
1936 \end{center} |
1939 \end{center} |
1937 |
1940 |
1938 \noindent |
1941 \noindent |
1939 holds for any strings @{text "s\<^isub>1"} and @{text |
1942 holds for any strings @{text "s\<^isub>1"} and @{text |
1940 "s\<^isub>2"}. Therefore @{text A} and the complement language @{term "-A"} |
1943 "s\<^isub>2"}. Therefore @{text A} and the complement language @{term "-A"} |
1941 give rise to the same partitions. So if one is finite, the other is too, and |
1944 give rise to the same partitions. So if one is finite, the other is too, and |
1942 the other way around. Proving the existence of such a regular expression via |
1945 vice versa. Proving the existence of such a regular expression via |
1943 automata using the standard method would be quite involved. It includes the |
1946 automata using the standard method would be quite involved. It includes the |
1944 steps: regular expression @{text "\<Rightarrow>"} non-deterministic automaton @{text |
1947 steps: regular expression @{text "\<Rightarrow>"} non-deterministic automaton @{text |
1945 "\<Rightarrow>"} deterministic automaton @{text "\<Rightarrow>"} complement automaton @{text "\<Rightarrow>"} |
1948 "\<Rightarrow>"} deterministic automaton @{text "\<Rightarrow>"} complement automaton @{text "\<Rightarrow>"} |
1946 regular expression. Clearly not something you want to formalise in a theorem |
1949 regular expression. Clearly not something you want to formalise in a theorem |
1947 prover in which it is cumbersome to reason about automata. |
1950 prover in which it is cumbersome to reason about automata. |
2038 \noindent |
2041 \noindent |
2039 Having formalised this theorem means we pushed our point of view quite |
2042 Having formalised this theorem means we pushed our point of view quite |
2040 far. Using this theorem we can obviously prove when a language is \emph{not} |
2043 far. Using this theorem we can obviously prove when a language is \emph{not} |
2041 regular---by establishing that it has infinitely many equivalence classes |
2044 regular---by establishing that it has infinitely many equivalence classes |
2042 generated by the Myhill-Nerode relation (this is usually the purpose of the |
2045 generated by the Myhill-Nerode relation (this is usually the purpose of the |
2043 pumping lemma \cite{Kozen97}). We can also use it to establish the standard |
2046 Pumping Lemma \cite{Kozen97}). We can also use it to establish the standard |
2044 textbook results about closure properties of regular languages. Interesting |
2047 textbook results about closure properties of regular languages. Interesting |
2045 is the case of closure under complement, because it seems difficult to |
2048 is the case of closure under complement, because it seems difficult to |
2046 construct a regular expression for the complement language by direct |
2049 construct a regular expression for the complement language by direct |
2047 means. However the existence of such a regular expression can be easily |
2050 means. However the existence of such a regular expression can be easily |
2048 proved using the Myhill-Nerode theorem. |
2051 proved using the Myhill-Nerode theorem. |
2049 |
2052 |
2050 Our insistence on regular expressions for proving the Myhill-Nerode theorem |
2053 Our insistence on regular expressions for proving the Myhill-Nerode theorem |
2051 arose from the limitations of HOL on which the popular theorem provers HOL4, |
2054 arose from the limitations of HOL, on which the popular theorem provers HOL4, |
2052 HOLlight and Isabelle/HOL are based. In order to guarantee consistency, |
2055 HOLlight and Isabelle/HOL are based. In order to guarantee consistency, |
2053 formalisations can only extend HOL by definitions that introduce a notion in |
2056 formalisations can only extend HOL by definitions that introduce a new concept in |
2054 terms of already existing concepts. A convenient definition for automata |
2057 terms of already existing notions. A convenient definition for automata |
2055 (based on graphs) uses a polymorphic type for the state nodes. This allows |
2058 (based on graphs) uses a polymorphic type for the state nodes. This allows |
2056 us to use the standard operation of disjoint union in order to compose two |
2059 us to use the standard operation of disjoint union in order to compose two |
2057 automata. Unfortunately, we cannot use such a polymorphic definition of |
2060 automata. Unfortunately, we cannot use such a polymorphic definition |
2058 in HOL as part of the definition for regularity of a language (a |
2061 in HOL as part of the definition for regularity of a language (a |
2059 set of strings). Consider the following attempt |
2062 set of strings). Consider the following attempt |
2060 |
2063 |
2061 \begin{center} |
2064 \begin{center} |
2062 @{text "is_regular A \<equiv> \<exists>M(\<alpha>). is_finite_automata (M) \<and> \<calL>(M) = A"} |
2065 @{text "is_regular A \<equiv> \<exists>M(\<alpha>). is_finite_automata (M) \<and> \<calL>(M) = A"} |
2063 \end{center} |
2066 \end{center} |
2064 |
2067 |
2065 \noindent |
2068 \noindent |
2066 which means the definiens is polymorphic in the type of the automata @{text |
2069 which means the definiens is polymorphic in the type of the automata @{text |
2067 "M"}, but the definiendum @{text "is_regular"} is not. Such definitions are |
2070 "M"}, but the definiendum @{text "is_regular"} is not. Such definitions are |
2068 excluded in HOL, because they lead easily to inconsistencies (see |
2071 excluded in HOL, because they can lead easily to inconsistencies (see |
2069 \cite{PittsHOL4} for a simple example). Also HOL does not contain |
2072 \cite{PittsHOL4} for a simple example). Also HOL does not contain |
2070 type-quantifiers which would allow us to get rid of the polymorphism by |
2073 type-quantifiers which would allow us to get rid of the polymorphism by |
2071 quantifying over the type-variable @{text "\<alpha>"}. Therefore when defining |
2074 quantifying over the type-variable @{text "\<alpha>"}. Therefore when defining |
2072 regularity in terms of automata, the only natural way out in HOL is to use state |
2075 regularity in terms of automata, the only natural way out in HOL is to use |
2073 nodes with an identity, for example a natural number. Unfortunatly, the |
2076 state nodes with an identity, for example a natural number. Unfortunatly, |
2074 consequence is that we have to be careful when combining two automata so |
2077 the consequence is that we have to be careful when combining two automata so |
2075 that there is no clash between two such states. This makes formalisations |
2078 that there is no clash between two such states. This makes formalisations |
2076 quite fiddly and unpleasant. Regular expressions proved much more convenient |
2079 quite fiddly and rather unpleasant. Regular expressions proved much more |
2077 for reasoning in HOL and we showed they can be used for establishing the |
2080 convenient for reasoning in HOL and we showed they can be used for |
2078 Myhill-Nerode theorem. |
2081 establishing the central result in regular language theory: the Myhill-Nerode |
|
2082 theorem. |
2079 |
2083 |
2080 While regular expressions are convenient, they have some limitations. One is |
2084 While regular expressions are convenient, they have some limitations. One is |
2081 that there seems to be no method of calculating a minimal regular expression |
2085 that there seems to be no method of calculating a minimal regular expression |
2082 (for example in terms of length) for a regular language, like there is for |
2086 (for example in terms of length) for a regular language, like there is for |
2083 automata. On the other hand, efficient regular expression matching, without |
2087 automata. On the other hand, efficient regular expression matching, without |
2084 using automata, poses no problem \cite{OwensReppyTuron09}. For an |
2088 using automata, poses no problem \cite{OwensReppyTuron09}. For an |
2085 implementation of a simple regular expression matcher, whose correctness has |
2089 implementation of a simple regular expression matcher, whose correctness has |
2086 been formally established, we refer the reader to Owens and Slind |
2090 been formally established, we refer the reader to Owens and Slind |
2087 \cite{OwensSlind08}. |
2091 \cite{OwensSlind08}. |
2088 |
|
2089 Our formalisation consists of 780 lines of Isabelle/Isar code for the first |
|
2090 direction and 460 for the second, plus around 300 lines of standard material |
|
2091 about regular languages. The formalisation about derivatives and partial |
|
2092 derivatives shown in Section~\ref{derivatives} consists of 390 lines of |
|
2093 code. The algorithm for solving equational systems, which we used in the |
|
2094 first direction, is conceptually not that complicated. Still the use of sets |
|
2095 over which the algorithm operates, means it is not as easy to formalise as |
|
2096 one might wish. It seems sets cannot be avoided since the `input' of the |
|
2097 algorithm consists of equivalence classes and we cannot see how to |
|
2098 reformulate the theory so that we can use lists, which are usually easier to |
|
2099 reason about in a theorem prover. |
|
2100 |
|
2101 While our formalisation might be seen large, it should be seen |
|
2102 in the context of the work done by Constable at al \cite{Constable00} who |
|
2103 formalised the Myhill-Nerode theorem in Nuprl using automata. They write |
|
2104 that their four-member team needed something on the magnitude of 18 months |
|
2105 for their formalisation. The estimate for our formalisation is that we |
|
2106 needed approximately 3 months and this included the time to find our proof |
|
2107 arguments. Unlike Constable et al, who were able to follow the proofs from |
|
2108 \cite{HopcroftUllman69}, we had to find our own arguments. So for us the |
|
2109 formalisation was not the bottleneck. It is hard to gauge the size of a |
|
2110 formalisation in Nurpl, but from what is shown in the Nuprl Math Library |
|
2111 about their development it seems substantially larger than ours. The code of |
|
2112 ours can be found in the Mercurial Repository at |
|
2113 \mbox{\url{http://www4.in.tum.de/~urbanc/regexp.html}}. |
|
2114 |
2092 |
2115 Our proof of the first direction is very much inspired by \emph{Brzozowski's |
2093 Our proof of the first direction is very much inspired by \emph{Brzozowski's |
2116 algebraic method} used to convert a finite automaton to a regular expression |
2094 algebraic method} used to convert a finite automaton to a regular expression |
2117 \cite{Brzozowski64}. The close connection can be seen by considering the |
2095 \cite{Brzozowski64}. The close connection can be seen by considering the |
2118 equivalence classes as the states of the minimal automaton for the regular |
2096 equivalence classes as the states of the minimal automaton for the regular |
2128 in the literature about this way of proving the first direction of the |
2106 in the literature about this way of proving the first direction of the |
2129 Myhill-Nerode theorem, but it appears to be folklore. |
2107 Myhill-Nerode theorem, but it appears to be folklore. |
2130 |
2108 |
2131 We presented two proofs for the second direction of the Myhill-Nerode |
2109 We presented two proofs for the second direction of the Myhill-Nerode |
2132 theorem. One direct proof using tagging-functions and another using partial |
2110 theorem. One direct proof using tagging-functions and another using partial |
2133 derivatives. These proofs is where our method shines, because we can |
2111 derivatives. This part of our work is where our method using regular |
2134 completely side-step the standard argument \cite{Kozen97} where automata |
2112 expressions shines, because we can completely side-step the standard |
2135 need to be composed. However, it is also the direction where we had to spend |
2113 argument \cite{Kozen97} where automata need to be composed. However, it is |
2136 most of the `conceptual' time, as our first proof based on |
2114 also the direction where we had to spend most of the `conceptual' time, as |
2137 tagging-functions is new for establishing the Myhill-Nerode theorem. All |
2115 our first proof based on tagging-functions is new for establishing the |
2138 standard proofs of this direction proceed by arguments over automata. |
2116 Myhill-Nerode theorem. All standard proofs of this direction proceed by |
2139 |
2117 arguments over automata. |
2140 Our indirect proof for the second direction arose from the interest in |
2118 |
|
2119 The indirect proof for the second direction arose from our interest in |
2141 Brzozowski's derivatives for regular expression matching. A corresponding |
2120 Brzozowski's derivatives for regular expression matching. A corresponding |
2142 regular expression matcher has been formalised in HOL4 in |
2121 regular expression matcher has been formalised by Owens and Slind in HOL4 |
2143 \cite{OwensSlind08}. In our opinion, this formalisation is considerably |
2122 \cite{OwensSlind08}. In our opinion, their formalisation is considerably |
2144 slicker than for example the approach to regular expression matching taken |
2123 slicker than for example the approach to regular expression matching taken |
2145 in \cite{Harper99} and \cite{Yi06}. While Brzozowski's derivatives lead to |
2124 in \cite{Harper99} and \cite{Yi06}. While Brzozowski's derivatives lead to a |
2146 simple regular expression matchers and he proved that there are only |
2125 simple regular expression matcher and he established that there are only |
2147 finitely many dissimilar derivatives for every regular expression, this |
2126 finitely many dissimilar derivatives for every regular expression, this |
2148 result is not as straightforward to formalise in a theorem prover. The |
2127 result is not as straightforward to formalise in a theorem prover. The |
2149 reason is that the set of dissimilar derivatives is not defined inductively, |
2128 reason is that the set of dissimilar derivatives is not defined inductively, |
2150 but in terms of an ACI-equivalence relation. |
2129 but in terms of an ACI-equivalence relation. This difficulty prevented for |
2151 |
2130 example Krauss and Nipkow to prove termination of their equivalence checker |
2152 |
2131 for regular expressions \cite{KraussNipkow11}. Their checker is based on |
2153 |
2132 derivatives and for their argument the lack of a formal proof of termination |
2154 \medskip |
2133 is not crucial (it merely lets them ``sleep better'' \cite{KraussNipkow11}). |
2155 |
2134 We expect that their development simplifies by using partial derivatives, |
2156 We expect that the development of Krauss \& Nipkow gets easier by |
2135 instead of derivatives, and that termination of the algorithm can be |
2157 using partial derivatives.\medskip |
2136 formally established. However, since partial derivatives use sets of regular |
|
2137 expressions, one needs to carefully analyse whether the resulting algorithm |
|
2138 is still executable. Given the existing infrastructure for executable sets |
|
2139 in Isabelle/HOL, it should. |
|
2140 |
|
2141 Our formalisation of the Myhill-Nerode theorem consists of 780 lines of |
|
2142 Isabelle/Isar code for the first direction and 460 for the second (the one |
|
2143 based on tagging functions), plus around 300 lines of standard material |
|
2144 about regular languages. The formalisation about derivatives and partial |
|
2145 derivatives shown in Section~\ref{derivatives} consists of 390 lines of |
|
2146 code. The algorithm for solving equational systems, which we used in the |
|
2147 first direction, is conceptually relatively simple. Still the use of sets |
|
2148 over which the algorithm operates means it is not as easy to formalise as |
|
2149 one might hope. However, it seems sets cannot be avoided since the `input' |
|
2150 of the algorithm consists of equivalence classes and we cannot see how to |
|
2151 reformulate the theory so that we can use lists. Lists would be much easier |
|
2152 to reason about, since we can define function over them by recursion. For |
|
2153 sets we have to use set-comprehensions. |
|
2154 |
|
2155 While our formalisation might be seen large, it should be seen |
|
2156 in the context of the work done by Constable at al \cite{Constable00} who |
|
2157 formalised the Myhill-Nerode theorem in Nuprl using automata. They write |
|
2158 that their four-member team needed something on the magnitude of 18 months |
|
2159 for their formalisation. The estimate for our formalisation is that we |
|
2160 needed approximately 3 months and this included the time to find our proof |
|
2161 arguments. Unlike Constable et al, who were able to follow the proofs from |
|
2162 \cite{HopcroftUllman69}, we had to find our own arguments. So for us the |
|
2163 formalisation was not the bottleneck. It is hard to gauge the size of a |
|
2164 formalisation in Nurpl, but from what is shown in the Nuprl Math Library |
|
2165 about their development it seems substantially larger than ours. The code of |
|
2166 ours can be found in the Mercurial Repository at |
|
2167 \mbox{\url{http://www4.in.tum.de/~urbanc/regexp.html}}.\medskip |
2158 |
2168 |
2159 \noindent |
2169 \noindent |
2160 {\bf Acknowledgements:} |
2170 {\bf Acknowledgements:} |
2161 We are grateful for the comments we received from Larry |
2171 We are grateful for the comments we received from Larry |
2162 Paulson. |
2172 Paulson. |