regexp: comparison Journal/Paper.thy

equal deleted inserted replaced

-:11c3c302fa2e
+:204856ef5573
 based on Church's Simple Theory of Types \cite{Church40}.  Although many
 mathematical concepts can be conveniently expressed in HOL, there are some
 limitations that hurt badly, if one attempts a simple-minded formalisation
 of regular languages in it.
-The typical approach to regular languages is to
+The typical approach to regular languages is to introduce finite
-introduce finite automata and then define everything in terms of them
+deterministic automata and then define everything in terms of them \cite{
-\cite{ HopcroftUllman69,Kozen97}.  For example, a regular language is
+HopcroftUllman69,Kozen97}.  For example, a regular language is normally
-normally defined as:
+defined as:
 \begin{dfntn}\label{baddef}
 A language @{text A} is \emph{regular}, provided there is a
 finite deterministic automaton that recognises all strings of @{text "A"}.
 \end{dfntn}
 \end{equation}
 \noindent
 changes the type---the disjoint union is not a set, but a set of
 pairs. Using this definition for disjoint union means we do not have a
-single type for the states of automata. As a result we will not be able to define a
+single type for the states of automata. As a result we will not be able to
-regular language as one for which there exists an automaton that recognises
+define in our fomalisation a regular language as one for which there exists
-all its strings (Definition~\ref{baddef}). This is because we cannot make a
+an automaton that recognises all its strings (Definition~\ref{baddef}). This
-definition in HOL that is only polymorphic in the state type and there is no type
+is because we cannot make a definition in HOL that is only polymorphic in
-quantification available in HOL (unlike in Coq, for example).\footnote{Slind
+the state type, but not in the predicate for regularity; and there is no
-already pointed out this problem in an email to the HOL4 mailing list on
+type quantification available in HOL (unlike in Coq, for
-21st April 2005.}
+example).\footnote{Slind already pointed out this problem in an email to the
+HOL4 mailing list on 21st April 2005.}
 An alternative, which provides us with a single type for states of automata,
 is to give every state node an identity, for example a natural number, and
 then be careful to rename these identities apart whenever connecting two
 automata. This results in clunky proofs establishing that properties are
 invariant under renaming. Similarly, connecting two automata represented as
 matrices results in very adhoc constructions, which are not pleasant to
 reason about.
-Functions are much better supported in Isabelle/HOL, but they still lead to similar
+Functions are much better supported in Isabelle/HOL, but they still lead to
-problems as with graphs.  Composing, for example, two non-deterministic automata in parallel
+similar problems as with graphs.  Composing, for example, two
-requires also the formalisation of disjoint unions. Nipkow \cite{Nipkow98}
+non-deterministic automata in parallel requires also the formalisation of
-dismisses for this the option of using identities, because it leads according to
+disjoint unions. Nipkow \cite{Nipkow98} dismisses for this the option of
-him to ``messy proofs''. Since he does not need to define what regular
+using identities, because it leads according to him to ``messy
-languages are, Nipkow opts for a variant of \eqref{disjointunion} using bit lists, but writes
+proofs''. Since he does not need to define what regular languages are,
+Nipkow opts for a variant of \eqref{disjointunion} using bit lists, but
+writes\smallskip
 \begin{quote}
 \it%
 \begin{tabular}{@ {}l@ {}p{0.88\textwidth}@ {}}
 `` & All lemmas appear obvious given a picture of the composition of automata\ldots
 Yet their proofs require a painful amount of detail.''
 \end{tabular}
-\end{quote}
+\end{quote}\smallskip
 \noindent
-and
+and\smallskip
 \begin{quote}
 \it%
 \begin{tabular}{@ {}l@ {}p{0.88\textwidth}@ {}}
 `` & If the reader finds the above treatment in terms of bit lists revoltingly
 concrete, I cannot disagree. A more abstract approach is clearly desirable.''
 \end{tabular}
-\end{quote}
+\end{quote}\smallskip
 \noindent
 Moreover, it is not so clear how to conveniently impose a finiteness
 condition upon functions in order to represent \emph{finite} automata. The
 this string to be in @{term "A \<cdot> B"}:
 %
 \begin{center}
 \begin{tabular}{c}
 \scalebox{1}{
-\begin{tikzpicture}
+\begin{tikzpicture}[fill=gray!20]
-\node[draw,minimum height=3.8ex] (x) { $\hspace{4.8em}@{text x}\hspace{4.8em}$ };
+\node[draw,minimum height=3.8ex, fill] (x) { $\hspace{4.8em}@{text x}\hspace{4.8em}$ };
-\node[draw,minimum height=3.8ex, right=-0.03em of x] (za) { $\hspace{0.6em}@{text "z\<^isub>p"}\hspace{0.6em}$ };
+\node[draw,minimum height=3.8ex, right=-0.03em of x, fill] (za) { $\hspace{0.6em}@{text "z\<^isub>p"}\hspace{0.6em}$ };
-\node[draw,minimum height=3.8ex, right=-0.03em of za] (zza) { $\hspace{2.6em}@{text "z\<^isub>s"}\hspace{2.6em}$  };
+\node[draw,minimum height=3.8ex, right=-0.03em of za, fill] (zza) { $\hspace{2.6em}@{text "z\<^isub>s"}\hspace{2.6em}$  };
 \draw[decoration={brace,transform={yscale=3}},decorate]
 (x.north west) -- ($(za.north west)+(0em,0em)$)
 node[midway, above=0.5em]{@{text x}};
 ($(zza.south east)+(0em,0ex)$) -- ($(za.south east)+(0em,0ex)$)
 node[midway, below=0.5em]{@{text "z\<^isub>s \<in> B"}};
 \end{tikzpicture}}
 \\[2mm]
 \scalebox{1}{
-\begin{tikzpicture}
+\begin{tikzpicture}[fill=gray!20]
-\node[draw,minimum height=3.8ex] (xa) { $\hspace{3em}@{text "x\<^isub>p"}\hspace{3em}$ };
+\node[draw,minimum height=3.8ex, fill] (xa) { $\hspace{3em}@{text "x\<^isub>p"}\hspace{3em}$ };
-\node[draw,minimum height=3.8ex, right=-0.03em of xa] (xxa) { $\hspace{0.2em}@{text "x\<^isub>s"}\hspace{0.2em}$ };
+\node[draw,minimum height=3.8ex, right=-0.03em of xa, fill] (xxa) { $\hspace{0.2em}@{text "x\<^isub>s"}\hspace{0.2em}$ };
-\node[draw,minimum height=3.8ex, right=-0.03em of xxa] (z) { $\hspace{5em}@{text z}\hspace{5em}$ };
+\node[draw,minimum height=3.8ex, right=-0.03em of xxa, fill] (z) { $\hspace{5em}@{text z}\hspace{5em}$ };
 \draw[decoration={brace,transform={yscale=3}},decorate]
 (xa.north west) -- ($(xxa.north east)+(0em,0em)$)
 node[midway, above=0.5em]{@{text x}};
 When analysing the case of @{text "x @ z"} being an element in @{term "A\<star>"}
 and @{text x} is not the empty string, we have the following picture:
 \begin{center}
 \scalebox{1}{
-\begin{tikzpicture}
+\begin{tikzpicture}[fill=gray!20]
-\node[draw,minimum height=3.8ex] (xa) { $\hspace{4em}@{text "x\<^bsub>pmax\<^esub>"}\hspace{4em}$ };
+\node[draw,minimum height=3.8ex, fill] (xa) { $\hspace{4em}@{text "x\<^bsub>pmax\<^esub>"}\hspace{4em}$ };
-\node[draw,minimum height=3.8ex, right=-0.03em of xa] (xxa) { $\hspace{0.5em}@{text "x\<^bsub>s\<^esub>"}\hspace{0.5em}$ };
+\node[draw,minimum height=3.8ex, right=-0.03em of xa, fill] (xxa) { $\hspace{0.5em}@{text "x\<^bsub>s\<^esub>"}\hspace{0.5em}$ };
-\node[draw,minimum height=3.8ex, right=-0.03em of xxa] (za) { $\hspace{2em}@{text "z\<^isub>a"}\hspace{2em}$ };
+\node[draw,minimum height=3.8ex, right=-0.03em of xxa, fill] (za) { $\hspace{2em}@{text "z\<^isub>a"}\hspace{2em}$ };
-\node[draw,minimum height=3.8ex, right=-0.03em of za] (zb) { $\hspace{7em}@{text "z\<^isub>b"}\hspace{7em}$ };
+\node[draw,minimum height=3.8ex, right=-0.03em of za, fill] (zb) { $\hspace{7em}@{text "z\<^isub>b"}\hspace{7em}$ };
 \draw[decoration={brace,transform={yscale=3}},decorate]
 (xa.north west) -- ($(xxa.north east)+(0em,0em)$)
 node[midway, above=0.5em]{@{text x}};
 boolean function @{term "nullable r"} needed in the @{const Times}-case tests
 whether a regular expression can recognise the empty string. It can be defined as
 follows.
 \begin{center}
-\begin{tabular}{c@ {\hspace{10mm}}c}
+\begin{tabular}{c}
 \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}}
 @{thm (lhs) nullable.simps(1)}  & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(1)}\\
 @{thm (lhs) nullable.simps(2)}  & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(2)}\\
 @{thm (lhs) nullable.simps(3)}  & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(3)}\\
-\end{tabular} &
-\begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}}
 @{thm (lhs) nullable.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}
 & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\
 @{thm (lhs) nullable.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}
 & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\
 @{thm (lhs) nullable.simps(6)}  & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(6)}\\
 \end{equation}
 \noindent
 The importance of this fact in the context of the Myhill-Nerode theorem is that
 we can use \eqref{mhders} and \eqref{Dersders} in order to
-establish that @{term "x \<approx>(lang r) y"} is equivalent to
+establish that
-\mbox{@{term "lang (ders x r) = lang (ders y r)"}}. Hence
+\begin{center}
+@{term "x \<approx>(lang r) y"} \hspace{4mm}if and only if\hspace{4mm}
+@{term "lang (ders x r) = lang (ders y r)"}.
+\end{center}
+\noindent
+holds and hence
 \begin{equation}
 @{term "x \<approx>(lang r) y"}\hspace{4mm}\mbox{provided}\hspace{4mm}@{term "ders x r = ders y r"}
 \end{equation}
 \noindent
-which means the right-hand side (seen as a relation) refines the Myhill-Nerode
+This means the right-hand side (seen as a relation) refines the Myhill-Nerode
 relation.  Consequently, we can use @{text
 "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"} as a
 tagging-relation. However, in order to be useful for the second part of the
 Myhill-Nerode theorem, we have to be able to establish that for the
 corresponding language there are only finitely many derivatives---thus
 These two properties confirm the observation made earlier
 that by using sets, partial derivatives have the @{text "ACI"}-identities
 of derivatives already built in.
 Antimirov also proved that for every language and regular expression
-there are only finitely many partial derivatives, whereby the partial
+there are only finitely many partial derivatives, whereby the set of partial
 derivatives of @{text r} w.r.t.~a language @{text A} is defined as
 \begin{equation}\label{Pdersdef}
 @{thm pders_lang_def}
 \end{equation}
 For every language @{text A} and every regular expression @{text r},
 \mbox{@{thm finite_pders_lang}}.
 \end{thrm}
 \noindent
-Antimirov's proof first shows this theorem for the language @{term UNIV1},
+Antimirov's proof first establishes this theorem for the language @{term UNIV1},
 which is the set of all non-empty strings. For this he proves:
 \begin{equation}\label{pdersunivone}
 \mbox{\begin{tabular}{l}
 @{thm pders_lang_Zero}\\
 \noindent
 and for all languages @{text "A"}, @{term "pders_lang A r"} is a subset of
 @{term "pders_lang UNIV r"}.  Since we follow Antimirov's proof quite
 closely in our formalisation (only the last two cases of
-\eqref{pdersunivone} involve some non-routine induction argument), we omit
+\eqref{pdersunivone} involve some non-routine induction arguments), we omit
 the details.
-Let us now return to our proof about the second direction in the Myhill-Nerode
+Let us now return to our proof for the second direction in the Myhill-Nerode
 theorem. The point of the above calculations is to use
 @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"} as tagging-relation.
 \begin{proof}[Proof of Theorem~\ref{myhillnerodetwo} (second version)]
 @{thm pders_lang_def[where A="UNIV", simplified]}
 \end{center}
 \noindent
 Now the range of @{term "\<lambda>x. pders x r"} is a subset of @{term "Pow (pders_lang UNIV r)"},
-which we know is finite by Theorem~\ref{antimirov}. This means there
+which we know is finite by Theorem~\ref{antimirov}. Consequently there
 are only finitely many equivalence classes of @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"},
-which refines @{term "\<approx>(lang r)"}, and consequently we can again conclude the
+which refines @{term "\<approx>(lang r)"}, and therefore we can again conclude the
 second part of the Myhill-Nerode theorem.
 \end{proof}
 *}
-section {* Closure Properties of Regular Languages *}
+section {* Closure Properties of Regular Languages\label{closure} *}
 text {*
 \noindent
 The beauty of regular languages is that they are closed under many set
 operations. Closure under union, concatenation and Kleene-star are trivial
 @{term "A - B = - (- A \<union> B)"}
 \end{tabular}
 \end{center}
 \noindent
-Closure of regular languages under reversal, that is
+Since all finite languages are regular, then by closure under complement also
+all co-finite languages. Closure of regular languages under reversal, that is
 \begin{center}
 @{text "A\<^bsup>-1\<^esup> \<equiv> {s\<^bsup>-1\<^esup> | s \<in> A}"}
 \end{center}
 construct a regular expression for the complement language by direct
 means. However the existence of such a regular expression can be easily
 proved using the Myhill-Nerode theorem.
 Our insistence on regular expressions for proving the Myhill-Nerode theorem
-arose from the limitations of HOL, on which the popular theorem provers HOL4,
+arose from the limitations of HOL, used in the popular theorem provers HOL4,
-HOLlight and Isabelle/HOL are based. In order to guarantee consistency,
+HOLlight and Isabelle/HOL. In order to guarantee consistency,
-formalisations can extend HOL with definitions that introduce a new concept in
+formalisations in HOL can only extend the logic with definitions that introduce a new concept in
-only terms of already existing notions. A convenient definition for automata
+terms of already existing notions. A convenient definition for automata
 (based on graphs) uses a polymorphic type for the state nodes. This allows
 us to use the standard operation for disjoint union whenever we need to compose two
 automata. Unfortunately, we cannot use such a polymorphic definition
 in HOL as part of the definition for regularity of a language (a predicate
 over set of strings).  Consider the following attempt:
 \begin{center}
-@{text "is_regular A \<equiv> \<exists>M(\<alpha>). is_finite_automata (M) \<and> \<calL>(M) = A"}
+@{text "is_regular A \<equiv> \<exists>M(\<alpha>). is_dfa (M) \<and> \<calL>(M) = A"}
 \end{center}
 \noindent
 In this definifion, the definiens is polymorphic in the type of the automata
-@{text "M"} (indicated by the @{text "\<alpha>"}), but the definiendum @{text
+@{text "M"} (indicated by dependency on @{text "\<alpha>"}), but the definiendum
-"is_regular"} is not. Such definitions are excluded in HOL, because they can
+@{text "is_regular"} is not. Such definitions are excluded in HOL, because
-lead easily to inconsistencies (see \cite{PittsHOL4} for a simple
+they can lead easily to inconsistencies (see \cite{PittsHOL4} for a simple
 example). Also HOL does not contain type-quantifiers which would allow us to
 get rid of the polymorphism by quantifying over the type-variable @{text
 "\<alpha>"}. Therefore when defining regularity in terms of automata, the only
 natural way out in HOL is to resort to state nodes with an identity, for
 example a natural number. Unfortunatly, the consequence is that we have to
 be careful when combining two automata so that there is no clash between two
 such states. This makes formalisations quite fiddly and rather
 unpleasant. Regular expressions proved much more convenient for reasoning in
 HOL since they can be defined as inductive datatype and a reasoning
-infrastructure comes for free. We showed they can be used for establishing
+infrastructure comes for free. The definition of regularity in terms of
-the central result in regular language theory---the Myhill-Nerode theorem.
+regular expressions poses no problem at all for HOL.  We showed in this
+paper that they can be used for establishing the central result in regular
+language theory---the Myhill-Nerode theorem.
 While regular expressions are convenient, they have some limitations. One is
 that there seems to be no method of calculating a minimal regular expression
 (for example in terms of length) for a regular language, like there is for
 automata. On the other hand, efficient regular expression matching, without
 Theorem~\ref{antimirov}). However, since partial derivatives use sets of
 regular expressions, one needs to carefully analyse whether the resulting
 algorithm is still executable. Given the existing infrastructure for
 executable sets in Isabelle/HOL, it should.
 Our formalisation of the Myhill-Nerode theorem consists of 780 lines of
 Isabelle/Isar code for the first direction and 460 for the second (the one
-based on tagging functions), plus around 300 lines of standard material
+based on tagging-functions), plus around 300 lines of standard material
 about regular languages. The formalisation of derivatives and partial
 derivatives shown in Section~\ref{derivatives} consists of 390 lines of
-code.  The algorithm for solving equational systems, which we used in the
+code.  The closure properties in Section~\ref{closure} can be established in
-first direction, is conceptually relatively simple. Still the use of sets
+190 lines of code.  The algorithm for solving equational systems, which we
-over which the algorithm operates means it is not as easy to formalise as
+used in the first direction, is conceptually relatively simple. Still the
-one might hope. However, it seems sets cannot be avoided since the `input'
+use of sets over which the algorithm operates means it is not as easy to
-of the algorithm consists of equivalence classes and we cannot see how to
+formalise as one might hope. However, it seems sets cannot be avoided since
-reformulate the theory so that we can use lists. Lists would be much easier
+the `input' of the algorithm consists of equivalence classes and we cannot
-to reason about, since we can define functions over them by recursion. For
+see how to reformulate the theory so that we can use lists. Lists would be
-sets we have to use set-comprehensions, which is slightly unwieldy.
+much easier to reason about, since we can define functions over them by
+recursion. For sets we have to use set-comprehensions, which is slightly
+unwieldy.
 While our formalisation might appear large, it should be seen
 in the context of the work done by Constable at al \cite{Constable00} who
 formalised the Myhill-Nerode theorem in Nuprl using automata. They write
 that their four-member team needed something on the magnitude of 18 months

changeset 200	204856ef5573
parent 199	11c3c302fa2e
child 201	9fbf6d9f85ae