regexp: comparison Paper/Paper.thy


-:3ab755a96cef
+:08afbed1c8c7
 \noindent
 Moreover, it is not so clear how to conveniently impose a finiteness condition
 upon functions in order to represent \emph{finite} automata. The best is
 probably to resort to more advanced reasoning frameworks, such as \emph{locales}
 or \emph{type classes},
-which are \emph{not} avaiable in all HOL-based theorem provers.
+which are \emph{not} available in all HOL-based theorem provers.
 Because of these problems to do with representing automata, there seems
 to be no substantial formalisation of automata theory and regular languages
 carried out in HOL-based theorem provers. Nipkow  \cite{Nipkow98} establishes
 the link between regular expressions and automata in
 \noindent
 Using this property we can show that @{term "B ;; (A \<up> n) \<subseteq> X"} holds for
 all @{text n}. From this we can infer @{term "B ;; A\<star> \<subseteq> X"} using the definition
 of @{text "\<star>"}.
 For the inclusion in the other direction we assume a string @{text s}
-with length @{text k} is element in @{text X}. Since @{thm (prem 1) arden}
+with length @{text k} is an element in @{text X}. Since @{thm (prem 1) arden}
 we know by Prop.~\ref{langprops}@{text "(ii)"} that
 @{term "s \<notin> X ;; (A \<up> Suc k)"} since its length is only @{text k}
 (the strings in @{term "X ;; (A \<up> Suc k)"} are all longer).
 From @{text "(*)"} it follows then that
-@{term s} must be element in @{term "(\<Union>m\<in>{0..k}. B ;; (A \<up> m))"}. This in turn
+@{term s} must be an element in @{term "(\<Union>m\<in>{0..k}. B ;; (A \<up> m))"}. This in turn
 implies that @{term s} is in @{term "(\<Union>n. B ;; (A \<up> n))"}. Using Prop.~\ref{langprops}@{text "(iii)"}
 this is equal to @{term "B ;; A\<star>"}, as we needed to show.\qed
 \end{proof}
 \noindent
 @{thm (prem 3) Arden_keeps_eq}, then
 @{text "X = \<Union>\<calL> ` (Arden X rhs)"}
 \end{lemma}
 \noindent
-The \emph{substituion-operation} takes an equation
+The \emph{substitution-operation} takes an equation
 of the form @{text "X = xrhs"} and substitutes it into the right-hand side @{text rhs}.
 \begin{center}
 \begin{tabular}{rc@ {\hspace{2mm}}r@ {\hspace{1mm}}l}
 @{thm (lhs) Subst_def} & @{text "\<equiv>"}~~\mbox{} & \multicolumn{2}{@ {\hspace{-2mm}}l}{@{text "let"}}\\
 & &  \multicolumn{2}{@ {\hspace{-2mm}}l}{@{text "in"}~~@{term "rhs' \<union> append_rhs_rexp xrhs r'"}}\\
 \end{tabular}
 \end{center}
 \noindent
-We again delete first all occurrence of @{text "(X, r)"} in @{text rhs}; we then calculate
+We again delete first all occurrences of @{text "(X, r)"} in @{text rhs}; we then calculate
 the regular expression corresponding to the deleted terms; finally we append this
 regular expression to @{text "xrhs"} and union it up with @{text rhs'}. When we use
 the substitution operation we will arrange it so that @{text "xrhs"} does not contain
 any occurrence of @{text X}.
-With these two operation in place, we can define the operation that removes one equation
+With these two operations in place, we can define the operation that removes one equation
 from an equational system @{text ES}. The operation @{const Subst_all}
 substitutes an equation @{text "X = xrhs"} throughout an equational system @{text ES};
 @{const Remove} then completely removes such an equation from @{text ES} by substituting
 it to the rest of the equational system, but first eliminating all recursive occurrences
 of @{text X} by applying @{const Arden} to @{text "xrhs"}.
 removes the equation @{text "Y = yrhs"} from the system, and therefore
 the cardinality of @{const Iter} strictly decreases.\qed
 \end{proof}
 \noindent
-This brings us to our property we want establish for @{text Solve}.
+This brings us to our property we want to establish for @{text Solve}.
 \begin{lemma}
 If @{thm (prem 1) Solve} and @{thm (prem 2) Solve} then there exists
 a @{text rhs} such that  @{term "Solve X (Init (UNIV // \<approx>A)) = {(X, rhs)}"}
 \begin{lemma}\label{every_eqcl_has_reg}
 @{thm[mode=IfThen] every_eqcl_has_reg}
 \end{lemma}
 \begin{proof}
-By the preceeding Lemma, we know that there exists a @{text "rhs"} such
+By the preceding Lemma, we know that there exists a @{text "rhs"} such
 that @{term "Solve X (Init (UNIV // \<approx>A))"} returns the equation @{text "X = rhs"},
 and that the invariant holds for this equation. That means we
 know @{text "X = \<Union>\<calL> ` rhs"}. We further know that
 this is equal to \mbox{@{text "\<Union>\<calL> ` (Arden X rhs)"}} using the properties of the
 invariant and Lem.~\ref{ardenable}. Using the validity property for the equation @{text "X = rhs"},
 @{term "y @ z \<in> A ;; B"}, as needed in this case.
 Second, there exists a @{text "z'"} such that @{term "x @ z' \<in> A"} and @{text "z - z' \<in> B"}.
 By the assumption about @{term "tag_str_SEQ A B"} we have
 @{term "\<approx>A `` {x} = \<approx>A `` {y}"} and thus @{term "x \<approx>A y"}. Which means by the Myhill-Nerode
-relation that @{term "y @ z' \<in> A"} holds. Using @{text "z - z' \<in> B"}, we can conclude aslo in this case
+relation that @{term "y @ z' \<in> A"} holds. Using @{text "z - z' \<in> B"}, we can conclude also in this case
 with @{term "y @ z \<in> A ;; B"}. We again can complete the @{const SEQ}-case
 by setting @{text A} to @{term "L r\<^isub>1"} and @{text B} to @{term "L r\<^isub>2"}.\qed
 \end{proof}
 \noindent
 %
 \noindent
 We first need to consider the case that @{text x} is the empty string.
 From the assumption we can infer @{text y} is the empty string and
 clearly have @{text "y @ z \<in> A\<star>"}. In case @{text x} is not the empty
-string, we can devide the string @{text "x @ z"} as shown in the picture
+string, we can divide the string @{text "x @ z"} as shown in the picture
 above. By the tagging function we have
 %
 \begin{center}
 @{term "\<approx>A `` {(x - x'\<^isub>m\<^isub>a\<^isub>x)} \<in> ({\<approx>A `` {x - x'} |x'. x' < x \<and> x' \<in> A\<star>})"}
 \end{center}
 text {*
 In this paper we took the view that a regular language is one where there
 exists a regular expression that matches all of its strings. Regular
 expressions can conveniently be defined as a datatype in a HOL-based theorem
 prover. For us it was therefore interesting to find out how far we can push
-this point of view. We have establised both directions of the Myhill-Nerode
+this point of view. We have established both directions of the Myhill-Nerode
 theorem.
 %
 \begin{theorem}[The Myhill-Nerode Theorem]\mbox{}\\
 A language @{text A} is regular if and only if @{thm (rhs) Myhill_Nerode}.
 \end{theorem}
 direction and 475 for the second, plus around 300 lines of standard material about
 regular languages. While this might be seen as too large to count as a
 concise proof pearl, this should be seen in the context of the work done by
 Constable et al \cite{Constable00} who formalised the Myhill-Nerode theorem
 in Nuprl using automata. They write that their four-member team needed
-something on the magnitute of 18 months for their formalisation. The
+something on the magnitude of 18 months for their formalisation. The
 estimate for our formalisation is that we needed approximately 3 months and
 this included the time to find our proof arguments. Unlike Constable et al,
 who were able to follow the proofs from \cite{HopcroftUllman69}, we had to
 find our own arguments.  So for us the formalisation was not the
 bottleneck. It is hard to gauge the size of a formalisation in Nuprl, but
 Mercurial Repository at
 \mbox{\url{http://www4.in.tum.de/~urbanc/regexp.html}}.
 Our proof of the first direction is very much inspired by \emph{Brzozowski's
-algebraic mehod} used to convert a finite automaton to a regular
+algebraic method} used to convert a finite automaton to a regular
 expression \cite{Brzozowski64}. The close connection can be seen by considering the equivalence
 classes as the states of the minimal automaton for the regular language.
 However there are some subtle differences. Since we identify equivalence
 classes with the states of the automaton, then the most natural choice is to
 characterise each state with the set of strings starting from the initial

changeset 134	08afbed1c8c7
parent 133	3ab755a96cef
child 135	604518f0127f