regexp: comparison Paper/Paper.thy

-:c5f138b5fc88
+:342983676c8f
 Central to our proof will be the solution of equational systems
 involving equivalence classes of languages. For this we will use Arden's lemma \cite{Brzozowski64},
 which solves equations of the form @{term "X = A ;; X \<union> B"} provided
 @{term "[] \<notin> A"}. However we will need the following `reverse'
-version of Arden's lemma.
+version of Arden's lemma (`reverse' in the sense of changing the order of @{text "\<cdot>"}).
 \begin{lemma}[Reverse Arden's Lemma]\label{arden}\mbox{}\\
 If @{thm (prem 1) arden} then
 @{thm (lhs) arden} if and only if
 @{thm (rhs) arden}.
 @{thm (concl) Subst_all_satisfies_invariant}
 \end{center}
 \noindent
 Finiteness is straightforward, as @{const Subst} and @{const Arden} operations
-keep the equational system finite. These operation also preserve soundness
+keep the equational system finite. These operations also preserve soundness
 and distinctness (we proved soundness for @{const Arden} in Lem.~\ref{ardenable}).
 The property ardenable is clearly preserved because the append-operation
 cannot make a regular expression match the empty string. Validity is
 given because @{const Arden} removes an equivalence class from @{text yrhs}
 and then @{const Subst_all} removes @{text Y} from the equational system.
 section {* Myhill-Nerode, Second Part *}
 text {*
+We will prove in this section the second part of the Myhill-Nerode
+theorem. It can be formulated in our setting as follows.
 \begin{theorem}
 Given @{text "r"} is a regular expression, then @{thm Myhill_Nerode2}.
 \end{theorem}
-\begin{proof}
+\noindent
-By induction on the structure of @{text r}. The cases for @{const NULL}, @{const EMPTY}
+The proof will be by induction on the structure of @{text r}. It turns out
-and @{const CHAR} are straightforward, because we can easily establish
+the base cases are straightforward.
+\begin{proof}[Base Cases]
+The cases for @{const NULL}, @{const EMPTY} and @{const CHAR} are routine, because
+we can easily establish that
 \begin{center}
 \begin{tabular}{l}
 @{thm quot_null_eq}\\
 @{thm quot_empty_subset}\\
 @{thm quot_char_subset}
 \end{tabular}
 \end{center}
+\noindent
+hold, which shows that @{term "UNIV // \<approx>(L r)"} must be finite.\qed
 \end{proof}
+\noindent
+Much more interesting are the inductive cases, which seem hard to solve
+directly. The reader is invited to try. Our method will rely on some
+\emph{tagging} functions of strings. Given the inductive hypothesis, it will
+be easy to prove that the range of these tagging functions is finite.
 @{thm tag_str_ALT_def[where ?L1.0="A" and ?L2.0="B"]}
 @{thm tag_str_SEQ_def[where ?L1.0="A" and ?L2.0="B"]}
 partitions.  Proving the existence of such a regular expression via automata would
 be quite involved. It includes the
 steps: regular expression @{text "\<Rightarrow>"} non-deterministic automaton @{text
 "\<Rightarrow>"} deterministic automaton @{text "\<Rightarrow>"} complement automaton @{text "\<Rightarrow>"}
 regular expression.
+While regular expressions are convenient in formalisations, they have some
+limitations. One is that there seems to be no notion of a minimal regular
+expression, like there is for automata. For an implementation of a simple
+regular expression matcher, whose correctness has been formally
+established, we refer the reader to Owens and Slind \cite{OwensSlind08}.
 Our formalisation consists of ??? lines of Isabelle/Isar code for the first
 direction and ??? for the second. While this might be seen as too large to
 count as a concise proof pearl, this should be seen in the context of the
 work done by Constable et al.\ \cite{Constable00} who formalised the
 is also where our method shines, because we can completely side-step the
 standard argument \cite{Kozen97} where automata need to be composed, which
 as stated in the Introduction is not so convenient to formalise in a
 HOL-based theorem prover.
-While regular expressions are convenient in formalisations, they have some
-limitations. One is that there seems to be no notion of a minimal regular
-expression, like there is for automata. For an implementation of a simple
-regular expression matcher, whose correctness has been formally
-established, we refer the reader to Owens and Slind \cite{OwensSlind08}.
 *}
 (*<*)
 end

changeset 116	342983676c8f
parent 115	c5f138b5fc88
child 117	22ba25b808c8
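
The `reverse' version of Arden's lemma mentioned near the top of this diff is only shown through @{thm ...} antiquotations, so its exact statement is not visible here. Assuming it reads "if the empty string is not in A, then X = X·A ∪ B if and only if X = B·A*", the following Python sketch checks one direction on a small example: that B·A* satisfies the equation. It is not part of the formalisation; the helper names (`concat`, `star`, `reverse_arden_solution`) and the sample languages are hypothetical, and languages are truncated to words of length at most N so that they stay finite.

```python
def concat(xs, ys, n):
    """Concatenate two languages, keeping only words of length <= n."""
    return {x + y for x in xs for y in ys if len(x) + len(y) <= n}


def star(a, n):
    """Kleene star of `a`, truncated to words of length <= n."""
    result = {""}
    frontier = {""}
    while frontier:
        # Extend the newest words by one more factor from `a`.
        frontier = concat(frontier, a, n) - result
        result |= frontier
    return result


def reverse_arden_solution(a, b, n):
    """Candidate solution X = B . A* of X = X . A u B (assumed form), truncated at n."""
    return concat(b, star(a, n), n)


# Hypothetical sample languages; A must not contain the empty string.
A = {"a", "bb"}
B = {"", "c"}
N = 6

X = reverse_arden_solution(A, B, N)

# On the truncated languages, X satisfies the equation X = X . A u B.
assert X == concat(X, A, N) | B
```

A bounded check like this only illustrates the shape of the statement on one example. It says nothing about uniqueness of the solution, which is where the side condition on the empty string matters.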