regexp: comparison Journal/Paper.thy

-:edc642266a82
+:6969de1eb96b
 Isabelle/HOL as @{term "UNIV::string set"}. The concatenation of two languages
 is written @{term "A \<cdot> B"} and a language raised to the power @{text n} is written
 @{term "A \<up> n"}. They are defined as usual
 \begin{center}
 @{thm conc_def'[THEN eq_reflection, where A1="A" and B1="B"]}
 \hspace{7mm}
 @{thm lang_pow.simps(1)[THEN eq_reflection, where A1="A"]}
 \hspace{7mm}
 @{thm lang_pow.simps(2)[THEN eq_reflection, where A1="A" and n1="n"]}
 @{text "\<lbrakk>x\<rbrakk>\<^isub>\<approx>"} for the equivalence class defined
 as \mbox{@{text "{y | y \<approx> x}"}}.
 Central to our proof will be the solution of equational systems
-involving equivalence classes of languages. For this we will use Arden's Lemma \cite{Brzozowski64},
+involving equivalence classes of languages. For this we will use Arden's Lemma
+(see \cite[Page 100]{Sakarovitch09}),
 which solves equations of the form @{term "X = A \<cdot> X \<union> B"} provided
 @{term "[] \<notin> A"}. However we will need the following `reverse'
 version of Arden's Lemma (`reverse' in the sense of changing the order of @{term "A \<cdot> X"} to
 \mbox{@{term "X \<cdot> A"}}).
 \noindent
 Regular expressions are defined as the inductive datatype
 \begin{center}
-@{text r} @{text "::="}
+\begin{tabular}{rcl}
-@{term ZERO}\hspace{1.5mm}@{text"|"}\hspace{1.5mm}
+@{text r} & @{text "::="} & @{term ZERO}\\
-@{term ONE}\hspace{1.5mm}@{text"|"}\hspace{1.5mm}
+& @{text"|"} & @{term ONE}\\
-@{term "ATOM c"}\hspace{1.5mm}@{text"|"}\hspace{1.5mm}
+& @{text"|"} & @{term "ATOM c"}\\
-@{term "TIMES r r"}\hspace{1.5mm}@{text"|"}\hspace{1.5mm}
+& @{text"|"} & @{term "TIMES r r"}\\
-@{term "PLUS r r"}\hspace{1.5mm}@{text"|"}\hspace{1.5mm}
+& @{text"|"} & @{term "PLUS r r"}\\
-@{term "STAR r"}
+& @{text"|"} & @{term "STAR r"}
+\end{tabular}
 \end{center}
 \noindent
 and the language matched by a regular expression is defined as
 \begin{center}
-\begin{tabular}{c@ {\hspace{10mm}}c}
+\begin{tabular}{r@ {\hspace{2mm}}c@ {\hspace{2mm}}l}
-\begin{tabular}{rcl}
 @{thm (lhs) lang.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) lang.simps(1)}\\
 @{thm (lhs) lang.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) lang.simps(2)}\\
 @{thm (lhs) lang.simps(3)[where a="c"]} & @{text "\<equiv>"} & @{thm (rhs) lang.simps(3)[where a="c"]}\\
-\end{tabular}
-&
-\begin{tabular}{rcl}
 @{thm (lhs) lang.simps(4)[where ?r="r\<^isub>1" and ?s="r\<^isub>2"]} & @{text "\<equiv>"} &
 @{thm (rhs) lang.simps(4)[where ?r="r\<^isub>1" and ?s="r\<^isub>2"]}\\
 @{thm (lhs) lang.simps(5)[where ?r="r\<^isub>1" and ?s="r\<^isub>2"]} & @{text "\<equiv>"} &
 @{thm (rhs) lang.simps(5)[where ?r="r\<^isub>1" and ?s="r\<^isub>2"]}\\
 @{thm (lhs) lang.simps(6)[where r="r"]} & @{text "\<equiv>"} &
 @{thm (rhs) lang.simps(6)[where r="r"]}\\
-\end{tabular}
 \end{tabular}
 \end{center}
 Given a finite set of regular expressions @{text rs}, we will make use of the operation of generating
 a regular expression that matches the union of all languages of @{text rs}. We only need to know the
 the string @{text "[c]"}. The relation @{term "\<approx>({[c]})"} partitions @{text UNIV}
 into three equivalence classes @{text "X\<^isub>1"}, @{text "X\<^isub>2"} and  @{text "X\<^isub>3"}
 as follows
 \begin{center}
-@{text "X\<^isub>1 = {[]}"}\hspace{5mm}
+\begin{tabular}{l}
-@{text "X\<^isub>2 = {[c]}"}\hspace{5mm}
+@{text "X\<^isub>1 = {[]}"}\\
+@{text "X\<^isub>2 = {[c]}"}\\
 @{text "X\<^isub>3 = UNIV - {[], [c]}"}
+\end{tabular}
 \end{center}
 One direction of the Myhill-Nerode theorem establishes
 that if there are finitely many equivalence classes, like in the example above, then
 the language is regular. In our setting we therefore have to show:
 this is equal to \mbox{@{text "\<Union>\<calL> ` (Arden X rhs)"}} using the properties of the
 invariant and Lem.~\ref{ardenable}. Using the validity property for the equation @{text "X = rhs"},
 we can infer that @{term "rhss rhs \<subseteq> {X}"} and because the @{text Arden} operation
 removes that @{text X} from @{text rhs}, that @{term "rhss (Arden X rhs) = {}"}.
 This means the right-hand side @{term "Arden X rhs"} can only consist of terms of the form @{term "Lam r"}.
-So we can collect those (finitely many) regular expressions @{text rs} and have @{term "X = L (\<Uplus>rs)"}.
+So we can collect those (finitely many) regular expressions @{text rs} and have @{term "X = lang (\<Uplus>rs)"}.
 With this we can conclude the proof.
 \end{proof}
 \noindent
 Lem.~\ref{every_eqcl_has_reg} allows us to finally give a proof for the first direction
 every equivalence class in @{term "UNIV // \<approx>A"}. Since @{text "finals A"} is
 a subset of  @{term "UNIV // \<approx>A"}, we also know that for every equivalence class
 in @{term "finals A"} there exists a regular expression. Moreover by assumption
 we know that @{term "finals A"} must be finite, and therefore there must be a finite
 set of regular expressions @{text "rs"} such that
-@{term "\<Union>(finals A) = L (\<Uplus>rs)"}.
+@{term "\<Union>(finals A) = lang (\<Uplus>rs)"}.
 Since the left-hand side is equal to @{text A}, we can use @{term "\<Uplus>rs"}
 as the regular expression that is needed in the theorem.
 \end{proof}
 *}

changeset 176	6969de1eb96b
parent 175	edc642266a82
child 177	50cc1a39c990