regexp: Paper/Paper.thy@0d4d5bb321dc (annotated)

(*<*)
theory Paper
imports "../Myhill" "LaTeXsugar"
begin

declare [[show_question_marks = false]]

consts
REL :: "(string \<times> string) \<Rightarrow> bool"

notation (latex output)
str_eq_rel ("\<approx>\<^bsub>_\<^esub>") and
Seq (infixr "\<cdot>" 100) and
Star ("_\<^bsup>\<star>\<^esup>") and
pow ("_\<^bsup>_\<^esup>" [100, 100] 100) and
Suc ("_+1" [100] 100) and
quotient ("_ \<^raw:\ensuremath{\!\sslash\!}> _" [90, 90] 90) and
REL ("\<approx>")

(*>*)

section {* Introduction *}

text {*
Regular languages are an important and well-understood subject in Computer
Science, with many beautiful theorems and many useful algorithms. There is a
wide range of textbooks about this subject. Many of these textbooks, such as
\cite{Kozen97}, are aimed at students and contain very detailed
``pencil-and-paper'' proofs. It seems natural to exercise theorem provers by
formalising these theorems and by formally verifying the algorithms. There
is, however, a problem: the typical approach to regular languages is to start
with finite automata.

Therefore, instead of defining a regular language as one for which there exists an
automaton that recognises exactly its strings, we define

\begin{definition}[A Regular Language]
A language @{text A} is regular if there is a regular expression that matches
exactly the strings of @{text "A"}.
\end{definition}

\noindent
{\bf Contributions:} A proof of the Myhill-Nerode Theorem based on regular expressions. The
finiteness part of this theorem is proved using tagging-functions (which to our knowledge
are novel in this context).

*}

section {* Preliminaries *}

text {*
Strings in Isabelle/HOL are lists of characters and the
\emph{empty string} is the empty list, written @{term "[]"}. \emph{Languages} are sets of
strings. The language containing all strings is written in Isabelle/HOL as @{term "UNIV::string set"}.
The notation for the quotient of a language @{text A} according to a relation @{term REL} is
@{term "A // REL"}. The concatenation of two languages is written @{term "A ;; B"}; a language
raised to the power $n$ is written @{term "A \<up> n"}. Both concepts are defined as

\begin{center}
@{thm Seq_def[THEN eq_reflection, where A1="A" and B1="B"]}
\hspace{7mm}
@{thm pow.simps(1)[THEN eq_reflection, where A1="A"]}
\hspace{7mm}
@{thm pow.simps(2)[THEN eq_reflection, where A1="A" and n1="n"]}
\end{center}

\noindent
where @{text "@"} is the usual list-append operation. The Kleene-star of a language @{text A}
is defined as the union over all powers, namely @{thm Star_def}.
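
\noindent
As a small illustration (our own example, not taken from the formalisation), let
$a$ and $b$ stand for two one-character strings and take $A = \{a\}$ and
$B = \{b, bb\}$. Assuming the usual recursion for powers, where the $0$-th power is
$\{[]\}$ and the $(n+1)$-th power prefixes the $n$-th power with $A$, one obtains

\begin{center}
\begin{tabular}{l}
$A \cdot B = \{ab,\; abb\}$\\
$A^0 = \{[]\}$, \quad $A^1 = \{a\}$, \quad $A^2 = \{aa\}$\\
$A^\star = \{[],\; a,\; aa,\; aaa,\; \ldots\}$
\end{tabular}
\end{center}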

Regular expressions are defined as the following datatype

\begin{center}
@{text r} @{text "::="}
@{term NULL}\hspace{1.5mm}@{text"\|"}\hspace{1.5mm}
@{term EMPTY}\hspace{1.5mm}@{text"\|"}\hspace{1.5mm}
@{term "CHAR c"}\hspace{1.5mm}@{text"\|"}\hspace{1.5mm}
@{term "SEQ r r"}\hspace{1.5mm}@{text"\|"}\hspace{1.5mm}
@{term "ALT r r"}\hspace{1.5mm}@{text"\|"}\hspace{1.5mm}
@{term "STAR r"}
\end{center}
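
\noindent
For instance, if the constructors are read in the conventional way (an assumption on
our part, since their semantics is not spelled out at this point), the term
@{text "SEQ (CHAR a) (STAR (ALT (CHAR b) (CHAR c)))"} describes the strings that start
with the character @{text a} and continue with any number of @{text b}s and
@{text c}s, while @{text "ALT EMPTY (CHAR a)"} describes the empty string and the
one-character string @{text a}.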

Central to our proof will be the solution of equational systems
involving regular expressions. For this we will use the following ``reverse''
version of Arden's lemma.

\begin{lemma}[Reverse Arden's Lemma]\mbox{}\\
If @{thm (prem 1) ardens_revised} then
@{thm (lhs) ardens_revised} has the unique solution
@{thm (rhs) ardens_revised}.
\end{lemma}

\begin{proof}
For the right-to-left direction we assume @{thm (rhs) ardens_revised} and show
that @{thm (lhs) ardens_revised} holds. From Lemma ??? we have @{term "A\<star> = {[]} \<union> A ;; A\<star>"},
which is equivalent to @{term "A\<star> = {[]} \<union> A\<star> ;; A"}. Concatenating @{text B} on the left of both
sides gives @{term "B ;; A\<star> = B ;; ({[]} \<union> A\<star> ;; A)"}, whose right-hand side
is equal to @{term "(B ;; A\<star>) ;; A \<union> B"}. This completes this direction.

For the other direction we assume @{thm (lhs) ardens_revised}. By a simple induction
on @{text n}, we can establish the property

\begin{center}
@{text "(*)"}\hspace{5mm} @{thm (concl) ardens_helper}
\end{center}

\noindent
Using this property we can show that @{term "B ;; (A \<up> n) \<subseteq> X"} holds for
all @{text n}. From this we can infer @{term "B ;; A\<star> \<subseteq> X"} using Lemma ???.
For the inclusion in the other direction we assume that a string @{text s}
of length @{text k} is an element of @{text X}. Because of @{thm (prem 1) ardens_revised},
all strings in @{term "X ;; (A \<up> Suc k)"} are longer than @{text k}, so we know that
@{term "s \<notin> X ;; (A \<up> Suc k)"}.
From @{text "(*)"} it then follows that
@{term s} must be an element of @{term "(\<Union>m\<in>{0..k}. B ;; (A \<up> m))"}. This in turn
implies that @{term s} is in @{term "(\<Union>n. B ;; (A \<up> n))"}. Using Lemma ??? this
is equal to @{term "B ;; A\<star>"}, as we needed to show.\qed
\end{proof}
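
\noindent
To see the lemma at work on a concrete instance (our own illustration), take
$A = \{a\}$ and $B = \{b\}$, so that the premise $[] \notin A$ holds. The equation
$X = X \cdot A \cup B$ then has the solution

\begin{center}
$X = B \cdot A^\star = \{b,\; ba,\; baa,\; \ldots\}$
\end{center}

\noindent
and indeed $X \cdot A \cup B = \{ba,\; baa,\; \ldots\} \cup \{b\} = X$. The premise
cannot be dropped: for $A = \{[]\}$ the equation degenerates to $X = X \cup B$, which
is satisfied by every language containing $B$, so the solution is no longer unique.
Property @{text "(*)"} can be pictured as repeatedly substituting the equation into
itself. Assuming it has the shape suggested by the proof above, the first two
unfoldings read

\begin{center}
\begin{tabular}{l}
$X = X \cdot A^{1} \cup B \cdot A^{0}$\\
$X = X \cdot A^{2} \cup B \cdot A^{0} \cup B \cdot A^{1}$
\end{tabular}
\end{center}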
*}

section {* Finite Partitions Imply Regularity of a Language *}

text {*
\begin{theorem}
Let @{text A} be a language.
@{thm[mode=IfThen] hard_direction[where Lang="A"]}
\end{theorem}
*}

section {* Regular Expressions Generate Finitely Many Partitions *}

text {*

\begin{theorem}
If @{text "r"} is a regular expression, then @{thm rexp_imp_finite}.
\end{theorem}

\begin{proof}
By induction on the structure of @{text r}. The cases for @{const NULL}, @{const EMPTY}
and @{const CHAR} are straightforward, because we can easily establish

\begin{center}
\begin{tabular}{l}
@{thm quot_null_eq}\\
@{thm quot_empty_subset}\\
@{thm quot_char_subset}
\end{tabular}
\end{center}

\end{proof}
*}


section {* Conclusion and Related Work *}

(*<*)
end
(*>*)

author	urbanc
	Wed, 02 Feb 2011 13:54:07 +0000
changeset 58	0d4d5bb321dc
parent 54	c19d2fc2cc69
child 59	fc35eb54fdc9
permissions	-rw-r--r--