regexp: comparison Paper/Paper.thy


-:ecf6c61a4541
+:8ab3a06577cf
 consts
 REL :: "(string \<times> string) \<Rightarrow> bool"
 UPLUS :: "'a set \<Rightarrow> 'a set \<Rightarrow> (nat \<times> 'a) set"
+abbreviation
+"EClass x R \<equiv> R `` {x}"
 notation (latex output)
 str_eq_rel ("\<approx>\<^bsub>_\<^esub>") and
 Seq (infixr "\<cdot>" 100) and
 Star ("_\<^bsup>\<star>\<^esup>") and
 pow ("_\<^bsup>_\<^esup>" [100, 100] 100) and
 Suc ("_+1" [100] 100) and
 quotient ("_ \<^raw:\ensuremath{\!\sslash\!}> _" [90, 90] 90) and
 REL ("\<approx>") and
 UPLUS ("_ \<^raw:\ensuremath{\uplus}> _" [90, 90] 90) and
-L ("L '(_')" [0] 101)
+L ("L '(_')" [0] 101) and
+EClass ("\<lbrakk>_\<rbrakk>\<^bsub>_\<^esub>" [100, 100] 100)
 (*>*)
 section {* Introduction *}
 text {*
 Regular languages are an important and well-understood subject in Computer
 benefits. Among them is that it is easy to convince oneself of the fact that
 regular languages are closed under complementation: one just has to exchange
 the accepting and non-accepting states in the corresponding automaton to
 obtain an automaton for the complement language.  The problem, however, lies with
 formalising such reasoning in a HOL-based theorem prover, in our case
-Isabelle/HOL. Automata consist of states and transitions. They need to be represented
+Isabelle/HOL. Automata are built up from states and transitions that
-as graphs or matrices, neither
+need to be represented as graphs or matrices, neither
 of which can be defined as an inductive datatype.\footnote{In some works
 functions are used to represent state transitions, but they are not
 inductive datatypes either.} This means we have to build our own reasoning
 infrastructure for them, as neither Isabelle/HOL nor HOL4 nor HOL Light supports
 them with libraries.
 Using this definition for disjoint unions means we do not have a single type for automata
 and hence will not be able to state properties about \emph{all}
 automata, since there is no type quantification available in HOL. An
 alternative, which provides us with a single type for automata, is to give every
 state node an identity, for example a natural
-number, and then be careful renaming these identities apart whenever
+number, and then be careful to rename these identities apart whenever
 connecting two automata. This results in clunky proofs
 establishing that properties are invariant under renaming. Similarly,
 connecting two automata represented as matrices results in very ad hoc
 constructions, which are not pleasant to reason about.
 Because of these problems with representing automata, there seems
 to be no substantial formalisation of automata theory and regular languages
 carried out in a HOL-based theorem prover. We are only aware of the
-large formalisation of the automata theory in Nuprl \cite{Constable00} and
+large formalisation of automata theory in Nuprl \cite{Constable00} and
 some smaller formalisations in Coq, for example \cite{Filliatre97}.
 In this paper, we will not attempt to formalise automata theory, but take a completely
 different approach to regular languages. Instead of defining a regular language as one
 where there exists an automaton that recognises all strings of the language, we define
 \end{definition}
 \noindent
 The reason is that regular expressions, unlike graphs and matrices, can
 be easily defined as an inductive datatype. Therefore a corresponding reasoning
-infrastructure comes for free. This has recently been used for formalising regular
+infrastructure comes for free. This has recently been used in HOL4 for formalising regular
-expression matching in HOL4 \cite{OwensSlind08}.  The purpose of this paper is to
+expression matching based on derivatives \cite{OwensSlind08}.  The purpose of this paper is to
 show that a central result about regular languages, the Myhill-Nerode theorem,
-can be recreated by only using regular expressions. This theorem give a necessary
+can be recreated by only using regular expressions. This theorem gives a necessary
 and sufficient condition for when a language is regular. As a corollary of this
 theorem we can easily establish the usual closure properties, including
 complementation, for regular languages.\smallskip
 \noindent
 @{thm (lhs) L_rexp.simps(6)[where r="r"]} & @{text "\<equiv>"} &
 @{thm (rhs) L_rexp.simps(6)[where r="r"]}\\
 \end{tabular}
 \end{tabular}
 \end{center}
 *}
 section {* Finite Partitions Imply Regularity of a Language *}
 text {*
+\begin{definition}[Myhill-Nerode Relation]\mbox{}\\
+@{thm str_eq_rel_def[simplified]}
+\end{definition}
+\begin{definition} @{text "finals A"} are the equivalence classes that contain
+strings from @{text A}\\
+@{thm finals_def}
+\end{definition}
+@{thm lang_is_union_of_finals}
 \begin{theorem}
 Given a language @{text A}.
-@{thm[mode=IfThen] hard_direction[where Lang="A"]}
+@{thm[mode=IfThen] hard_direction}
 \end{theorem}
 *}
 section {* Regular Expressions Generate Finitely Many Partitions *}

changeset 70	8ab3a06577cf
parent 67	7478be786f87
child 71	426070e68b21