diff -r 50cc1a39c990 -r c6ebfe052109 Journal/Paper.thy --- a/Journal/Paper.thy Thu Jul 28 01:12:02 2011 +0000 +++ b/Journal/Paper.thy Thu Jul 28 11:56:25 2011 +0000 @@ -226,7 +226,7 @@ \end{center} \noindent - On `paper' or a theorem prover based on set-theory, we can define the corresponding + On `paper' we can define the corresponding graph in terms of the disjoint union of the state nodes. Unfortunately in HOL, the standard definition for disjoint union, namely @@ -241,7 +241,7 @@ single type for automata. As a result we will not be able to define a regular language as one for which there exists an automaton that recognises all its strings. This is because we cannot make a definition in HOL that is polymorphic in - the state type and also there is no type quantification available in HOL (unlike + the state type and there is no type quantification available in HOL (unlike in Coq, for example). An alternative, which provides us with a single type for automata, is to give every @@ -316,7 +316,7 @@ Isabelle/HOL nor will we attempt to formalise automata proofs from the literature, but take a different approach to regular languages than is usually taken. Instead of defining a regular language as one where there - exists an automaton that recognises all strings of the language, we define a + exists an automaton that recognises all its strings, we define a regular language as: \begin{dfntn} @@ -329,7 +329,7 @@ functions, can be easily defined as an inductive datatype. A reasoning infrastructure (like induction and recursion) comes then for free in HOL. Moreover, no side-conditions will be needed for regular expressions, - like we usually need for automata. This convenience of regular expressions has + like we need for automata. This convenience of regular expressions has recently been exploited in HOL4 with a formalisation of regular expression matching based on derivatives \cite{OwensSlind08} and with an equivalence checker for regular expressions in Isabelle/HOL \cite{KraussNipkow11}. The @@ -350,10 +350,10 @@ certain tagging-functions, and another indirect proof using Antimirov's partial derivatives \cite{Antimirov95}. Again to our best knowledge, the tagging-functions have not been used before to establish the Myhill-Nerode - theorem. Derivatives of regular expressions have been used widely in the - literature about regular expressions. However, partial derivatives are more - suitable in the context of the Myhill-Nerode theorem, since it is easier to - establish formally their finiteness result. + theorem. Derivatives of regular expressions have been recently used quite + widely in the literature about regular expressions. However, partial + derivatives are more suitable in the context of the Myhill-Nerode theorem, + since it is easier to establish formally their finiteness result. *} @@ -584,8 +584,9 @@ strings in the equivalence class @{text Y}, we obtain a subset of @{text X}. Note that we do not define an automaton here, we merely relate two sets (with the help of a character). In our concrete example we have - @{term "X\<^isub>1 \c\ X\<^isub>2"}, @{term "X\<^isub>1 \d\ X\<^isub>3"} with @{text d} being any - other character than @{text c}, and @{term "X\<^isub>3 \d\ X\<^isub>3"} for any @{text d}. + @{term "X\<^isub>1 \c\ X\<^isub>2"}, @{term "X\<^isub>1 \d\<^isub>i\ X\<^isub>3"} with @{text "d\<^isub>i"} being any + other character than @{text c}, and @{term "X\<^isub>3 \c\<^isub>j\ X\<^isub>3"} for any + caracter @{text "c\<^isub>j"}. Next we construct an \emph{initial equational system} that contains an equation for each equivalence class. We first give @@ -640,10 +641,7 @@ \noindent where @{text "d\<^isub>1\d\<^isub>n"} is the sequence of all characters except @{text c}, and @{text "c\<^isub>1\c\<^isub>m"} is the sequence of all - characters. In our initial equation systems there can only be finitely many - terms of the form @{text "(Y\<^isub>i\<^isub>j, ATOM c\<^isub>i\<^isub>j)"}, - since by assumption there are only finitely many equivalence classes and - only finitely many characters. + characters. Overloading the function @{text \} for the two kinds of terms in the equational system, we have @@ -745,10 +743,31 @@ In this definition, we first delete all terms of the form @{text "(X, r)"} from @{text rhs}; then we calculate the combined regular expressions for all @{text r} coming from the deleted @{text "(X, r)"}, and take the @{const Star} of it; - finally we append this regular expression to @{text rhs'}. It can be easily seen - that this operation mimics Arden's Lemma on the level of equations. To ensure - the non-emptiness condition of Arden's Lemma we say that a right-hand side is - @{text ardenable} provided + finally we append this regular expression to @{text rhs'}. If we apply this + operation to the right-hand side of @{text "X\<^isub>3"} in \eqref{exmpcs}, we obtain + the equation: + + \begin{center} + \begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l} + @{term "X\<^isub>3"} & @{text "="} & + @{text "(X\<^isub>1, TIMES (ATOM d\<^isub>1) (STAR \<^raw:\ensuremath{\bigplus}>{ATOM c\<^isub>1,\,ATOM c\<^isub>m})) + \ "}\\ + & & \mbox{}\hspace{13mm} + @{text "\ + (X\<^isub>1, TIMES (ATOM d\<^isub>n) (STAR \<^raw:\ensuremath{\bigplus}>{ATOM c\<^isub>1,\,ATOM c\<^isub>m}))"} + \end{tabular} + \end{center} + + + \noindent + That means we eliminated the dependency of @{text "X\<^isub>3"} on the + right-hand side. Note we used the abbreviation + @{text "\<^raw:\ensuremath{\bigplus}>{ATOM c\<^isub>1,\,ATOM c\<^isub>m}"} + to stand for a regular expression that matches with every character. In + our algorithm we are only interested in the existence of such a regular expresion + and not specify it any further. + + It can be easily seen that the @{text "Arden"}-operation mimics Arden's + Lemma on the level of equations. To ensure the non-emptiness condition of + Arden's Lemma we say that a right-hand side is @{text ardenable} provided \begin{center} @{thm ardenable_def} @@ -785,7 +804,15 @@ the regular expression corresponding to the deleted terms; finally we append this regular expression to @{text "xrhs"} and union it up with @{text rhs'}. When we use the substitution operation we will arrange it so that @{text "xrhs"} does not contain - any occurrence of @{text X}. + any occurrence of @{text X}. For example substituting the first equation in + \eqref{exmpcs} into the right-hand side of the second, thus eliminating the equivalence + class @{text "X\<^isub>1"}, gives us the equation + + \begin{equation}\label{exmpresult} + \mbox{\begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l} + @{term "X\<^isub>2"} & @{text "="} & @{text "\(TIMES ONE (ATOM c))"}\\ + \end{tabular}} + \end{equation} With these two operations in place, we can define the operation that removes one equation from an equational systems @{text ES}. The operation @{const Subst_all}