Paper/Paper.thy
changeset 156 fd39492b187c
parent 154 7c68b9ad4486
child 157 10d2d0cbe381
equal deleted inserted replaced
155:d8d1e1f53d6e 156:fd39492b187c
   263   \end{tabular}
   263   \end{tabular}
   264   \end{proposition}
   264   \end{proposition}
   265 
   265 
   266   \noindent
   266   \noindent
   267   In @{text "(ii)"} we use the notation @{term "length s"} for the length of a
   267   In @{text "(ii)"} we use the notation @{term "length s"} for the length of a
   268   string; this property states that if @{term "[] \<notin> A"} then the lengths of
   268   string; this property states that if \mbox{@{term "[] \<notin> A"}} then the lengths of
   269   the strings in @{term "A \<up> (Suc n)"} must be longer than @{text n}.  We omit
   269   the strings in @{term "A \<up> (Suc n)"} must be longer than @{text n}.  We omit
   270   the proofs for these properties, but invite the reader to consult our
   270   the proofs for these properties, but invite the reader to consult our
   271   formalisation.\footnote{Available at \url{http://www4.in.tum.de/~urbanc/regexp.html}}
   271   formalisation.\footnote{Available at \url{http://www4.in.tum.de/~urbanc/regexp.html}}
   272 
   272 
   273   The notation in Isabelle/HOL for the quotient of a language @{text A} according to an 
   273   The notation in Isabelle/HOL for the quotient of a language @{text A} according to an 
   274   equivalence relation @{term REL} is @{term "A // REL"}. We will write 
   274   equivalence relation @{term REL} is @{term "A // REL"}. We will write 
   275   @{text "\<lbrakk>x\<rbrakk>\<^isub>\<approx>"} for the equivalence class defined 
   275   @{text "\<lbrakk>x\<rbrakk>\<^isub>\<approx>"} for the equivalence class defined 
   276   as @{text "{y | y \<approx> x}"}.
   276   as \mbox{@{text "{y | y \<approx> x}"}}.
   277 
   277 
   278 
   278 
   279   Central to our proof will be the solution of equational systems
   279   Central to our proof will be the solution of equational systems
   280   involving equivalence classes of languages. For this we will use Arden's lemma \cite{Brzozowski64},
   280   involving equivalence classes of languages. For this we will use Arden's Lemma \cite{Brzozowski64},
   281   which solves equations of the form @{term "X = A ;; X \<union> B"} provided
   281   which solves equations of the form @{term "X = A ;; X \<union> B"} provided
   282   @{term "[] \<notin> A"}. However we will need the following `reverse' 
   282   @{term "[] \<notin> A"}. However we will need the following `reverse' 
   283   version of Arden's lemma (`reverse' in the sense of changing the order of @{term "A ;; X"} to
   283   version of Arden's Lemma (`reverse' in the sense of changing the order of @{term "A ;; X"} to
   284   \mbox{@{term "X ;; A"}}).
   284   \mbox{@{term "X ;; A"}}).
   285 
   285 
   286   \begin{lemma}[Reverse Arden's Lemma]\label{arden}\mbox{}\\
   286   \begin{lemma}[Reverse Arden's Lemma]\label{arden}\mbox{}\\
   287   If @{thm (prem 1) arden} then
   287   If @{thm (prem 1) arden} then
   288   @{thm (lhs) arden} if and only if
   288   @{thm (lhs) arden} if and only if
   444   @{text X}. Note that we do not define an automaton here, we merely relate two sets
   444   @{text X}. Note that we do not define an automaton here, we merely relate two sets
   445   (with the help of a character). In our concrete example we have 
   445   (with the help of a character). In our concrete example we have 
   446   @{term "X\<^isub>1 \<Turnstile>c\<Rightarrow> X\<^isub>2"}, @{term "X\<^isub>1 \<Turnstile>d\<Rightarrow> X\<^isub>3"} with @{text d} being any 
   446   @{term "X\<^isub>1 \<Turnstile>c\<Rightarrow> X\<^isub>2"}, @{term "X\<^isub>1 \<Turnstile>d\<Rightarrow> X\<^isub>3"} with @{text d} being any 
   447   other character than @{text c}, and @{term "X\<^isub>3 \<Turnstile>d\<Rightarrow> X\<^isub>3"} for any @{text d}.
   447   other character than @{text c}, and @{term "X\<^isub>3 \<Turnstile>d\<Rightarrow> X\<^isub>3"} for any @{text d}.
   448   
   448   
   449   Next we build an \emph{initial equational system} that
   449   Next we construct an \emph{initial equational system} that
   450   contains an equation for each equivalence class. Suppose we have 
   450   contains an equation for each equivalence class. We first give
       
   451   an informal description of this construction. Suppose we have 
   451   the equivalence classes @{text "X\<^isub>1,\<dots>,X\<^isub>n"}, there must be one and only one that
   452   the equivalence classes @{text "X\<^isub>1,\<dots>,X\<^isub>n"}, there must be one and only one that
   452   contains the empty string @{text "[]"} (since equivalence classes are disjoint).
   453   contains the empty string @{text "[]"} (since equivalence classes are disjoint).
   453   Let us assume @{text "[] \<in> X\<^isub>1"}. We build the following equational system
   454   Let us assume @{text "[] \<in> X\<^isub>1"}. We build the following equational system
   454   
   455   
   455   \begin{center}
   456   \begin{center}
   462   \end{center}
   463   \end{center}
   463 
   464 
   464   \noindent
   465   \noindent
   465   where the terms @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"}
   466   where the terms @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"}
   466   stand for all transitions @{term "Y\<^isub>i\<^isub>j \<Turnstile>c\<^isub>i\<^isub>j\<Rightarrow>
   467   stand for all transitions @{term "Y\<^isub>i\<^isub>j \<Turnstile>c\<^isub>i\<^isub>j\<Rightarrow>
   467   X\<^isub>i"}.   There can only be
   468   X\<^isub>i"}. In terms of finite automata, every equation @{text "X\<^isub>i = rhs\<^isub>i"} in this system
   468   finitely many such terms in a right-hand side since by assumption there are only finitely many
   469   corresponds very roughly to a state whose name is @{text X\<^isub>i} and its predecessor states 
   469   equivalence classes and only finitely many characters.  The term @{text
   470   are the @{text "Y\<^isub>i\<^isub>j"}; the @{text "c\<^isub>i\<^isub>j"} are the labels of the transitions from these 
       
   471   predecessor states to @{text X\<^isub>i}. In our initial equation system there can only be
       
   472   finitely many terms of the form @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"} in a right-hand side 
       
   473   since by assumption there are only finitely many
       
   474   equivalence classes and only finitely many characters. The term @{text
   470   "\<lambda>(EMPTY)"} in the first equation acts as a marker for the equivalence class
   475   "\<lambda>(EMPTY)"} in the first equation acts as a marker for the equivalence class
   471   containing @{text "[]"}.\footnote{Note that we mark, roughly speaking, the
   476   containing @{text "[]"}.\footnote{Note that we mark, roughly speaking, the
   472   single `initial' state in the equational system, which is different from
   477   single `initial' state in the equational system, which is different from
   473   the method by Brzozowski \cite{Brzozowski64}, where he marks the
   478   the method by Brzozowski \cite{Brzozowski64}, where he marks the
   474   `terminal' states. We are forced to set up the equational system in our
   479   `terminal' states. We are forced to set up the equational system in our
   475   way, because the Myhill-Nerode relation determines the `direction' of the
   480   way, because the Myhill-Nerode relation determines the `direction' of the
   476   transitions---the successor `state' of an equivalence class @{text Y} can
   481   transitions---the successor `state' of an equivalence class @{text Y} can
   477   be reached by adding a character to the end of @{text Y}. This is also the
   482   be reached by adding a character to the end of @{text Y}. This is also the
   478   reason why we have to use our reverse version of Arden's lemma.}
   483   reason why we have to use our reverse version of Arden's Lemma.}
   479   Overloading the function @{text \<calL>} for the two kinds of terms in the
   484   Overloading the function @{text \<calL>} for the two kinds of terms in the
   480   equational system, we have
   485   equational system, we have
   481   
   486   
   482   \begin{center}
   487   \begin{center}
   483   @{text "\<calL>(Y, r) \<equiv>"} %
   488   @{text "\<calL>(Y, r) \<equiv>"} %
   569   \noindent
   574   \noindent
   570   In this definition, we first delete all terms of the form @{text "(X, r)"} from @{text rhs};
   575   In this definition, we first delete all terms of the form @{text "(X, r)"} from @{text rhs};
   571   then we calculate the combined regular expressions for all @{text r} coming 
   576   then we calculate the combined regular expressions for all @{text r} coming 
   572   from the deleted @{text "(X, r)"}, and take the @{const STAR} of it;
   577   from the deleted @{text "(X, r)"}, and take the @{const STAR} of it;
   573   finally we append this regular expression to @{text rhs'}. It can be easily seen 
   578   finally we append this regular expression to @{text rhs'}. It can be easily seen 
   574   that this operation mimics Arden's lemma on the level of equations. To ensure
   579   that this operation mimics Arden's Lemma on the level of equations. To ensure
   575   the non-emptiness condition of Arden's lemma we say that a right-hand side is
   580   the non-emptiness condition of Arden's Lemma we say that a right-hand side is
   576   @{text ardenable} provided
   581   @{text ardenable} provided
   577 
   582 
   578   \begin{center}
   583   \begin{center}
   579   @{thm ardenable_def}
   584   @{thm ardenable_def}
   580   \end{center}
   585   \end{center}
   581 
   586 
   582   \noindent
   587   \noindent
   583   This allows us to prove a version of Arden's lemma on the level of equations.
   588   This allows us to prove a version of Arden's Lemma on the level of equations.
   584 
   589 
   585   \begin{lemma}\label{ardenable}
   590   \begin{lemma}\label{ardenable}
   586   Given an equation @{text "X = rhs"}.
   591   Given an equation @{text "X = rhs"}.
   587   If @{text "X = \<Union>\<calL> ` rhs"},
   592   If @{text "X = \<Union>\<calL> ` rhs"},
   588   @{thm (prem 2) Arden_keeps_eq}, and
   593   @{thm (prem 2) Arden_keeps_eq}, and
   589   @{thm (prem 3) Arden_keeps_eq}, then
   594   @{thm (prem 3) Arden_keeps_eq}, then
   590   @{text "X = \<Union>\<calL> ` (Arden X rhs)"}.
   595   @{text "X = \<Union>\<calL> ` (Arden X rhs)"}.
   591   \end{lemma}
   596   \end{lemma}
   592   
   597   
   593   \noindent
   598   \noindent
   594   The \emph{substitution-operation} takes an equation
   599   Our @{text ardenable} condition is slightly stronger than needed for applying Arden's Lemma,
       
   600   but we can still ensure that it holds troughout our algorithm of transforming equations
       
   601   into solved form. The \emph{substitution-operation} takes an equation
   595   of the form @{text "X = xrhs"} and substitutes it into the right-hand side @{text rhs}.
   602   of the form @{text "X = xrhs"} and substitutes it into the right-hand side @{text rhs}.
   596 
   603 
   597   \begin{center}
   604   \begin{center}
   598   \begin{tabular}{rc@ {\hspace{2mm}}r@ {\hspace{1mm}}l}
   605   \begin{tabular}{rc@ {\hspace{2mm}}r@ {\hspace{1mm}}l}
   599   @{thm (lhs) Subst_def} & @{text "\<equiv>"}~~\mbox{} & \multicolumn{2}{@ {\hspace{-2mm}}l}{@{text "let"}}\\ 
   606   @{thm (lhs) Subst_def} & @{text "\<equiv>"}~~\mbox{} & \multicolumn{2}{@ {\hspace{-2mm}}l}{@{text "let"}}\\ 
   708   The first two ensure that the equational system is always finite (number of equations
   715   The first two ensure that the equational system is always finite (number of equations
   709   and number of terms in each equation); the second makes sure the `meaning' of the 
   716   and number of terms in each equation); the second makes sure the `meaning' of the 
   710   equations is preserved under our transformations. The other properties are a bit more
   717   equations is preserved under our transformations. The other properties are a bit more
   711   technical, but are needed to get our proof through. Distinctness states that every
   718   technical, but are needed to get our proof through. Distinctness states that every
   712   equation in the system is distinct. @{text Ardenable} ensures that we can always
   719   equation in the system is distinct. @{text Ardenable} ensures that we can always
   713   apply the arden operation. 
   720   apply the @{text Arden} operation. 
   714   The last property states that every @{text rhs} can only contain equivalence classes
   721   The last property states that every @{text rhs} can only contain equivalence classes
   715   for which there is an equation. Therefore @{text lhss} is just the set containing 
   722   for which there is an equation. Therefore @{text lhss} is just the set containing 
   716   the first components of an equational system,
   723   the first components of an equational system,
   717   while @{text "rhss"} collects all equivalence classes @{text X} in the terms of the 
   724   while @{text "rhss"} collects all equivalence classes @{text X} in the terms of the 
   718   form @{term "Trn X r"}. That means formally @{thm (lhs) lhss_def}~@{text "\<equiv> {X | (X, rhs) \<in> ES}"} 
   725   form @{term "Trn X r"}. That means formally @{thm (lhs) lhss_def}~@{text "\<equiv> {X | (X, rhs) \<in> ES}"} 
   740   \begin{lemma}\label{iterone}
   747   \begin{lemma}\label{iterone}
   741   @{thm[mode=IfThen] iteration_step_invariant[where xrhs="rhs"]}
   748   @{thm[mode=IfThen] iteration_step_invariant[where xrhs="rhs"]}
   742   \end{lemma}
   749   \end{lemma}
   743 
   750 
   744   \begin{proof} 
   751   \begin{proof} 
   745   This boils down to choosing an equation @{text "Y = yrhs"} to be eliminated
   752   The argument boils down to choosing an equation @{text "Y = yrhs"} to be eliminated
   746   and to show that @{term "Subst_all (ES - {(Y, yrhs)}) Y (Arden Y yrhs)"} 
   753   and to show that @{term "Subst_all (ES - {(Y, yrhs)}) Y (Arden Y yrhs)"} 
   747   preserves the invariant.
   754   preserves the invariant.
   748   We prove this as follows:
   755   We prove this as follows:
   749 
   756 
   750   \begin{center}
   757   \begin{center}
   751   @{text "\<forall> ES."} @{thm (prem 1) Subst_all_satisfies_invariant} implies
   758   @{text "\<forall> ES."} @{thm (prem 1) Subst_all_satisfies_invariant} implies
   752   @{thm (concl) Subst_all_satisfies_invariant}
   759   @{thm (concl) Subst_all_satisfies_invariant}
   753   \end{center}
   760   \end{center}
   754 
   761 
   755   \noindent
   762   \noindent
   756   Finiteness is straightforward, as @{const Subst} and @{const Arden} operations 
   763   Finiteness is straightforward, as the @{const Subst} and @{const Arden} operations 
   757   keep the equational system finite. These operations also preserve soundness 
   764   keep the equational system finite. These operations also preserve soundness 
   758   and distinctness (we proved soundness for @{const Arden} in Lem.~\ref{ardenable}).
   765   and distinctness (we proved soundness for @{const Arden} in Lem.~\ref{ardenable}).
   759   The property @{text ardenable} is clearly preserved because the append-operation
   766   The property @{text ardenable} is clearly preserved because the append-operation
   760   cannot make a regular expression to match the empty string. Validity is
   767   cannot make a regular expression to match the empty string. Validity is
   761   given because @{const Arden} removes an equivalence class from @{text yrhs}
   768   given because @{const Arden} removes an equivalence class from @{text yrhs}
   827   that @{term "Solve X (Init (UNIV // \<approx>A))"} returns the equation @{text "X = rhs"},
   834   that @{term "Solve X (Init (UNIV // \<approx>A))"} returns the equation @{text "X = rhs"},
   828   and that the invariant holds for this equation. That means we 
   835   and that the invariant holds for this equation. That means we 
   829   know @{text "X = \<Union>\<calL> ` rhs"}. We further know that
   836   know @{text "X = \<Union>\<calL> ` rhs"}. We further know that
   830   this is equal to \mbox{@{text "\<Union>\<calL> ` (Arden X rhs)"}} using the properties of the 
   837   this is equal to \mbox{@{text "\<Union>\<calL> ` (Arden X rhs)"}} using the properties of the 
   831   invariant and Lem.~\ref{ardenable}. Using the validity property for the equation @{text "X = rhs"},
   838   invariant and Lem.~\ref{ardenable}. Using the validity property for the equation @{text "X = rhs"},
   832   we can infer that @{term "rhss rhs \<subseteq> {X}"} and because the arden operation
   839   we can infer that @{term "rhss rhs \<subseteq> {X}"} and because the @{text Arden} operation
   833   removes that @{text X} from @{text rhs}, that @{term "rhss (Arden X rhs) = {}"}.
   840   removes that @{text X} from @{text rhs}, that @{term "rhss (Arden X rhs) = {}"}.
   834   This means the right-hand side @{term "Arden X rhs"} can only consist of terms of the form @{term "Lam r"}.
   841   This means the right-hand side @{term "Arden X rhs"} can only consist of terms of the form @{term "Lam r"}.
   835   So we can collect those (finitely many) regular expressions @{text rs} and have @{term "X = L (\<Uplus>rs)"}.
   842   So we can collect those (finitely many) regular expressions @{text rs} and have @{term "X = L (\<Uplus>rs)"}.
   836   With this we can conclude the proof.\qed
   843   With this we can conclude the proof.\qed
   837   \end{proof}
   844   \end{proof}
  1102                node[midway, below=0.5em]{@{text "(z - z') \<in> B"}};
  1109                node[midway, below=0.5em]{@{text "(z - z') \<in> B"}};
  1103   \end{tikzpicture}}
  1110   \end{tikzpicture}}
  1104   \end{center}
  1111   \end{center}
  1105   %
  1112   %
  1106   \noindent
  1113   \noindent
  1107   Either there is a prefix of @{text x} in @{text A} and the rest is in @{text B},
  1114   Either there is a prefix of @{text x} in @{text A} and the rest is in @{text B} (first picture),
  1108   or @{text x} and a prefix of @{text "z"} is in @{text A} and the rest in @{text B}.
  1115   or @{text x} and a prefix of @{text "z"} is in @{text A} and the rest in @{text B} (second picture).
  1109   In both cases we have to show that @{term "y @ z \<in> A ;; B"}. For this we use the 
  1116   In both cases we have to show that @{term "y @ z \<in> A ;; B"}. For this we use the 
  1110   following tagging-function
  1117   following tagging-function
  1111   %
  1118   %
  1112   \begin{center}
  1119   \begin{center}
  1113   @{thm tag_str_SEQ_def[where ?L1.0="A" and ?L2.0="B", THEN meta_eq_app]}
  1120   @{thm tag_str_SEQ_def[where ?L1.0="A" and ?L2.0="B", THEN meta_eq_app]}
  1345   state leading up to that state. Usually, however, the states are characterised as the
  1352   state leading up to that state. Usually, however, the states are characterised as the
  1346   strings starting from that state leading to the terminal states.  The first
  1353   strings starting from that state leading to the terminal states.  The first
  1347   choice has consequences about how the initial equational system is set up. We have
  1354   choice has consequences about how the initial equational system is set up. We have
  1348   the $\lambda$-term on our `initial state', while Brzozowski has it on the
  1355   the $\lambda$-term on our `initial state', while Brzozowski has it on the
  1349   terminal states. This means we also need to reverse the direction of Arden's
  1356   terminal states. This means we also need to reverse the direction of Arden's
  1350   lemma.
  1357   Lemma.
  1351 
  1358 
  1352   We briefly considered using the method Brzozowski presented in the Appendix
  1359   We briefly considered using the method Brzozowski presented in the Appendix
  1353   of~\cite{Brzozowski64} in order to prove the second direction of the
  1360   of~\cite{Brzozowski64} in order to prove the second direction of the
  1354   Myhill-Nerode theorem. There he calculates the derivatives for regular
  1361   Myhill-Nerode theorem. There he calculates the derivatives for regular
  1355   expressions and shows that there can be only finitely many of them (if regarded equal 
  1362   expressions and shows that there can be only finitely many of them with 
  1356   modulo ACI). We could
  1363   respect to a language (if regarded equal modulo ACI). We could
  1357   have used as the tag of a string @{text s} the derivative of a regular expression
  1364   have used as the tag of a string @{text s} the set of derivatives of a regular expression
  1358   generated with respect to @{text s}.  Using the fact that two strings are
  1365   generated by a language.  Using the fact that two strings are
  1359   Myhill-Nerode related whenever their derivative is the same, together with
  1366   Myhill-Nerode related whenever their derivative is the same, together with
  1360   the fact that there are only finitely many derivatives for a regular
  1367   the fact that there are only finitely such derivatives
  1361   expression would give us a similar argument as ours. However it seems not so easy to
  1368   would give us a similar argument as ours. However it seems not so easy to
  1362   calculate the derivatives and then to count them. Therefore we preferred our
  1369   calculate the set of derivatives modulo ACI and then to count them. Therefore we preferred our
  1363   direct method of using tagging-functions. This
  1370   direct method of using tagging-functions. This
  1364   is also where our method shines, because we can completely side-step the
  1371   is also where our method shines, because we can completely side-step the
  1365   standard argument \cite{Kozen97} where automata need to be composed, which
  1372   standard argument \cite{Kozen97} where automata need to be composed, which
  1366   as stated in the Introduction is not so convenient to formalise in a 
  1373   as stated in the Introduction is not so convenient to formalise in a 
  1367   HOL-based theorem prover. However, it is also the direction where we had to 
  1374   HOL-based theorem prover. However, it is also the direction where we had to