Paper/Paper.thy
changeset 154 7c68b9ad4486
parent 149 e122cb146ecc
child 156 fd39492b187c
equal deleted inserted replaced
153:e7e4e490326b 154:7c68b9ad4486
    67   the accepting and non-accepting states in the corresponding automaton to
    67   the accepting and non-accepting states in the corresponding automaton to
    68   obtain an automaton for the complement language.  The problem, however, lies with
    68   obtain an automaton for the complement language.  The problem, however, lies with
    69   formalising such reasoning in a HOL-based theorem prover, in our case
    69   formalising such reasoning in a HOL-based theorem prover, in our case
    70   Isabelle/HOL. Automata are built up from states and transitions that 
    70   Isabelle/HOL. Automata are built up from states and transitions that 
    71   need to be represented as graphs, matrices or functions, none
    71   need to be represented as graphs, matrices or functions, none
    72   of which can be defined as inductive datatype. 
    72   of which can be defined as an inductive datatype. 
    73 
    73 
    74   In case of graphs and matrices, this means we have to build our own
    74   In case of graphs and matrices, this means we have to build our own
    75   reasoning infrastructure for them, as neither Isabelle/HOL nor HOL4 nor
    75   reasoning infrastructure for them, as neither Isabelle/HOL nor HOL4 nor
    76   HOLlight support them with libraries. Even worse, reasoning about graphs and
    76   HOLlight support them with libraries. Even worse, reasoning about graphs and
    77   matrices can be a real hassle in HOL-based theorem provers.  Consider for
    77   matrices can be a real hassle in HOL-based theorem provers.  Consider for
   143 
   143 
   144   \noindent
   144   \noindent
   145   changes the type---the disjoint union is not a set, but a set of pairs. 
   145   changes the type---the disjoint union is not a set, but a set of pairs. 
   146   Using this definition for disjoint union means we do not have a single type for automata
   146   Using this definition for disjoint union means we do not have a single type for automata
   147   and hence will not be able to state certain properties about \emph{all}
   147   and hence will not be able to state certain properties about \emph{all}
   148   automata, since there is no type quantification available in HOL. An
   148   automata, since there is no type quantification available in HOL (unlike in Coq, for example). An
   149   alternative, which provides us with a single type for automata, is to give every 
   149   alternative, which provides us with a single type for automata, is to give every 
   150   state node an identity, for example a natural
   150   state node an identity, for example a natural
   151   number, and then be careful to rename these identities apart whenever
   151   number, and then be careful to rename these identities apart whenever
   152   connecting two automata. This results in clunky proofs
   152   connecting two automata. This results in clunky proofs
   153   establishing that properties are invariant under renaming. Similarly,
   153   establishing that properties are invariant under renaming. Similarly,
   375 
   375 
   376 text {*
   376 text {*
   377   The key definition in the Myhill-Nerode theorem is the
   377   The key definition in the Myhill-Nerode theorem is the
   378   \emph{Myhill-Nerode relation}, which states that w.r.t.~a language two 
   378   \emph{Myhill-Nerode relation}, which states that w.r.t.~a language two 
   379   strings are related, provided there is no distinguishing extension in this
   379   strings are related, provided there is no distinguishing extension in this
   380   language. This can be defined as tertiary relation.
   380   language. This can be defined as a tertiary relation.
   381 
   381 
   382   \begin{definition}[Myhill-Nerode Relation] Given a language @{text A}, two strings @{text x} and
   382   \begin{definition}[Myhill-Nerode Relation] Given a language @{text A}, two strings @{text x} and
   383   @{text y} are Myhill-Nerode related provided
   383   @{text y} are Myhill-Nerode related provided
   384   \begin{center}
   384   \begin{center}
   385   @{thm str_eq_def[simplified str_eq_rel_def Pair_Collect]}
   385   @{thm str_eq_def[simplified str_eq_rel_def Pair_Collect]}
   571   then we calculate the combined regular expressions for all @{text r} coming 
   571   then we calculate the combined regular expressions for all @{text r} coming 
   572   from the deleted @{text "(X, r)"}, and take the @{const STAR} of it;
   572   from the deleted @{text "(X, r)"}, and take the @{const STAR} of it;
   573   finally we append this regular expression to @{text rhs'}. It can be easily seen 
   573   finally we append this regular expression to @{text rhs'}. It can be easily seen 
   574   that this operation mimics Arden's lemma on the level of equations. To ensure
   574   that this operation mimics Arden's lemma on the level of equations. To ensure
   575   the non-emptiness condition of Arden's lemma we say that a right-hand side is
   575   the non-emptiness condition of Arden's lemma we say that a right-hand side is
   576   \emph{ardenable} provided
   576   @{text ardenable} provided
   577 
   577 
   578   \begin{center}
   578   \begin{center}
   579   @{thm ardenable_def}
   579   @{thm ardenable_def}
   580   \end{center}
   580   \end{center}
   581 
   581 
   625   \end{center}
   625   \end{center}
   626 
   626 
   627   \noindent
   627   \noindent
   628   Finally, we can define how an equational system should be solved. For this 
   628   Finally, we can define how an equational system should be solved. For this 
   629   we will need to iterate the process of eliminating equations until only one equation
   629   we will need to iterate the process of eliminating equations until only one equation
   630   will be left in the system. However, we not just want to have any equation
   630   will be left in the system. However, we do not just want to have any equation
   631   as being the last one, but the one involving the equivalence class for 
   631   as being the last one, but the one involving the equivalence class for 
   632   which we want to calculate the regular 
   632   which we want to calculate the regular 
   633   expression. Let us suppose this equivalence class is @{text X}. 
   633   expression. Let us suppose this equivalence class is @{text X}. 
   634   Since @{text X} is the one to be solved, in every iteration step we have to pick an
   634   Since @{text X} is the one to be solved, in every iteration step we have to pick an
   635   equation to be eliminated that is different from @{text X}. In this way 
   635   equation to be eliminated that is different from @{text X}. In this way 
   680   This principle states that given an invariant (which we will specify below) 
   680   This principle states that given an invariant (which we will specify below) 
   681   we can prove a property
   681   we can prove a property
   682   @{text "P"} involving @{const Solve}. For this we have to discharge the following
   682   @{text "P"} involving @{const Solve}. For this we have to discharge the following
   683   proof obligations: first the
   683   proof obligations: first the
   684   initial equational system satisfies the invariant; second the iteration
   684   initial equational system satisfies the invariant; second the iteration
   685   step @{text "Iter"} preserves the the invariant as long as the condition @{term Cond} holds;
   685   step @{text "Iter"} preserves the invariant as long as the condition @{term Cond} holds;
   686   third @{text "Iter"} decreases the termination order, and fourth that
   686   third @{text "Iter"} decreases the termination order, and fourth that
   687   once the condition does not hold anymore then the property @{text P} must hold.
   687   once the condition does not hold anymore then the property @{text P} must hold.
   688 
   688 
   689   The property @{term P} in our proof will state that @{term "Solve X (Init (UNIV // \<approx>A))"}
   689   The property @{term P} in our proof will state that @{term "Solve X (Init (UNIV // \<approx>A))"}
   690   returns with a single equation @{text "X = xrhs"} for some @{text "xrhs"}, and
   690   returns with a single equation @{text "X = xrhs"} for some @{text "xrhs"}, and
   707   \noindent
   707   \noindent
   708   The first two ensure that the equational system is always finite (number of equations
   708   The first two ensure that the equational system is always finite (number of equations
   709   and number of terms in each equation); the second makes sure the `meaning' of the 
   709   and number of terms in each equation); the second makes sure the `meaning' of the 
   710   equations is preserved under our transformations. The other properties are a bit more
   710   equations is preserved under our transformations. The other properties are a bit more
   711   technical, but are needed to get our proof through. Distinctness states that every
   711   technical, but are needed to get our proof through. Distinctness states that every
   712   equation in the system is distinct. Ardenable ensures that we can always
   712   equation in the system is distinct. @{text Ardenable} ensures that we can always
   713   apply the arden operation. 
   713   apply the arden operation. 
   714   The last property states that every @{text rhs} can only contain equivalence classes
   714   The last property states that every @{text rhs} can only contain equivalence classes
   715   for which there is an equation. Therefore @{text lhss} is just the set containing 
   715   for which there is an equation. Therefore @{text lhss} is just the set containing 
   716   the first components of an equational system,
   716   the first components of an equational system,
   717   while @{text "rhss"} collects all equivalence classes @{text X} in the terms of the 
   717   while @{text "rhss"} collects all equivalence classes @{text X} in the terms of the 
   727   \end{lemma}
   727   \end{lemma}
   728 
   728 
   729   \begin{proof}
   729   \begin{proof}
   730   Finiteness is given by the assumption and the way how we set up the 
   730   Finiteness is given by the assumption and the way how we set up the 
   731   initial equational system. Soundness is proved in Lem.~\ref{inv}. Distinctness
   731   initial equational system. Soundness is proved in Lem.~\ref{inv}. Distinctness
   732   follows from the fact that the equivalence classes are disjoint. The ardenable
   732   follows from the fact that the equivalence classes are disjoint. The @{text ardenable}
   733   property also follows from the setup of the initial equational system, as does 
   733   property also follows from the setup of the initial equational system, as does 
   734   validity.\qed
   734   validity.\qed
   735   \end{proof}
   735   \end{proof}
   736 
   736 
   737   \noindent
   737   \noindent
   754 
   754 
   755   \noindent
   755   \noindent
   756   Finiteness is straightforward, as @{const Subst} and @{const Arden} operations 
   756   Finiteness is straightforward, as @{const Subst} and @{const Arden} operations 
   757   keep the equational system finite. These operations also preserve soundness 
   757   keep the equational system finite. These operations also preserve soundness 
   758   and distinctness (we proved soundness for @{const Arden} in Lem.~\ref{ardenable}).
   758   and distinctness (we proved soundness for @{const Arden} in Lem.~\ref{ardenable}).
   759   The property ardenable is clearly preserved because the append-operation
   759   The property @{text ardenable} is clearly preserved because the append-operation
   760   cannot make a regular expression to match the empty string. Validity is
   760   cannot make a regular expression to match the empty string. Validity is
   761   given because @{const Arden} removes an equivalence class from @{text yrhs}
   761   given because @{const Arden} removes an equivalence class from @{text yrhs}
   762   and then @{const Subst_all} removes @{text Y} from the equational system.
   762   and then @{const Subst_all} removes @{text Y} from the equational system.
   763   Having proved the implication above, we can instantiate @{text "ES"} with @{text "ES - {(Y, yrhs)}"}
   763   Having proved the implication above, we can instantiate @{text "ES"} with @{text "ES - {(Y, yrhs)}"}
   764   which matches with our proof-obligation of @{const "Subst_all"}. Since
   764   which matches with our proof-obligation of @{const "Subst_all"}. Since
   830   this is equal to \mbox{@{text "\<Union>\<calL> ` (Arden X rhs)"}} using the properties of the 
   830   this is equal to \mbox{@{text "\<Union>\<calL> ` (Arden X rhs)"}} using the properties of the 
   831   invariant and Lem.~\ref{ardenable}. Using the validity property for the equation @{text "X = rhs"},
   831   invariant and Lem.~\ref{ardenable}. Using the validity property for the equation @{text "X = rhs"},
   832   we can infer that @{term "rhss rhs \<subseteq> {X}"} and because the arden operation
   832   we can infer that @{term "rhss rhs \<subseteq> {X}"} and because the arden operation
   833   removes that @{text X} from @{text rhs}, that @{term "rhss (Arden X rhs) = {}"}.
   833   removes that @{text X} from @{text rhs}, that @{term "rhss (Arden X rhs) = {}"}.
   834   This means the right-hand side @{term "Arden X rhs"} can only consist of terms of the form @{term "Lam r"}.
   834   This means the right-hand side @{term "Arden X rhs"} can only consist of terms of the form @{term "Lam r"}.
   835   So we can collect those (finitely many) regular expressions and have @{term "X = L (\<Uplus>rs)"}.
   835   So we can collect those (finitely many) regular expressions @{text rs} and have @{term "X = L (\<Uplus>rs)"}.
   836   With this we can conclude the proof.\qed
   836   With this we can conclude the proof.\qed
   837   \end{proof}
   837   \end{proof}
   838 
   838 
   839   \noindent
   839   \noindent
   840   Lem.~\ref{every_eqcl_has_reg} allows us to finally give a proof for the first direction
   840   Lem.~\ref{every_eqcl_has_reg} allows us to finally give a proof for the first direction
   891   \noindent
   891   \noindent
   892   hold, which shows that @{term "UNIV // \<approx>(L r)"} must be finite.\qed
   892   hold, which shows that @{term "UNIV // \<approx>(L r)"} must be finite.\qed
   893   \end{proof}
   893   \end{proof}
   894 
   894 
   895   \noindent
   895   \noindent
   896   Much more interesting, however, are the inductive cases. They seem hard to be solved 
   896   Much more interesting, however, are the inductive cases. They seem hard to solve 
   897   directly. The reader is invited to try. 
   897   directly. The reader is invited to try. 
   898 
   898 
   899   Our proof will rely on some
   899   Our proof will rely on some
   900   \emph{tagging-functions} defined over strings. Given the inductive hypothesis, it will 
   900   \emph{tagging-functions} defined over strings. Given the inductive hypothesis, it will 
   901   be easy to prove that the \emph{range} of these tagging-functions is finite
   901   be easy to prove that the \emph{range} of these tagging-functions is finite
  1272 text {*
  1272 text {*
  1273   In this paper we took the view that a regular language is one where there
  1273   In this paper we took the view that a regular language is one where there
  1274   exists a regular expression that matches all of its strings. Regular
  1274   exists a regular expression that matches all of its strings. Regular
  1275   expressions can conveniently be defined as a datatype in HOL-based theorem
  1275   expressions can conveniently be defined as a datatype in HOL-based theorem
  1276   provers. For us it was therefore interesting to find out how far we can push
  1276   provers. For us it was therefore interesting to find out how far we can push
  1277   this point of view. We have established both directions of the Myhill-Nerode
  1277   this point of view. We have established in Isabelle/HOL both directions 
  1278   theorem.
  1278   of the Myhill-Nerode theorem.
  1279   %
  1279   %
  1280   \begin{theorem}[The Myhill-Nerode Theorem]\mbox{}\\
  1280   \begin{theorem}[The Myhill-Nerode Theorem]\mbox{}\\
  1281   A language @{text A} is regular if and only if @{thm (rhs) Myhill_Nerode}.
  1281   A language @{text A} is regular if and only if @{thm (rhs) Myhill_Nerode}.
  1282   \end{theorem}
  1282   \end{theorem}
  1283   %
  1283   %
  1350   lemma.
  1350   lemma.
  1351 
  1351 
  1352   We briefly considered using the method Brzozowski presented in the Appendix
  1352   We briefly considered using the method Brzozowski presented in the Appendix
  1353   of~\cite{Brzozowski64} in order to prove the second direction of the
  1353   of~\cite{Brzozowski64} in order to prove the second direction of the
  1354   Myhill-Nerode theorem. There he calculates the derivatives for regular
  1354   Myhill-Nerode theorem. There he calculates the derivatives for regular
  1355   expressions and shows that there can be only finitely many of them. We could
  1355   expressions and shows that there can be only finitely many of them (if regarded equal 
       
  1356   modulo ACI). We could
  1356   have used as the tag of a string @{text s} the derivative of a regular expression
  1357   have used as the tag of a string @{text s} the derivative of a regular expression
  1357   generated with respect to @{text s}.  Using the fact that two strings are
  1358   generated with respect to @{text s}.  Using the fact that two strings are
  1358   Myhill-Nerode related whenever their derivative is the same, together with
  1359   Myhill-Nerode related whenever their derivative is the same, together with
  1359   the fact that there are only finitely many derivatives for a regular
  1360   the fact that there are only finitely many derivatives for a regular
  1360   expression would give us a similar argument as ours. However it seems not so easy to
  1361   expression would give us a similar argument as ours. However it seems not so easy to