Paper/Paper.thy
changeset 116 342983676c8f
parent 115 c5f138b5fc88
child 117 22ba25b808c8
equal deleted inserted replaced
115:c5f138b5fc88 116:342983676c8f
   269 
   269 
   270   Central to our proof will be the solution of equational systems
   270   Central to our proof will be the solution of equational systems
   271   involving equivalence classes of languages. For this we will use Arden's lemma \cite{Brzozowski64},
   271   involving equivalence classes of languages. For this we will use Arden's lemma \cite{Brzozowski64},
   272   which solves equations of the form @{term "X = A ;; X \<union> B"} provided
   272   which solves equations of the form @{term "X = A ;; X \<union> B"} provided
   273   @{term "[] \<notin> A"}. However we will need the following `reverse' 
   273   @{term "[] \<notin> A"}. However we will need the following `reverse' 
   274   version of Arden's lemma.
   274   version of Arden's lemma (`reverse' in the sense of changing the order of @{text "\<cdot>"}).
   275 
   275 
   276   \begin{lemma}[Reverse Arden's Lemma]\label{arden}\mbox{}\\
   276   \begin{lemma}[Reverse Arden's Lemma]\label{arden}\mbox{}\\
   277   If @{thm (prem 1) arden} then
   277   If @{thm (prem 1) arden} then
   278   @{thm (lhs) arden} if and only if
   278   @{thm (lhs) arden} if and only if
   279   @{thm (rhs) arden}.
   279   @{thm (rhs) arden}.
   736   @{thm (concl) Subst_all_satisfies_invariant}
   736   @{thm (concl) Subst_all_satisfies_invariant}
   737   \end{center}
   737   \end{center}
   738 
   738 
   739   \noindent
   739   \noindent
   740   Finiteness is straightforward, as @{const Subst} and @{const Arden} operations 
   740   Finiteness is straightforward, as @{const Subst} and @{const Arden} operations 
   741   keep the equational system finite. These operation also preserve soundness 
   741   keep the equational system finite. These operations also preserve soundness 
   742   and distinctness (we proved soundness for @{const Arden} in Lem.~\ref{ardenable}).
   742   and distinctness (we proved soundness for @{const Arden} in Lem.~\ref{ardenable}).
   743   The property ardenable is clearly preserved because the append-operation
   743   The property ardenable is clearly preserved because the append-operation
   744   cannot make a regular expression to match the empty string. Validity is
   744   cannot make a regular expression to match the empty string. Validity is
   745   given because @{const Arden} removes an equivalence class from @{text yrhs}
   745   given because @{const Arden} removes an equivalence class from @{text yrhs}
   746   and then @{const Subst_all} removes @{text Y} from the equational system.
   746   and then @{const Subst_all} removes @{text Y} from the equational system.
   846 
   846 
   847 
   847 
   848 section {* Myhill-Nerode, Second Part *}
   848 section {* Myhill-Nerode, Second Part *}
   849 
   849 
   850 text {*
   850 text {*
       
   851   We will prove in this section the second part of the Myhill-Nerode
       
   852   theorem. It can be formulated in our setting as follows.
   851 
   853 
   852   \begin{theorem}
   854   \begin{theorem}
   853   Given @{text "r"} is a regular expressions, then @{thm Myhill_Nerode2}.
   855   Given @{text "r"} is a regular expressions, then @{thm Myhill_Nerode2}.
   854   \end{theorem}  
   856   \end{theorem}  
   855 
   857 
   856   \begin{proof}
   858   \noindent
   857   By induction on the structure of @{text r}. The cases for @{const NULL}, @{const EMPTY}
   859   The proof will be by induction on the structure of @{text r}. It turns out
   858   and @{const CHAR} are straightforward, because we can easily establish
   860   the base cases are straightforward.
       
   861 
       
   862 
       
   863   \begin{proof}[Base Cases]
       
   864   The cases for @{const NULL}, @{const EMPTY} and @{const CHAR} are routine, because 
       
   865   we can easily establish
   859 
   866 
   860   \begin{center}
   867   \begin{center}
   861   \begin{tabular}{l}
   868   \begin{tabular}{l}
   862   @{thm quot_null_eq}\\
   869   @{thm quot_null_eq}\\
   863   @{thm quot_empty_subset}\\
   870   @{thm quot_empty_subset}\\
   864   @{thm quot_char_subset}
   871   @{thm quot_char_subset}
   865   \end{tabular}
   872   \end{tabular}
   866   \end{center}
   873   \end{center}
   867 
   874 
       
   875   \noindent
       
   876   hold, which shows that @{term "UNIV // \<approx>(L r)"} must be finite.\qed
   868   \end{proof}
   877   \end{proof}
   869 
   878 
       
   879   \noindent
       
   880   Much more interesting are the inductive cases, which seem hard to be solved 
       
   881   directly. The reader is invited to try. Our method will rely on some
       
   882   \emph{tagging} functions of strings. Given the inductive hypothesis, it will 
       
   883   be easy to prove that the range of these tagging functions is finite.
   870 
   884 
   871   @{thm tag_str_ALT_def[where ?L1.0="A" and ?L2.0="B"]}
   885   @{thm tag_str_ALT_def[where ?L1.0="A" and ?L2.0="B"]}
   872 
   886 
   873   @{thm tag_str_SEQ_def[where ?L1.0="A" and ?L2.0="B"]}
   887   @{thm tag_str_SEQ_def[where ?L1.0="A" and ?L2.0="B"]}
   874  
   888  
   906   partitions.  Proving the existence of such a regular expression via automata would 
   920   partitions.  Proving the existence of such a regular expression via automata would 
   907   be quite involved. It includes the
   921   be quite involved. It includes the
   908   steps: regular expression @{text "\<Rightarrow>"} non-deterministic automaton @{text
   922   steps: regular expression @{text "\<Rightarrow>"} non-deterministic automaton @{text
   909   "\<Rightarrow>"} deterministic automaton @{text "\<Rightarrow>"} complement automaton @{text "\<Rightarrow>"}
   923   "\<Rightarrow>"} deterministic automaton @{text "\<Rightarrow>"} complement automaton @{text "\<Rightarrow>"}
   910   regular expression.
   924   regular expression.
       
   925 
       
   926   While regular expressions are convenient in formalisations, they have some
       
   927   limitations. One is that there seems to be no notion of a minimal regular
       
   928   expression, like there is for automata. For an implementation of a simple
       
   929   regular expression matcher, whose correctness has been formally
       
   930   established, we refer the reader to Owens and Slind \cite{OwensSlind08}.
       
   931 
   911 
   932 
   912   Our formalisation consists of ??? lines of Isabelle/Isar code for the first
   933   Our formalisation consists of ??? lines of Isabelle/Isar code for the first
   913   direction and ??? for the second. While this might be seen as too large to
   934   direction and ??? for the second. While this might be seen as too large to
   914   count as a concise proof pearl, this should be seen in the context of the
   935   count as a concise proof pearl, this should be seen in the context of the
   915   work done by Constable at al \cite{Constable00} who formalised the
   936   work done by Constable at al \cite{Constable00} who formalised the
   958   is also where our method shines, because we can completely side-step the
   979   is also where our method shines, because we can completely side-step the
   959   standard argument \cite{Kozen97} where automata need to be composed, which
   980   standard argument \cite{Kozen97} where automata need to be composed, which
   960   as stated in the Introduction is not so convenient to formalise in a 
   981   as stated in the Introduction is not so convenient to formalise in a 
   961   HOL-based theorem prover.
   982   HOL-based theorem prover.
   962 
   983 
   963   While regular expressions are convenient in formalisations, they have some
       
   964   limitations. One is that there seems to be no notion of a minimal regular
       
   965   expression, like there is for automata. For an implementation of a simple
       
   966   regular expression matcher, whose correctness has been formally
       
   967   established, we refer the reader to Owens and Slind \cite{OwensSlind08}.
       
   968 
       
   969 *}
   984 *}
   970 
   985 
   971 
   986 
   972 (*<*)
   987 (*<*)
   973 end
   988 end