Paper/Paper.thy
changeset 93 2aa3756dcc9f
parent 92 a9ebc410a5c8
child 94 5b12cd0a3b3c
equal deleted inserted replaced
92:a9ebc410a5c8 93:2aa3756dcc9f
   144   connecting two automata represented as matrices results in very adhoc
   144   connecting two automata represented as matrices results in very adhoc
   145   constructions, which are not pleasant to reason about.
   145   constructions, which are not pleasant to reason about.
   146 
   146 
   147   Functions are much better supported in Isabelle/HOL, but they still lead to similar
   147   Functions are much better supported in Isabelle/HOL, but they still lead to similar
   148   problems as with graphs.  Composing, for example, two non-deterministic automata in parallel
   148   problems as with graphs.  Composing, for example, two non-deterministic automata in parallel
   149   poses again the problem of how to implement disjoint unions. Nipkow \cite{Nipkow98} 
   149   requires also the formalisation of disjoint unions. Nipkow \cite{Nipkow98} 
   150   dismisses the option of using identities, because it leads to ``messy proofs''. He
   150   dismisses the option of using identities, because it leads according to him to ``messy proofs''. He
   151   opts for a variant of \eqref{disjointunion} using bitlists, but writes
   151   opts for a variant of \eqref{disjointunion} using bitlists, but writes
   152 
   152 
   153   \begin{quote}
   153   \begin{quote}
   154   \it ``If the reader finds the above treatment in terms of bit lists revoltingly
   154   \it%
   155   concrete, \phantom{``}I cannot disagree.''
   155   \begin{tabular}{@ {}l@ {}p{0.88\textwidth}@ {}}
       
   156   `` & If the reader finds the above treatment in terms of bit lists revoltingly
       
   157        concrete, I cannot disagree.''\\
       
   158   `` & All lemmas appear obvious given a picture of the composition of automata\ldots
       
   159        Yet their proofs require a painful amount of detail.''
       
   160   \end{tabular}
   156   \end{quote}
   161   \end{quote}
   157   
   162   
   158   \noindent
   163   \noindent
   159   Moreover, it is not so clear how to conveniently impose a finiteness condition 
   164   Moreover, it is not so clear how to conveniently impose a finiteness condition 
   160   upon functions in order to represent \emph{finite} automata. The best is
   165   upon functions in order to represent \emph{finite} automata. The best is
   161   probably to resort to more advanced reasoning frameworks, such as \emph{locales}
   166   probably to resort to more advanced reasoning frameworks, such as \emph{locales}
   162   or \emph{type classes},
   167   or \emph{type classes},
   163   which are not avaiable in all HOL-based theorem provers.
   168   which are not avaiable in \emph{all} HOL-based theorem provers.
   164 
   169 
   165   Because of these problems to do with representing automata, there seems
   170   Because of these problems to do with representing automata, there seems
   166   to be no substantial formalisation of automata theory and regular languages 
   171   to be no substantial formalisation of automata theory and regular languages 
   167   carried out in a HOL-based theorem prover. Nipkow establishes in 
   172   carried out in HOL-based theorem provers. Nipkow establishes in 
   168   \cite{Nipkow98} the link between regular expressions and automata in
   173   \cite{Nipkow98} the link between regular expressions and automata in
   169   the restricted context of lexing. The only larger formalisations of automata theory 
   174   the restricted context of lexing. The only larger formalisations of automata theory 
   170   are carried out in Nuprl \cite{Constable00} and in Coq (for example 
   175   are carried out in Nuprl \cite{Constable00} and in Coq (for example 
   171   \cite{Filliatre97}).
   176   \cite{Filliatre97}).
   172   
   177   
   208   Strings in Isabelle/HOL are lists of characters with the \emph{empty string}
   213   Strings in Isabelle/HOL are lists of characters with the \emph{empty string}
   209   being represented by the empty list, written @{term "[]"}.  \emph{Languages}
   214   being represented by the empty list, written @{term "[]"}.  \emph{Languages}
   210   are sets of strings. The language containing all strings is written in
   215   are sets of strings. The language containing all strings is written in
   211   Isabelle/HOL as @{term "UNIV::string set"}. The concatenation of two languages 
   216   Isabelle/HOL as @{term "UNIV::string set"}. The concatenation of two languages 
   212   is written @{term "A ;; B"} and a language raised to the power @{text n} is written 
   217   is written @{term "A ;; B"} and a language raised to the power @{text n} is written 
   213   @{term "A \<up> n"}. Their definitions are
   218   @{term "A \<up> n"}. They are defined as usual
   214 
   219 
   215   \begin{center}
   220   \begin{center}
   216   @{thm Seq_def[THEN eq_reflection, where A1="A" and B1="B"]}
   221   @{thm Seq_def[THEN eq_reflection, where A1="A" and B1="B"]}
   217   \hspace{7mm}
   222   \hspace{7mm}
   218   @{thm pow.simps(1)[THEN eq_reflection, where A1="A"]}
   223   @{thm pow.simps(1)[THEN eq_reflection, where A1="A"]}
   245   as @{text "{y | y \<approx> x}"}.
   250   as @{text "{y | y \<approx> x}"}.
   246 
   251 
   247 
   252 
   248   Central to our proof will be the solution of equational systems
   253   Central to our proof will be the solution of equational systems
   249   involving sets of languages. For this we will use Arden's lemma \cite{Brzozowski64}
   254   involving sets of languages. For this we will use Arden's lemma \cite{Brzozowski64}
   250   which solves equations of the form @{term "X = A ;; X \<union> B"} in case
   255   which solves equations of the form @{term "X = A ;; X \<union> B"} provided
   251   @{term "[] \<notin> A"}. However we will need the following ``reverse'' 
   256   @{term "[] \<notin> A"}. However we will need the following ``reverse'' 
   252   version of Arden's lemma.
   257   version of Arden's lemma.
   253 
   258 
   254   \begin{lemma}[Reverse Arden's Lemma]\label{arden}\mbox{}\\
   259   \begin{lemma}[Reverse Arden's Lemma]\label{arden}\mbox{}\\
   255   If @{thm (prem 1) arden} then
   260   If @{thm (prem 1) arden} then
   323   \end{center}
   328   \end{center}
   324 
   329 
   325   Given a set of regular expressions @{text rs}, we will make use of the operation of generating 
   330   Given a set of regular expressions @{text rs}, we will make use of the operation of generating 
   326   a regular expression that matches all languages of @{text rs}. We only need to know the existence
   331   a regular expression that matches all languages of @{text rs}. We only need to know the existence
   327   of such a regular expression and therefore we use Isabelle/HOL's @{const "fold_graph"} and Hilbert's
   332   of such a regular expression and therefore we use Isabelle/HOL's @{const "fold_graph"} and Hilbert's
   328   @{text "\<epsilon>"} to define @{term "\<Uplus>rs"}. This function, roughly speaking, folds @{const ALT} over the 
   333   @{text "\<epsilon>"} to define @{term "\<Uplus>rs"}. This operation, roughly speaking, folds @{const ALT} over the 
   329   set @{text rs} with @{const NULL} for the empty set. We can prove that for finite sets @{text rs}
   334   set @{text rs} with @{const NULL} for the empty set. We can prove that for finite sets @{text rs}
   330 
   335 
   331   \begin{center}
   336   \begin{center}
   332   @{thm (lhs) folds_alt_simp}@{text "= \<Union> (\<calL> ` rs)"}
   337   @{thm (lhs) folds_alt_simp} @{text "= \<Union> (\<calL> ` rs)"}
   333   \end{center}
   338   \end{center}
   334 
   339 
   335   \noindent
   340   \noindent
   336   holds, whereby @{text "\<calL> ` rs"} stands for the 
   341   holds, whereby @{text "\<calL> ` rs"} stands for the 
   337   image of the set @{text rs} under function @{text "\<calL>"}.
   342   image of the set @{text rs} under function @{text "\<calL>"}.
   338   
       
   339 
       
   340 *}
   343 *}
   341 
   344 
   342 section {* Finite Partitions Imply Regularity of a Language *}
   345 section {* Finite Partitions Imply Regularity of a Language *}
   343 
   346 
   344 text {*
   347 text {*
   352   \end{definition}
   355   \end{definition}
   353 
   356 
   354   \noindent
   357   \noindent
   355   It is easy to see that @{term "\<approx>A"} is an equivalence relation, which
   358   It is easy to see that @{term "\<approx>A"} is an equivalence relation, which
   356   partitions the set of all strings, @{text "UNIV"}, into a set of disjoint
   359   partitions the set of all strings, @{text "UNIV"}, into a set of disjoint
   357   equivalence classes. An example is the regular language containing just
   360   equivalence classes. Let us give an example: consider the regular language containing just
   358   the string @{text "[c]"}. The relation @{term "\<approx>({[c]})"} partitions @{text UNIV}
   361   the string @{text "[c]"}. The relation @{term "\<approx>({[c]})"} partitions @{text UNIV}
   359   into the three equivalence classes @{text "X\<^isub>1"}, @{text "X\<^isub>2"} and  @{text "X\<^isub>3"}
   362   into the three equivalence classes @{text "X\<^isub>1"}, @{text "X\<^isub>2"} and  @{text "X\<^isub>3"}
   360   as follows
   363   as follows
   361   
   364   
   362   \begin{center}
   365   \begin{center}
   364   @{text "X\<^isub>2 = {[c]}"}\hspace{5mm}
   367   @{text "X\<^isub>2 = {[c]}"}\hspace{5mm}
   365   @{text "X\<^isub>3 = UNIV - {[], [c]}"}
   368   @{text "X\<^isub>3 = UNIV - {[], [c]}"}
   366   \end{center}
   369   \end{center}
   367 
   370 
   368   One direction of the Myhill-Nerode theorem establishes 
   371   One direction of the Myhill-Nerode theorem establishes 
   369   that if there are finitely many equivalence classes, then the language is 
   372   that if there are finitely many equivalence classes, like in the example above, then 
   370   regular. In our setting we therefore have to show:
   373   the language is regular. In our setting we therefore have to show:
   371   
   374   
   372   \begin{theorem}\label{myhillnerodeone}
   375   \begin{theorem}\label{myhillnerodeone}
   373   @{thm[mode=IfThen] hard_direction}
   376   @{thm[mode=IfThen] hard_direction}
   374   \end{theorem}
   377   \end{theorem}
   375 
   378 
   385   In our running example, @{text "X\<^isub>2"} is the only equivalence class in @{term "finals {[c]}"}.
   388   In our running example, @{text "X\<^isub>2"} is the only equivalence class in @{term "finals {[c]}"}.
   386   It is straightforward to show that in general @{thm lang_is_union_of_finals} and 
   389   It is straightforward to show that in general @{thm lang_is_union_of_finals} and 
   387   @{thm finals_in_partitions} hold. 
   390   @{thm finals_in_partitions} hold. 
   388   Therefore if we know that there exists a regular expression for every
   391   Therefore if we know that there exists a regular expression for every
   389   equivalence class in @{term "finals A"} (which by assumption must be
   392   equivalence class in @{term "finals A"} (which by assumption must be
   390   a finite set), then we can using @{text "\<bigplus>"} obtain a regular expression 
   393   a finite set), then we can use @{text "\<bigplus>"} to obtain a regular expression 
   391   using that matches every string in @{text A}.
   394   using that matches every string in @{text A}.
   392 
   395 
   393 
   396 
   394   Our proof of Thm.~\ref{myhillnerodeone} relies on a method that can calculate a
   397   Our proof of Thm.~\ref{myhillnerodeone} relies on a method that can calculate a
   395   regular expression for \emph{every} equivalence class, not just the ones 
   398   regular expression for \emph{every} equivalence class, not just the ones 
   396   in @{term "finals A"}. We
   399   in @{term "finals A"}. We
   397   first define the notion of \emph{one-character-transition} between equivalence classes
   400   first define the notion of \emph{one-character-transition} between 
       
   401   two equivalence classes
   398   %
   402   %
   399   \begin{equation} 
   403   \begin{equation} 
   400   @{thm transition_def}
   404   @{thm transition_def}
   401   \end{equation}
   405   \end{equation}
   402 
   406 
   404   which means that if we concatenate the character @{text c} to the end of all 
   408   which means that if we concatenate the character @{text c} to the end of all 
   405   strings in the equivalence class @{text Y}, we obtain a subset of 
   409   strings in the equivalence class @{text Y}, we obtain a subset of 
   406   @{text X}. Note that we do not define an automaton here, we merely relate two sets
   410   @{text X}. Note that we do not define an automaton here, we merely relate two sets
   407   (with the help of a regular expression). In our concrete example we have 
   411   (with the help of a regular expression). In our concrete example we have 
   408   @{term "X\<^isub>1 \<Turnstile>c\<Rightarrow> X\<^isub>2"}, @{term "X\<^isub>1 \<Turnstile>d\<Rightarrow> X\<^isub>3"} with @{text d} being any 
   412   @{term "X\<^isub>1 \<Turnstile>c\<Rightarrow> X\<^isub>2"}, @{term "X\<^isub>1 \<Turnstile>d\<Rightarrow> X\<^isub>3"} with @{text d} being any 
   409   other character than @{text c}, and @{term "X\<^isub>3 \<Turnstile>d\<Rightarrow> X\<^isub>3"} for any @{text c}.
   413   other character than @{text c}, and @{term "X\<^isub>3 \<Turnstile>d\<Rightarrow> X\<^isub>3"} for any @{text d}.
   410   
   414   
   411   Next we build an equational system that
   415   Next we build an equational system that
   412   contains an equation for each equivalence class. Suppose we have 
   416   contains an equation for each equivalence class. Suppose we have 
   413   the equivalence classes @{text "X\<^isub>1,\<dots>,X\<^isub>n"}, there must be one and only one that
   417   the equivalence classes @{text "X\<^isub>1,\<dots>,X\<^isub>n"}, there must be one and only one that
   414   contains the empty string @{text "[]"} (since equivalence classes are disjoint).
   418   contains the empty string @{text "[]"} (since equivalence classes are disjoint).
   423   \end{tabular}
   427   \end{tabular}
   424   \end{center}
   428   \end{center}
   425 
   429 
   426   \noindent
   430   \noindent
   427   where the pairs @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"} stand for all transitions 
   431   where the pairs @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"} stand for all transitions 
   428   @{term "Y\<^isub>i\<^isub>j \<Turnstile>c\<^isub>i\<^isub>j\<Rightarrow> X\<^isub>i"}.  There can only be finitely many such
   432   @{term "Y\<^isub>i\<^isub>j \<Turnstile>c\<^isub>i\<^isub>j\<Rightarrow> X\<^isub>i"}. Our internal represeantation for the right-hand
   429   transitions since there are only finitely many equivalence classes 
   433   sides are sets of terms.
   430   and finitely many characters.
   434   There can only be finitely many such
       
   435   terms since there are only finitely many equivalence classes 
       
   436   and only finitely many characters.
   431   The term @{text "\<lambda>(EMPTY)"} in the first equation acts as a marker for the equivalence
   437   The term @{text "\<lambda>(EMPTY)"} in the first equation acts as a marker for the equivalence
   432   class containing @{text "[]"}. (As an aside note that we mark, roughly speaking, the
   438   class containing @{text "[]"}.\footnote{Note that we mark, roughly speaking, the
   433   single ``initial'' state in the equational system, which is different from
   439   single ``initial'' state in the equational system, which is different from
   434   the method by Brzozowski \cite{Brzozowski64}: he marks the ``terminal'' 
   440   the method by Brzozowski \cite{Brzozowski64}, where he marks the ``terminal'' 
   435   states. We are forced to set up the equational system in this way, because 
   441   states. We are forced to set up the equational system in our way, because 
   436   the Myhill-Nerode relation determines the ``direction'' of the transitions. 
   442   the Myhill-Nerode relation determines the ``direction'' of the transitions. 
   437   The successor ``state'' of an equivalence class @{text Y} can be reached by adding 
   443   The successor ``state'' of an equivalence class @{text Y} can be reached by adding 
   438   characters to the end of @{text Y}. This is also the reason why we have to use 
   444   characters to the end of @{text Y}. This is also the reason why we have to use 
   439   our reverse version of Arden's lemma.) 
   445   our reverse version of Arden's lemma.}
   440   Overloading the function @{text L} for the two kinds of terms in the 
   446   Overloading the function @{text \<calL>} for the two kinds of terms in the 
   441   equational system, we have
   447   equational system, we have
   442   
   448   
   443   \begin{center}
   449   \begin{center}
   444   @{text "\<calL>(Y, r) \<equiv>"} %
   450   @{text "\<calL>(Y, r) \<equiv>"} %
   445   @{thm (rhs) L_rhs_item.simps(2)[where X="Y" and r="r", THEN eq_reflection]}\hspace{10mm}
   451   @{thm (rhs) L_rhs_item.simps(2)[where X="Y" and r="r", THEN eq_reflection]}\hspace{10mm}
   461   \end{equation}
   467   \end{equation}
   462 
   468 
   463   \noindent
   469   \noindent
   464   The reason for adding the @{text \<lambda>}-marker to our equational system is 
   470   The reason for adding the @{text \<lambda>}-marker to our equational system is 
   465   to obtain this equation: it only holds in this form since none of 
   471   to obtain this equation: it only holds in this form since none of 
   466   the other terms contain the empty string. 
   472   the other terms contain the empty string. Since we use sets for representing
   467 
   473   the right-hans side we can write \eqref{inv1} and \eqref{inv2} more
   468 
   474   concisely for an equation of the form @{text "X = rhs"} as
       
   475   %
       
   476   \begin{equation}\label{inv}
       
   477   \mbox{@{text "X = \<Union> (\<calL> ` rhs)"}}
       
   478   \end{equation}
       
   479 
       
   480   \noindent
   469   Our proof of Thm.~\ref{myhillnerodeone} will proceed by transforming the
   481   Our proof of Thm.~\ref{myhillnerodeone} will proceed by transforming the
   470   equational system into a \emph{solved form} maintaining the invariants
   482   equational system into a \emph{solved form} maintaining the invariant
   471   \eqref{inv1} and \eqref{inv2}. From the solved form we will be able to read
   483   \eqref{inv}. From the solved form we will be able to read
   472   off the regular expressions. 
   484   off the regular expressions. 
   473 
   485 
   474   In order to transform an equational system into solved form, we have two main 
   486   In order to transform an equational system into solved form, we have two main 
   475   operations: one that takes an equation of the form @{text "X = rhs"} and removes
   487   operations: one that takes an equation of the form @{text "X = rhs"} and removes
   476   the recursive occurences of @{text X} in @{text rhs} using our variant of Arden's 
   488   the recursive occurences of @{text X} in the @{text rhs} using our variant of Arden's 
   477   Lemma. The other operation takes an equation @{text "X = rhs"}
   489   Lemma. The other operation takes an equation @{text "X = rhs"}
   478   and substitutes @{text X} throughout the rest of the equational system
   490   and substitutes @{text X} throughout the rest of the equational system
   479   adjusting the remaining regular expressions approriately. To define this adjustment 
   491   adjusting the remaining regular expressions approriately. To define this adjustment 
   480   we define the \emph{append-operation} 
   492   we define the \emph{append-operation} 
   481 
   493 
   484   @{thm append_rexp.simps(1)[where r="r\<^isub>1" and rexp="r\<^isub>2", THEN eq_reflection]}
   496   @{thm append_rexp.simps(1)[where r="r\<^isub>1" and rexp="r\<^isub>2", THEN eq_reflection]}
   485   \end{center}
   497   \end{center}
   486 
   498 
   487   \noindent
   499   \noindent
   488   which we also lift to entire right-hand sides of equations, written as
   500   which we also lift to entire right-hand sides of equations, written as
   489   @{thm (lhs) append_rhs_rexp_def[where rexp="r"]}.
   501   @{thm (lhs) append_rhs_rexp_def[where rexp="r"]}. With this we can define
   490 
   502   the \emph{arden-operation}, which takes an equivalence class @{text X} and
       
   503   a rhs of an equation.
   491 
   504 
   492   \begin{center}
   505   \begin{center}
   493   @{thm arden_op_def}
   506   @{thm arden_op_def}
   494   \end{center}
   507   \end{center}
       
   508 
       
   509   \noindent
       
   510   The term left of $\triangleright$ deletes all terms of the form @{text "(X, r)"};
       
   511   the term on the right calculates first the combinded regular expressions for all
       
   512   @{text r} in the @{text "(X, r)"}, forms the star and appends it to the remaining
       
   513   terms in the @{text rhs}. It can be easily seen that this operation mimics Arden's
       
   514   lemma on the level of equations.
       
   515   
   495 *}
   516 *}
   496 
   517 
   497 section {* Regular Expressions Generate Finitely Many Partitions *}
   518 section {* Regular Expressions Generate Finitely Many Partitions *}
   498 
   519 
   499 text {*
   520 text {*