Journal/Paper.thy
changeset 175 edc642266a82
parent 174 2b414a8a7132
child 176 6969de1eb96b
equal deleted inserted replaced
174:2b414a8a7132 175:edc642266a82
    32 abbreviation "ZERO \<equiv> Zero"
    32 abbreviation "ZERO \<equiv> Zero"
    33 abbreviation "ONE \<equiv> One"
    33 abbreviation "ONE \<equiv> One"
    34 abbreviation "ATOM \<equiv> Atom"
    34 abbreviation "ATOM \<equiv> Atom"
    35 abbreviation "PLUS \<equiv> Plus"
    35 abbreviation "PLUS \<equiv> Plus"
    36 abbreviation "TIMES \<equiv> Times"
    36 abbreviation "TIMES \<equiv> Times"
       
    37 abbreviation "TIMESS \<equiv> Times_set"
    37 abbreviation "STAR \<equiv> Star"
    38 abbreviation "STAR \<equiv> Star"
    38 
    39 
    39 
    40 
    40 notation (latex output)
    41 notation (latex output)
    41   str_eq_rel ("\<approx>\<^bsub>_\<^esub>") and
    42   str_eq_rel ("\<approx>\<^bsub>_\<^esub>") and
   128 text {*
   129 text {*
   129   \noindent
   130   \noindent
   130   Regular languages are an important and well-understood subject in Computer
   131   Regular languages are an important and well-understood subject in Computer
   131   Science, with many beautiful theorems and many useful algorithms. There is a
   132   Science, with many beautiful theorems and many useful algorithms. There is a
   132   wide range of textbooks on this subject, many of which are aimed at students
   133   wide range of textbooks on this subject, many of which are aimed at students
   133   and contain very detailed `pencil-and-paper' proofs
   134   and contain very detailed `pencil-and-paper' proofs (e.g.~\cite{Kozen97,
   134   (e.g.~\cite{Kozen97, HopcroftUllman69}). It seems natural to exercise theorem provers by
   135   HopcroftUllman69}). It seems natural to exercise theorem provers by
   135   formalising the theorems and by verifying formally the algorithms.  A
   136   formalising the theorems and by verifying formally the algorithms.  A
   136   popular choice for a theorem prover would be one based on Higher-Order Logic
   137   popular choice for a theorem prover would be one based on Higher-Order Logic
   137   (HOL), for example HOL4, HOLlight and Isabelle/HOL. For our development 
   138   (HOL), for example HOL4, HOLlight or Isabelle/HOL. For the development
   138   we will use the latter. One distinguishing feature of HOL is it's
   139   presented in this paper we will use the latter. HOL is a predicate calculus
   139   type system, which is based on Church's Simple Theory of Types \cite{Church40}.  The
   140   that allows quantification over predicate variables. Its type system is
   140   limitations of this type system are one of the underlying motivations for the 
   141   based on Church's Simple Theory of Types \cite{Church40}.  Although 
   141   work presented in this paper.
   142   many mathematical concepts can be conveniently expressed in HOL, there are some
   142 
   143   limitations that hurt badly, if one attempts a simple-minded formalisation
   143   The typical approach to regular languages is to
   144   of regular languages in it.
   144   introduce finite automata and then define everything in terms of them.  For
   145 
   145   example, a regular language is normally defined as one whose strings are
   146   The typical approach to regular languages is to introduce finite automata
   146   recognised by a finite deterministic automaton. This approach has many
   147   and then define everything in terms of them \cite{Kozen97}.  For example, 
   147   benefits. Among them is the fact that it is easy to convince oneself that
   148   a regular language is normally defined as:
   148   regular languages are closed under complementation: one just has to exchange
   149 
   149   the accepting and non-accepting states in the corresponding automaton to
   150   \begin{dfntn}\label{baddef}
   150   obtain an automaton for the complement language.  The problem, however, lies
   151   A language @{text A} is \emph{regular}, provided there is a 
   151   with formalising such reasoning in a HOL-based theorem prover. Automata are
   152   finite deterministic automaton that recognises all strings of @{text "A"}.
   152   built up from states and transitions that need to be represented as graphs,
   153   \end{dfntn}
   153   matrices or functions, none of which can be defined as an inductive
   154 
   154   datatype.
   155   \noindent  
       
   156   This approach has many benefits. Among them is the fact that it is easy to
       
   157   convince oneself that regular languages are closed under complementation:
       
   158   one just has to exchange the accepting and non-accepting states in the
       
   159   corresponding automaton to obtain an automaton for the complement language.
       
   160   The problem, however, lies with formalising such reasoning in a HOL-based
       
   161   theorem prover. Automata are built up from states and transitions that need
       
   162   to be represented as graphs, matrices or functions, none of which can be
       
   163   defined as an inductive datatype.
   155 
   164 
   156   In case of graphs and matrices, this means we have to build our own
   165   In case of graphs and matrices, this means we have to build our own
   157   reasoning infrastructure for them, as neither Isabelle/HOL nor HOL4 nor
   166   reasoning infrastructure for them, as neither Isabelle/HOL nor HOL4 nor
   158   HOLlight support them with libraries. Even worse, reasoning about graphs and
   167   HOLlight support them with libraries. Even worse, reasoning about graphs and
   159   matrices can be a real hassle in HOL-based theorem provers, because
   168   matrices can be a real hassle in HOL-based theorem provers, because
   229   \noindent
   238   \noindent
   230   changes the type---the disjoint union is not a set, but a set of
   239   changes the type---the disjoint union is not a set, but a set of
   231   pairs. Using this definition for disjoint union means we do not have a
   240   pairs. Using this definition for disjoint union means we do not have a
   232   single type for automata. As a result we will not be able to define a regular
   241   single type for automata. As a result we will not be able to define a regular
   233   language as one for which there exists an automaton that recognises all its
   242   language as one for which there exists an automaton that recognises all its
   234   strings, since there is no type quantification available in HOL (unlike in Coq, for
   243   strings. This is because we cannot make a definition in HOL that is polymorphic in 
   235   example).
   244   the state type and also there is no type quantification available in HOL (unlike 
       
   245   in Coq, for example).
   236 
   246 
   237   An alternative, which provides us with a single type for automata, is to give every 
   247   An alternative, which provides us with a single type for automata, is to give every 
   238   state node an identity, for example a natural
   248   state node an identity, for example a natural
   239   number, and then be careful to rename these identities apart whenever
   249   number, and then be careful to rename these identities apart whenever
   240   connecting two automata. This results in clunky proofs
   250   connecting two automata. This results in clunky proofs
   283   lexing. Berghofer and Reiter \cite{BerghoferReiter09} formalise automata
   293   lexing. Berghofer and Reiter \cite{BerghoferReiter09} formalise automata
   284   working over bit strings in the context of Presburger arithmetic.  The only
   294   working over bit strings in the context of Presburger arithmetic.  The only
   285   larger formalisations of automata theory are carried out in Nuprl
   295   larger formalisations of automata theory are carried out in Nuprl
   286   \cite{Constable00} and in Coq \cite{Filliatre97}.
   296   \cite{Constable00} and in Coq \cite{Filliatre97}.
   287 
   297 
   288   One might also consider the Myhill-Nerode theorem as well-worn stock
   298   Also one might consider automata theory as well-worn stock material where
   289   material where everything is clear. However, paper proofs of this theorem
   299   everything is crystal clear. However, paper proofs about automata often
   290   often involve subtle side-conditions which are easily overlooked, but which
   300   involve subtle side-conditions which are easily overlooked, but which make
   291   make formal reasoning rather painful. For example Kozen's proof requires
   301   formal reasoning rather painful. For example Kozen's proof of the
   292   that the automata do not have inaccessible states \cite{Kozen97}. Another
   302   Myhill-Nerode theorem requires that the automata do not have inaccessible
   293   subtle side-condition is completeness of automata: 
   303   states \cite{Kozen97}. Another subtle side-condition is completeness of
   294   automata need to have total transition functions and at most one `sink'
   304   automata: automata need to have total transition functions and at most one
   295   state from which there is no connection to a final state (Brozowski mentions
   305   `sink' state from which there is no connection to a final state (Brozowski
   296   this side-condition in connection with state complexity
   306   mentions this side-condition in connection with state complexity
   297   \cite{Brozowski10}). Such side-conditions mean that if we define a regular
   307   \cite{Brozowski10}). Such side-conditions mean that if we define a regular
   298   language as one for which there exists \emph{any} finite automaton, then we
   308   language as one for which there exists \emph{a} finite automaton that
   299   need a lemma which ensures that another equivalent can be found satisfying the
   309   recognises all its strings (Def.~\ref{baddef}), then we need a lemma which
   300   side-condition. Unfortunately, such `little' and `obvious' lemmas make 
   310   ensures that another equivalent can be found satisfying the
   301   formalisations of results in automata theory hair-pulling experiences.
   311   side-condition. Unfortunately, such `little' and `obvious' lemmas make
   302 
   312   formalisations of automata theory hair-pulling experiences.
   303   
   313 
       
   314 
   304   In this paper, we will not attempt to formalise automata theory in
   315   In this paper, we will not attempt to formalise automata theory in
   305   Isabelle/HOL nor will we attempt to formalise automata proofs from the
   316   Isabelle/HOL nor will we attempt to formalise automata proofs from the
   306   literature, but take a different approach to regular languages than is
   317   literature, but take a different approach to regular languages than is
   307   usually taken. Instead of defining a regular language as one where there
   318   usually taken. Instead of defining a regular language as one where there
   308   exists an automaton that recognises all strings of the language, we define a
   319   exists an automaton that recognises all strings of the language, we define a
   313   strings of @{text "A"}.
   324   strings of @{text "A"}.
   314   \end{dfntn}
   325   \end{dfntn}
   315   
   326   
   316   \noindent
   327   \noindent
   317   The reason is that regular expressions, unlike graphs, matrices and
   328   The reason is that regular expressions, unlike graphs, matrices and
   318   functions, can be easily defined as an inductive datatype. No side-conditions
   329   functions, can be easily defined as an inductive datatype. A reasoning
   319   will be needed for regular expressions. Moreover, a reasoning infrastructure 
   330   infrastructure (like induction and recursion) comes then for free in
   320   (like induction and recursion) comes for free in HOL-based theorem provers. 
   331   HOL. Moreover, no side-conditions will be needed for regular expressions,
   321   This has recently been exploited in HOL4 with a formalisation of
   332   like we usually need for automata. This convenience of regular expressions has
   322   regular expression matching based on derivatives \cite{OwensSlind08} and
   333   recently been exploited in HOL4 with a formalisation of regular expression
   323   with an equivalence checker for regular expressions in Isabelle/HOL
   334   matching based on derivatives \cite{OwensSlind08} and with an equivalence
   324   \cite{KraussNipkow11}.  The main purpose of this paper is to show that a central
   335   checker for regular expressions in Isabelle/HOL \cite{KraussNipkow11}.  The
   325   result about regular languages---the Myhill-Nerode theorem---can be
   336   main purpose of this paper is to show that a central result about regular
   326   recreated by only using regular expressions. This theorem gives necessary
   337   languages---the Myhill-Nerode theorem---can be recreated by only using
   327   and sufficient conditions for when a language is regular. As a corollary of
   338   regular expressions. This theorem gives necessary and sufficient conditions
   328   this theorem we can easily establish the usual closure properties, including
   339   for when a language is regular. As a corollary of this theorem we can easily
   329   complementation, for regular languages.\medskip
   340   establish the usual closure properties, including complementation, for
       
   341   regular languages.\medskip
   330   
   342   
   331   \noindent 
   343   \noindent 
   332   {\bf Contributions:} There is an extensive literature on regular
   344   {\bf Contributions:} There is an extensive literature on regular languages.
   333   languages.  To our best knowledge, our proof of the Myhill-Nerode theorem is
   345   To our best knowledge, our proof of the Myhill-Nerode theorem is the first
   334   the first that is based on regular expressions, only. The part of this
   346   that is based on regular expressions, only. The part of this theorem stating
   335   theorem stating that finitely many partitions imply regularity of the
   347   that finitely many partitions imply regularity of the language is proved by
   336   language is proved by an argument about solving equational sytems.  This
   348   an argument about solving equational sytems.  This argument appears to be
   337   argument appears to be folklore. For the other part, we give two proofs: one
   349   folklore. For the other part, we give two proofs: one direct proof using
   338   direct proof using certain tagging-functions, and another indirect proof
   350   certain tagging-functions, and another indirect proof using Antimirov's
   339   using Antimirov's partial derivatives \cite{Antimirov95}. Again to our best
   351   partial derivatives \cite{Antimirov95}. Again to our best knowledge, the
   340   knowledge, the tagging-functions have not been used before to establish the
   352   tagging-functions have not been used before to establish the Myhill-Nerode
   341   Myhill-Nerode theorem. Derivatives of regular expressions have been used
   353   theorem. Derivatives of regular expressions have been used widely in the
   342   extensively in the literature, unlike partial derivatives. However, partial
   354   literature about regular expressions. However, partial derivatives are more
   343   derivatives are more suitable in the context of the Myhill-Nerode theorem,
   355   suitable in the context of the Myhill-Nerode theorem, since it is easier to
   344   since it is easier to establish formally their finiteness result.
   356   establish formally their finiteness result.
       
   357 
   345 *}
   358 *}
   346 
   359 
   347 section {* Preliminaries *}
   360 section {* Preliminaries *}
   348 
   361 
   349 text {*
   362 text {*
  1542   \noindent
  1555   \noindent
  1543   Without the indentification, we unfortunately obtain infinitely many
  1556   Without the indentification, we unfortunately obtain infinitely many
  1544   derivations (an example is given in \cite[Page~141]{Sakarovitch09}).
  1557   derivations (an example is given in \cite[Page~141]{Sakarovitch09}).
  1545   Reasoning modulo ACI can be done, but it is very painful in a theorem prover.
  1558   Reasoning modulo ACI can be done, but it is very painful in a theorem prover.
  1546 
  1559 
  1547 
  1560   Fortunately, there is a much simpler approach using partial
  1548   in order to prove the second
  1561   derivatives introduced by Antimirov \cite{Antimirov95}.
  1549   direction of the Myhill-Nerode theorem. There he calculates the
  1562 
  1550   derivatives for regular expressions and shows that for every
  1563   \begin{center}
  1551   language there can be only finitely many of them %derivations (if
  1564   \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}}
  1552   regarded equal modulo ACI). We could have used as tagging-function
  1565   @{thm (lhs) pder.simps(1)}  & @{text "\<equiv>"} & @{thm (rhs) pder.simps(1)}\\
  1553   the set of derivatives of a regular expression with respect to a
  1566   @{thm (lhs) pder.simps(2)}  & @{text "\<equiv>"} & @{thm (rhs) pder.simps(2)}\\
  1554   language.  Using the fact that two strings are Myhill-Nerode related
  1567   @{thm (lhs) pder.simps(3)[where c'="d"]}  & @{text "\<equiv>"} & @{thm (rhs) pder.simps(3)[where c'="d"]}\\
  1555   whenever their derivative is the same, together with the fact that
  1568   @{thm (lhs) pder.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}  
  1556   there are only finitely such derivatives would give us a similar
  1569      & @{text "\<equiv>"} & @{thm (rhs) pder.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\
  1557   argument as ours. However it seems not so easy to calculate the set
  1570   @{thm (lhs) pder.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}  
  1558   of derivatives modulo ACI. Therefore we preferred our direct method
  1571      & @{text "\<equiv>"}\\ 
  1559   of using tagging-functions. 
  1572   \multicolumn{3}{@ {\hspace{20mm}}l@ {}}{@{thm (rhs) pder.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}}\\
  1560 
  1573   @{thm (lhs) pder.simps(6)}  & @{text "\<equiv>"} & @{thm (rhs) pder.simps(6)}\smallskip\\
  1561   The problem of finiteness modulo ACI can be avoided by using partial
  1574   @{thm (lhs) pders.simps(1)}  & @{text "\<equiv>"} & @{thm (rhs) pders.simps(1)}\\
  1562   derivatives introduced by Antimirov \cite{Antimirov}.
  1575   @{thm (lhs) pders.simps(2)}  & @{text "\<equiv>"} & @{thm (rhs) pders.simps(2)}\\
       
  1576   \end{tabular}
       
  1577   \end{center}
  1563 
  1578 
  1564 *}
  1579 *}
  1565 
  1580 
  1566 section {* Closure Properties *}
  1581 section {* Closure Properties *}
  1567 
  1582