Journal/Paper.thy
changeset 196 fa8d33d13cb6
parent 194 5347d7556487
child 197 cf1c17431dab
equal deleted inserted replaced
195:5bbe63876f84 196:fa8d33d13cb6
   181   HOLlight support them with libraries. Even worse, reasoning about graphs and
   181   HOLlight support them with libraries. Even worse, reasoning about graphs and
   182   matrices can be a real hassle in HOL-based theorem provers, because
   182   matrices can be a real hassle in HOL-based theorem provers, because
   183   we have to be able to combine automata.  Consider for
   183   we have to be able to combine automata.  Consider for
   184   example the operation of sequencing two automata, say $A_1$ and $A_2$, by
   184   example the operation of sequencing two automata, say $A_1$ and $A_2$, by
   185   connecting the accepting states of $A_1$ to the initial state of $A_2$:
   185   connecting the accepting states of $A_1$ to the initial state of $A_2$:
   186   %  
       
   187 
   186 
   188   \begin{center}
   187   \begin{center}
   189   \begin{tabular}{ccc}
   188   \begin{tabular}{ccc}
   190   \begin{tikzpicture}[scale=1]
   189   \begin{tikzpicture}[scale=1]
   191   %\draw[step=2mm] (-1,-1) grid (1,1);
   190   %\draw[step=2mm] (-1,-1) grid (1,1);
   250   \end{equation}
   249   \end{equation}
   251 
   250 
   252   \noindent
   251   \noindent
   253   changes the type---the disjoint union is not a set, but a set of
   252   changes the type---the disjoint union is not a set, but a set of
   254   pairs. Using this definition for disjoint union means we do not have a
   253   pairs. Using this definition for disjoint union means we do not have a
   255   single type for automata. As a result we will not be able to define a regular
   254   single type for automata. As a result we will not be able to define a
   256   language as one for which there exists an automaton that recognises all its
   255   regular language as one for which there exists an automaton that recognises
   257   strings (Definition~\ref{baddef}). This is because we cannot make a definition in HOL that is polymorphic in 
   256   all its strings (Definition~\ref{baddef}). This is because we cannot make a
   258   the state type and there is no type quantification available in HOL (unlike 
   257   definition in HOL that is polymorphic in the state type and there is no type
   259   in Coq, for example).\footnote{Slind already pointed out this problem in an email 
   258   quantification available in HOL (unlike in Coq, for example).\footnote{Slind
   260   to the HOL4 mailing list on 21st April 2005.}
   259   already pointed out this problem in an email to the HOL4 mailing list on
       
   260   21st April 2005.}
       
   261 
   261 
   262 
   262   An alternative, which provides us with a single type for automata, is to give every 
   263   An alternative, which provides us with a single type for automata, is to give every 
   263   state node an identity, for example a natural
   264   state node an identity, for example a natural
   264   number, and then be careful to rename these identities apart whenever
   265   number, and then be careful to rename these identities apart whenever
   265   connecting two automata. This results in clunky proofs
   266   connecting two automata. This results in clunky proofs
  1886   Let us return to our proof of the second direction in the Myhill-Nerode
  1887   Let us return to our proof of the second direction in the Myhill-Nerode
  1887   theorem. The point of the above calculations is to use 
  1888   theorem. The point of the above calculations is to use 
  1888   @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"} as tagging-relation. 
  1889   @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"} as tagging-relation. 
  1889 
  1890 
  1890 
  1891 
  1891   \begin{proof}[Proof of Theorem~\ref{myhillnerodetwo}]
  1892   \begin{proof}[Proof of Theorem~\ref{myhillnerodetwo} (second version)]
  1892   Using \eqref{mhders}
  1893   Using \eqref{mhders}
  1893   and \eqref{Derspders} we can easily infer that
  1894   and \eqref{Derspders} we can easily infer that
  1894 
  1895 
  1895   \begin{center}
  1896   \begin{center}
  1896   @{term "x \<approx>(lang r) y"}\hspace{4mm}\mbox{provided}\hspace{4mm}@{term "pders x r = pders y r"}
  1897   @{term "x \<approx>(lang r) y"}\hspace{4mm}\mbox{provided}\hspace{4mm}@{term "pders x r = pders y r"}
  1914   are only finitely many equivalence classes of @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"},
  1915   are only finitely many equivalence classes of @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"},
  1915   and we can again conclude the second part of the Myhill-Nerode theorem.
  1916   and we can again conclude the second part of the Myhill-Nerode theorem.
  1916   \end{proof}
  1917   \end{proof}
  1917 *}
  1918 *}
  1918 
  1919 
  1919 section {* Closure Properties *}
  1920 section {* Closure Properties of Regular Languages *}
  1920 
  1921 
  1921 text {*
  1922 text {*
  1922   \noindent
  1923   \noindent
  1923   The real beauty of regular languages is that they are closed
  1924   The beauty of regular languages is that they are closed under many set
  1924   under almost all set operations. Closure under union, concatenation and Kleene-star
  1925   operations. Closure under union, concatenation and Kleene-star are trivial
  1925   are trivial to establish given our definition of regularity (Definition~\ref{regular}).
  1926   to establish given our definition of regularity (recall Definition~\ref{regular}).
  1926   More interesting is the closure under complement, because
  1927   More interesting is the closure under complement, because it seems difficult
  1927   it seems difficult to construct a regular expression for the complement
  1928   to construct a regular expression for the complement language by direct
  1928   language by direct means. However the existence of such a regular expression
  1929   means. However the existence of such a regular expression can now be easily
  1929   can now be easily proved using the Myhill-Nerode theorem since 
  1930   proved using the Myhill-Nerode theorem since
  1930   
  1931 
  1931   \begin{center}
  1932   \begin{center}
  1932   @{term "s\<^isub>1 \<approx>A s\<^isub>2"} if and only if @{term "s\<^isub>1 \<approx>(-A) s\<^isub>2"}
  1933   @{term "s\<^isub>1 \<approx>A s\<^isub>2"} if and only if @{term "s\<^isub>1 \<approx>(-A) s\<^isub>2"}
  1933   \end{center}
  1934   \end{center}
  1934   
  1935   
  1935   \noindent
  1936   \noindent
  1936   holds for any strings @{text "s\<^isub>1"} and @{text
  1937   holds for any strings @{text "s\<^isub>1"} and @{text
  1937   "s\<^isub>2"}. Therefore @{text A} and the complement language @{term "-A"}
  1938   "s\<^isub>2"}. Therefore @{text A} and the complement language @{term "-A"}
  1938   give rise to the same partitions. So if one is finite, the other is too and the
  1939   give rise to the same partitions. So if one is finite, the other is too, and
  1939   other way around.
  1940   the other way around. Proving the existence of such a regular expression via
       
  1941   automata using the standard method would be quite involved. It includes the
       
  1942   steps: regular expression @{text "\<Rightarrow>"} non-deterministic automaton @{text
       
  1943   "\<Rightarrow>"} deterministic automaton @{text "\<Rightarrow>"} complement automaton @{text "\<Rightarrow>"}
       
  1944   regular expression. Clearly not something you want to formalise in a theorem
       
  1945   prover in which it is cumbersome to reason about automata.
  1940 
  1946 
  1941   Once closure under complement is established, closure under intersection
  1947   Once closure under complement is established, closure under intersection
  1942   and set difference is also easy, because
  1948   and set difference is also easy, because
  1943 
  1949 
  1944   \begin{center}
  1950   \begin{center}
  1947   @{term "A - B = - (- A \<union> B)"}
  1953   @{term "A - B = - (- A \<union> B)"}
  1948   \end{tabular}
  1954   \end{tabular}
  1949   \end{center}
  1955   \end{center}
  1950 
  1956 
  1951   \noindent
  1957   \noindent
  1952   Closure of regular languages under reversal, which means 
  1958   Closure of regular languages under reversal, that is
  1953 
  1959 
  1954   \begin{center}
  1960   \begin{center}
  1955   @{text "A\<^bsup>-1\<^esup> \<equiv> {s\<^bsup>-1\<^esup> | s \<in> A}"}
  1961   @{text "A\<^bsup>-1\<^esup> \<equiv> {s\<^bsup>-1\<^esup> | s \<in> A}"}
  1956   \end{center}
  1962   \end{center}
  1957 
  1963 
  1958   \noindent 
  1964   \noindent 
  1959   can be shown with the help of the following operation defined on regular
  1965   can be shown with the help of the following operation defined recursively over 
  1960   expressions
  1966   regular expressions
  1961 
  1967 
  1962   \begin{center}
  1968   \begin{center}
  1963   \begin{tabular}{r@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
  1969   \begin{tabular}{r@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
  1964   @{thm (lhs) Rev.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) Rev.simps(1)}\\
  1970   @{thm (lhs) Rev.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) Rev.simps(1)}\\
  1965   @{thm (lhs) Rev.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) Rev.simps(2)}\\
  1971   @{thm (lhs) Rev.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) Rev.simps(2)}\\
  1971   @{thm (lhs) Rev.simps(6)} & @{text "\<equiv>"} & @{thm (rhs) Rev.simps(6)}\\
  1977   @{thm (lhs) Rev.simps(6)} & @{text "\<equiv>"} & @{thm (rhs) Rev.simps(6)}\\
  1972   \end{tabular}
  1978   \end{tabular}
  1973   \end{center}
  1979   \end{center}
  1974 
  1980 
  1975   \noindent
  1981   \noindent
  1976   For this operation we can so
  1982   For this operation we can show
  1977 
  1983 
  1978   \begin{center}
  1984   \begin{center}
  1979   @{text "(\<calL>(r))\<^bsup>-1\<^esup>"}~@{text "="}~@{thm (rhs) rev_lang}
  1985   @{text "(\<calL>(r))\<^bsup>-1\<^esup>"}~@{text "="}~@{thm (rhs) rev_lang}
  1980   \end{center}
  1986   \end{center}
  1981   
  1987   
  1982   \noindent
  1988   \noindent
  1983   from which closure under reversal follows.
  1989   from which closure under reversal of regular languages follows.
  1984 
  1990 
  1985   The perhaps the most surprising fact is that regular languages are closed under any 
  1991   A perhaps surprising fact is that regular languages are closed under any
  1986   left-quotient. Define 
  1992   left-quotient. Define
  1987 
  1993 
  1988   \begin{center}
  1994   \begin{center}
  1989   @{abbrev "Ders_lang B A"}
  1995   @{abbrev "Ders_lang B A"}
  1990   \end{center}
  1996   \end{center}
  1991 
  1997 
  1992   \noindent
  1998   \noindent
  1993   and assume @{text A} is regular. From this we know there exists a regular
  1999   and assume @{text B} is any language and @{text A} is regular, then @{term
  1994   expression @{text r} such that @{term "A = lang r"}. We also know that
  2000   "Ders_lang B A"} is regular. To see this consider the following argument
  1995   @{term "pders_lang B r"} is finite. By definition and Lemma~\ref{Derspders}
  2001   using partial derivatives: From @{text A} being regular we know there exists
       
  2002   a regular expression @{text r} such that @{term "A = lang r"}. We also know
       
  2003   that @{term "pders_lang B r"} is finite for every language @{text B} and 
       
  2004   regular expression @{text r} (recall Theorem~\ref{antimirov}). By definition 
       
  2005   and Lemma~\ref{Derspders} therefore
       
  2006 
  1996   
  2007   
  1997   \begin{equation}\label{eqq}
  2008   \begin{equation}\label{eqq}
  1998   @{term "Ders_lang B (lang r) = (\<Union> lang ` (pders_lang B r))"}
  2009   @{term "Ders_lang B (lang r) = (\<Union> lang ` (pders_lang B r))"}
  1999   \end{equation}
  2010   \end{equation}
  2000 
  2011 
  2001   \noindent
  2012   \noindent
  2002   Since there are only finitely many regular expressions in @{term "pders_lang B r"}
  2013   Since there are only finitely many regular expressions in @{term "pders_lang
  2003   by Theorem~\ref{antimirov}, we know that the right-hand side of \eqref{eqq}, is 
  2014   B r"}, we know by \eqref{uplus} that there exists a regular expression that
  2004   equal to @{term "lang (\<Uplus>(pders_lang B r))"} using \eqref{uplus}. Hence
  2015   the right-hand side of \eqref{eqq} is equal to \mbox{@{term "lang (\<Uplus>(pders_lang B
  2005   the regular expression @{term "pders_lang B r"} verifies that @{term "Ders_lang B A"} 
  2016   r))"}}. Thus the regular expression @{term "\<Uplus>(pders_lang B r)"} verifies that
  2006   is regular.
  2017   @{term "Ders_lang B A"} is regular.
  2007 
       
  2008 *}
  2018 *}
  2009 
  2019 
  2010 
  2020 
  2011 section {* Conclusion and Related Work *}
  2021 section {* Conclusion and Related Work *}
  2012 
  2022 
  2031   pumping lemma \cite{Kozen97}).  We can also use it to establish the standard
  2041   pumping lemma \cite{Kozen97}).  We can also use it to establish the standard
  2032   textbook results about closure properties of regular languages. Interesting
  2042   textbook results about closure properties of regular languages. Interesting
  2033   is the case of closure under complement, because it seems difficult to
  2043   is the case of closure under complement, because it seems difficult to
  2034   construct a regular expression for the complement language by direct
  2044   construct a regular expression for the complement language by direct
  2035   means. However the existence of such a regular expression can be easily
  2045   means. However the existence of such a regular expression can be easily
  2036   proved using the Myhill-Nerode theorem.  Proving the existence of such a
  2046   proved using the Myhill-Nerode theorem.  
  2037   regular expression via automata using the standard method would be quite
  2047 
  2038   involved. It includes the steps: regular expression @{text "\<Rightarrow>"}
  2048   Our insistence on regular expressions for proving the Myhill-Nerode theorem
  2039   non-deterministic automaton @{text "\<Rightarrow>"} deterministic automaton @{text "\<Rightarrow>"}
  2049   arose from the limitations of HOL on which the popular theorem provers HOL4,
  2040   complement automaton @{text "\<Rightarrow>"} regular expression.
  2050   HOLlight and Isabelle/HOL are based. In order to guarantee consistency,
  2041 
  2051   formalisations can only extend HOL by definitions that introduce a notion in
  2042 
  2052   terms of already existing concepts. A convenient definition for automata
  2043   While regular expressions are convenient in formalisations, they have some
  2053   (based on graphs) use a polymorphic type for the state nodes. This allows us
  2044   limitations. One is that there seems to be no method of calculating a
  2054   to use the standard operation of disjoint union in order to compose two
  2045   minimal regular expression (for example in terms of length) for a regular
  2055   automata. But we cannot use such a polymorphic definition of automata in HOL
  2046   language, like there is
  2056   as part of the definition for regularity of a language (a set of strings).
  2047   for automata. On the other hand, efficient regular expression matching,
  2057   Consider the following attempt
  2048   without using automata, poses no problem \cite{OwensReppyTuron09}.
  2058 
  2049   For an implementation of a simple regular expression matcher,
  2059 
  2050   whose correctness has been formally established, we refer the reader to
  2060   \begin{center}
  2051   Owens and Slind \cite{OwensSlind08}.
  2061   @{text "is_regular A \<equiv> \<exists>M(\<alpha>). is_finite_automata (M) \<and> \<calL>(M) = A"}
  2052 
  2062   \end{center}
       
  2063 
       
  2064   \noindent
       
  2065   which means the definiens is polymorphic in the type of the automata @{text
       
  2066   "M"}, but the definiendum @{text "is_regular"} is not. Such definitions are
       
  2067   excluded in HOL, because they lead easily to inconsistencies (see
       
  2068   \cite{PittsHOL4} for a simple example). Also HOL does not contain
       
  2069   type-quantifiers which would allow us to get rid of the polymorphism by
       
  2070   quantifying over the type-variable @{text "\<alpha>"}. Therefore when defining
       
  2071   regularity in terms of automata, the only natural way out in HOL is to use state
       
  2072   nodes with an identity, for example a natural number. Unfortunatly, the
       
  2073   consequence is that we have to be careful when combining two automata so
       
  2074   that there is no clash between two such states. This makes formalisations
       
  2075   quite fiddly and unpleasant. Regular expressions proved much more convenient
       
  2076   for reasoning in HOL and we showed they can be used for establishing the
       
  2077   Myhill-Nerode theorem.
       
  2078 
       
  2079   While regular expressions are convenient, they have some limitations. One is
       
  2080   that there seems to be no method of calculating a minimal regular expression
       
  2081   (for example in terms of length) for a regular language, like there is for
       
  2082   automata. On the other hand, efficient regular expression matching, without
       
  2083   using automata, poses no problem \cite{OwensReppyTuron09}.  For an
       
  2084   implementation of a simple regular expression matcher, whose correctness has
       
  2085   been formally established, we refer the reader to Owens and Slind
       
  2086   \cite{OwensSlind08}.
  2053 
  2087 
  2054   Our formalisation consists of 780 lines of Isabelle/Isar code for the first
  2088   Our formalisation consists of 780 lines of Isabelle/Isar code for the first
  2055   direction and 460 for the second, plus around 300 lines of standard material
  2089   direction and 460 for the second, plus around 300 lines of standard material
  2056   about regular languages. The formalisation about derivatives and partial 
  2090   about regular languages. The formalisation about derivatives and partial
  2057   derivatives shown in Section~\ref{derivatives} consists of 390 lines of code.
  2091   derivatives shown in Section~\ref{derivatives} consists of 390 lines of
  2058   While this might be seen large, it should be seen
  2092   code.  The algorithm for solving equational systems, which we used in the
       
  2093   first direction, is conceptually not that complicated. Still the use of sets
       
  2094   over which the algorithm operates, means it is not as easy to formalise as
       
  2095   one might wish. It seems sets cannot be avoided since the `input' of the
       
  2096   algorithm consists of equivalence classes and we cannot see how to
       
  2097   reformulate the theory so that we can use lists, which are usually easier to
       
  2098   reason about in a theorem prover.
       
  2099 
       
  2100   While our formalisation might be seen large, it should be seen
  2059   in the context of the work done by Constable at al \cite{Constable00} who
  2101   in the context of the work done by Constable at al \cite{Constable00} who
  2060   formalised the Myhill-Nerode theorem in Nuprl using automata. They write
  2102   formalised the Myhill-Nerode theorem in Nuprl using automata. They write
  2061   that their four-member team needed something on the magnitude of 18 months
  2103   that their four-member team needed something on the magnitude of 18 months
  2062   for their formalisation. The estimate for our formalisation is that we
  2104   for their formalisation. The estimate for our formalisation is that we
  2063   needed approximately 3 months and this included the time to find our proof
  2105   needed approximately 3 months and this included the time to find our proof
  2067   formalisation in Nurpl, but from what is shown in the Nuprl Math Library
  2109   formalisation in Nurpl, but from what is shown in the Nuprl Math Library
  2068   about their development it seems substantially larger than ours. The code of
  2110   about their development it seems substantially larger than ours. The code of
  2069   ours can be found in the Mercurial Repository at
  2111   ours can be found in the Mercurial Repository at
  2070   \mbox{\url{http://www4.in.tum.de/~urbanc/regexp.html}}.
  2112   \mbox{\url{http://www4.in.tum.de/~urbanc/regexp.html}}.
  2071 
  2113 
  2072 
       
  2073   Our proof of the first direction is very much inspired by \emph{Brzozowski's
  2114   Our proof of the first direction is very much inspired by \emph{Brzozowski's
  2074   algebraic method} used to convert a finite automaton to a regular
  2115   algebraic method} used to convert a finite automaton to a regular expression
  2075   expression \cite{Brzozowski64}. The close connection can be seen by considering the equivalence
  2116   \cite{Brzozowski64}. The close connection can be seen by considering the
  2076   classes as the states of the minimal automaton for the regular language.
  2117   equivalence classes as the states of the minimal automaton for the regular
  2077   However there are some subtle differences. Since we identify equivalence
  2118   language.  However there are some subtle differences. Since we identify
  2078   classes with the states of the automaton, then the most natural choice is to
  2119   equivalence classes with the states of the automaton, then the most natural
  2079   characterise each state with the set of strings starting from the initial
  2120   choice is to characterise each state with the set of strings starting from
  2080   state leading up to that state. Usually, however, the states are characterised as the
  2121   the initial state leading up to that state. Usually, however, the states are
  2081   strings starting from that state leading to the terminal states.  The first
  2122   characterised as the strings starting from that state leading to the
  2082   choice has consequences about how the initial equational system is set up. We have
  2123   terminal states.  The first choice has consequences about how the initial
  2083   the $\lambda$-term on our `initial state', while Brzozowski has it on the
  2124   equational system is set up. We have the $\lambda$-term on our `initial
  2084   terminal states. This means we also need to reverse the direction of Arden's
  2125   state', while Brzozowski has it on the terminal states. This means we also
  2085   Lemma.
  2126   need to reverse the direction of Arden's Lemma. We have not found anything
  2086 
  2127   in the literature about this way of proving the first direction of the
  2087   This is also where our method shines, because we can completely
  2128   Myhill-Nerode theorem, but it appears to be folklore.
  2088   side-step the standard argument \cite{Kozen97} where automata need
  2129 
  2089   to be composed, which as stated in the Introduction is not so easy
  2130   We presented two proofs for the second direction of the Myhill-Nerode
  2090   to formalise in a HOL-based theorem prover. However, it is also the
  2131   theorem. One direct proof using tagging-functions and another using partial
  2091   direction where we had to spend most of the `conceptual' time, as
  2132   derivatives. These proofs is where our method shines, because we can
  2092   our proof-argument based on tagging-functions is new for
  2133   completely side-step the standard argument \cite{Kozen97} where automata
  2093   establishing the Myhill-Nerode theorem. All standard proofs of this
  2134   need to be composed. However, it is also the direction where we had to spend
  2094   direction proceed by arguments over automata.\medskip
  2135   most of the `conceptual' time, as our first proof based on
       
  2136   tagging-functions is new for establishing the Myhill-Nerode theorem. All
       
  2137   standard proofs of this direction proceed by arguments over automata.
       
  2138   
       
  2139   Our indirect proof for the second direction arose from the interest in
       
  2140   Brzozowski's derivatives for regular expression matching. A corresponding
       
  2141   regular expression matcher has been formalised in HOL4 in
       
  2142   \cite{OwensSlind08}. In our opinion, this formalisation is considerably
       
  2143   slicker than for example the approach to regular expression matching taken
       
  2144   in \cite{Harper99} and \cite{Yi06}. While Brzozowski's derivatives lead to
       
  2145   simple regular expression matchers and he proved that there are only
       
  2146   finitely many dissimilar derivatives for every regular expression, this
       
  2147   result is not as straightforward to formalise in a theorem prover. The
       
  2148   reason is that the set of dissimilar derivatives is not defined inductively,
       
  2149   but in terms of an ACI-equivalence relation.
       
  2150 
       
  2151   
       
  2152 
       
  2153   \medskip
  2095 
  2154 
  2096   We expect that the development of Krauss \& Nipkow gets easier by
  2155   We expect that the development of Krauss \& Nipkow gets easier by
  2097   using partial derivatives.\medskip
  2156   using partial derivatives.\medskip
  2098   
  2157   
  2099   \noindent
  2158   \noindent