Paper/Paper.thy
changeset 82 14b12b5de6d3
parent 79 bba9c80735f9
child 83 f438f4dbaada
equal deleted inserted replaced
81:dc879cb59c9c 82:14b12b5de6d3
    20   pow ("_\<^bsup>_\<^esup>" [100, 100] 100) and
    20   pow ("_\<^bsup>_\<^esup>" [100, 100] 100) and
    21   Suc ("_+1" [100] 100) and
    21   Suc ("_+1" [100] 100) and
    22   quotient ("_ \<^raw:\ensuremath{\!\sslash\!}> _" [90, 90] 90) and
    22   quotient ("_ \<^raw:\ensuremath{\!\sslash\!}> _" [90, 90] 90) and
    23   REL ("\<approx>") and
    23   REL ("\<approx>") and
    24   UPLUS ("_ \<^raw:\ensuremath{\uplus}> _" [90, 90] 90) and
    24   UPLUS ("_ \<^raw:\ensuremath{\uplus}> _" [90, 90] 90) and
    25   L ("L'(_')" [0] 101) and
    25   L ("\<^raw:\ensuremath{\cal{L}}>'(_')" [0] 101) and
    26   Lam ("\<lambda>'(_')" [100] 100) and 
    26   Lam ("\<lambda>'(_')" [100] 100) and 
    27   Trn ("_, _" [100, 100] 100) and 
    27   Trn ("_, _" [100, 100] 100) and 
    28   EClass ("\<lbrakk>_\<rbrakk>\<^bsub>_\<^esub>" [100, 100] 100) and
    28   EClass ("\<lbrakk>_\<rbrakk>\<^bsub>_\<^esub>" [100, 100] 100) and
    29   transition ("_ \<^raw:\ensuremath{\stackrel{\text{>_\<^raw:}}{\Longmapsto}}> _" [100, 100, 100] 100)
    29   transition ("_ \<^raw:\ensuremath{\stackrel{\text{>_\<^raw:}}{\Longmapsto}}> _" [100, 100, 100] 100)
    30 (*>*)
    30 (*>*)
    48   regular languages are closed under complementation: one just has to exchange
    48   regular languages are closed under complementation: one just has to exchange
    49   the accepting and non-accepting states in the corresponding automaton to
    49   the accepting and non-accepting states in the corresponding automaton to
    50   obtain an automaton for the complement language.  The problem, however, lies with
    50   obtain an automaton for the complement language.  The problem, however, lies with
    51   formalising such reasoning in a HOL-based theorem prover, in our case
    51   formalising such reasoning in a HOL-based theorem prover, in our case
    52   Isabelle/HOL. Automata are build up from states and transitions that 
    52   Isabelle/HOL. Automata are build up from states and transitions that 
    53   need to be represented as graphs or matrices, neither
    53   need to be represented as graphs, matrices or functions, none
    54   of which can be defined as inductive datatype.\footnote{In some works
    54   of which can be defined as inductive datatype. 
    55   functions are used to represent state transitions, but also they are not
    55 
    56   inductive datatypes.} This means we have to build our own reasoning
    56   In case of graphs and matrices, this means we have to build our own
    57   infrastructure for them, as neither Isabelle/HOL nor HOL4 nor HOLlight support
    57   reasoning infrastructure for them, as neither Isabelle/HOL nor HOL4 nor
    58   them with libraries.
    58   HOLlight support them with libraries. Even worse, reasoning about graphs and
    59 
    59   matrices can be a real hassle in HOL-based theorem provers.  Consider for
    60   Even worse, reasoning about graphs and matrices can be a real hassle in HOL-based
    60   example the operation of sequencing two automata, say $A_1$ and $A_2$, by
    61   theorem provers.  Consider for example the operation of sequencing 
    61   connecting the accepting states of $A_1$ to the initial state of $A_2$:  
    62   two automata, say $A_1$ and $A_2$, by connecting the
       
    63   accepting states of $A_1$ to the initial state of $A_2$:
       
    64   
       
    65   
    62   
    66   \begin{center}
    63   \begin{center}
    67   \begin{tabular}{ccc}
    64   \begin{tabular}{ccc}
    68   \begin{tikzpicture}[scale=0.8]
    65   \begin{tikzpicture}[scale=0.8]
    69   %\draw[step=2mm] (-1,-1) grid (1,1);
    66   %\draw[step=2mm] (-1,-1) grid (1,1);
   119 
   116 
   120   \noindent
   117   \noindent
   121   On ``paper'' we can define the corresponding graph in terms of the disjoint 
   118   On ``paper'' we can define the corresponding graph in terms of the disjoint 
   122   union of the state nodes. Unfortunately in HOL, the definition for disjoint 
   119   union of the state nodes. Unfortunately in HOL, the definition for disjoint 
   123   union, namely 
   120   union, namely 
   124 
   121   %
   125   \begin{center}
   122   \begin{equation}\label{disjointunion}
   126   @{term "UPLUS A\<^isub>1 A\<^isub>2 \<equiv> {(1, x) | x. x \<in> A\<^isub>1} \<union> {(2, y) | y. y \<in> A\<^isub>2}"}
   123   @{term "UPLUS A\<^isub>1 A\<^isub>2 \<equiv> {(1, x) | x. x \<in> A\<^isub>1} \<union> {(2, y) | y. y \<in> A\<^isub>2}"}
   127   \end{center}
   124   \end{equation}
   128 
   125 
   129   \noindent
   126   \noindent
   130   changes the type---the disjoint union is not a set, but a set of pairs. 
   127   changes the type---the disjoint union is not a set, but a set of pairs. 
   131   Using this definition for disjoint unions means we do not have a single type for automata
   128   Using this definition for disjoint unions means we do not have a single type for automata
   132   and hence will not be able to state properties about \emph{all}
   129   and hence will not be able to state properties about \emph{all}
   137   connecting two automata. This results in clunky proofs
   134   connecting two automata. This results in clunky proofs
   138   establishing that properties are invariant under renaming. Similarly,
   135   establishing that properties are invariant under renaming. Similarly,
   139   connecting two automata represented as matrices results in very adhoc
   136   connecting two automata represented as matrices results in very adhoc
   140   constructions, which are not pleasant to reason about.
   137   constructions, which are not pleasant to reason about.
   141 
   138 
       
   139   Functions are much better supported in Isabelle/HOL, but they still lead to similar
       
   140   problems as with graphs.  Composing two non-deterministic automata in parallel
       
   141   poses still the problem of how to implement disjoint unions. Nipkow \cite{Nipkow98} 
       
   142   dismisses the option using identities, because it leads to messy proofs. He
       
   143   opts for a variant of \eqref{disjointunion}, but writes
       
   144 
       
   145   \begin{quote}
       
   146   \it ``If the reader finds the above treatment in terms of bit lists revoltingly
       
   147   concrete, I cannot disagree.''
       
   148   \end{quote}
       
   149   
       
   150   \noindent
       
   151   Moreover, it is not so clear how to conveniently impose a finiteness condition 
       
   152   upon functions in order to represent \emph{finite} automata. The best is
       
   153   probably to resort to more advanced reasoning frameworks, such as \emph{locales}.
       
   154 
   142   Because of these problems to do with representing automata, there seems
   155   Because of these problems to do with representing automata, there seems
   143   to be no substantial formalisation of automata theory and regular languages 
   156   to be no substantial formalisation of automata theory and regular languages 
   144   carried out in a HOL-based theorem prover. We are only aware of the 
   157   carried out in a HOL-based theorem prover. Nipkow establishes in 
   145   large formalisation of automata theory in Nuprl \cite{Constable00} and 
   158   \cite{Nipkow98} the link between regular expressions and automata in
   146   some smaller formalisations in Coq (for example \cite{Filliatre97}).
   159   the context of lexing. The only larger formalisations of automata theory 
   147   
   160   are carried out in Nuprl \cite{Constable00} and in Coq (for example 
   148   In this paper, we will not attempt to formalise automata theory, but take a completely 
   161   \cite{Filliatre97}).
   149   different approach to regular languages. Instead of defining a regular language as one 
   162   
   150   where there exists an automaton that recognises all strings of the language, we define 
   163   In this paper, we will not attempt to formalise automata theory in
   151   a regular language as:
   164   Isabelle/HOL, but take a completely different approach to regular
   152 
   165   languages. Instead of defining a regular language as one where there exists
   153   \begin{definition}[A Regular Language]
   166   an automaton that recognises all strings of the language, we define a
       
   167   regular language as:
       
   168 
       
   169   \begin{definition}
   154   A language @{text A} is \emph{regular}, provided there is a regular expression that matches all
   170   A language @{text A} is \emph{regular}, provided there is a regular expression that matches all
   155   strings of @{text "A"}.
   171   strings of @{text "A"}.
   156   \end{definition}
   172   \end{definition}
   157   
   173   
   158   \noindent
   174   \noindent
   269   @{term "ALT r r"}\hspace{1.5mm}@{text"|"}\hspace{1.5mm} 
   285   @{term "ALT r r"}\hspace{1.5mm}@{text"|"}\hspace{1.5mm} 
   270   @{term "STAR r"}
   286   @{term "STAR r"}
   271   \end{center}
   287   \end{center}
   272 
   288 
   273   \noindent
   289   \noindent
   274   The language matched by a regular expression is defined as usual:
   290   and the language matched by a regular expression is defined as:
   275 
   291 
   276   \begin{center}
   292   \begin{center}
   277   \begin{tabular}{c@ {\hspace{10mm}}c}
   293   \begin{tabular}{c@ {\hspace{10mm}}c}
   278   \begin{tabular}{rcl}
   294   \begin{tabular}{rcl}
   279   @{thm (lhs) L_rexp.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) L_rexp.simps(1)}\\
   295   @{thm (lhs) L_rexp.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) L_rexp.simps(1)}\\
   290       @{thm (rhs) L_rexp.simps(6)[where r="r"]}\\
   306       @{thm (rhs) L_rexp.simps(6)[where r="r"]}\\
   291   \end{tabular}
   307   \end{tabular}
   292   \end{tabular}
   308   \end{tabular}
   293   \end{center}
   309   \end{center}
   294 
   310 
       
   311 
       
   312 
   295 *}
   313 *}
   296 
   314 
   297 section {* Finite Partitions Imply Regularity of a Language *}
   315 section {* Finite Partitions Imply Regularity of a Language *}
   298 
   316 
   299 text {*
   317 text {*
   363   @{text "X\<^isub>n"} & @{text "="} & @{text "(Y\<^isub>n\<^isub>1, CHAR c\<^isub>n\<^isub>1) + \<dots> + (Y\<^isub>n\<^isub>q, CHAR c\<^isub>n\<^isub>q)"}\\
   381   @{text "X\<^isub>n"} & @{text "="} & @{text "(Y\<^isub>n\<^isub>1, CHAR c\<^isub>n\<^isub>1) + \<dots> + (Y\<^isub>n\<^isub>q, CHAR c\<^isub>n\<^isub>q)"}\\
   364   \end{tabular}
   382   \end{tabular}
   365   \end{center}
   383   \end{center}
   366 
   384 
   367   \noindent
   385   \noindent
   368   where the pairs @{text "(Y\<^isub>i\<^isub>j, r\<^isub>i\<^isub>j)"} stand for all transitions 
   386   where the pairs @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"} stand for all transitions 
   369   @{term "Y\<^isub>i\<^isub>j \<Turnstile>r\<^isub>i\<^isub>j\<Rightarrow> X\<^isub>i"}.  The term @{text "\<lambda>(EMPTY)"} acts as a marker for the equivalence
   387   @{term "Y\<^isub>i\<^isub>j \<Turnstile>(CHAR c\<^isub>i\<^isub>j)\<Rightarrow> X\<^isub>i"}.  The term @{text "\<lambda>(EMPTY)"} acts as a marker for the equivalence
   370   class containing @{text "[]"}. (Note that we mark, roughly speaking, the
   388   class containing @{text "[]"}. (Note that we mark, roughly speaking, the
   371   single ``initial'' state in the equational system, which is different from
   389   single ``initial'' state in the equational system, which is different from
   372   the method by Brzozowski \cite{Brzozowski64}, since for his purposes he needs to mark 
   390   the method by Brzozowski \cite{Brzozowski64}, since for his purposes he needs to mark 
   373   the ``terminal'' states.) Overloading the function @{text L} for the two kinds of terms in the 
   391   the ``terminal'' states.) Overloading the function @{text L} for the two kinds of terms in the 
   374   equational system as follows
   392   equational system as follows