thys2/Paper/Paper.thy
changeset 400 46e5566ad4ba
parent 398 dac6d27c99c6
child 402 1612f2a77bf6
equal deleted inserted replaced
399:d73f19be3ce6 400:46e5566ad4ba
    32 
    32 
    33 section {* Introduction *}
    33 section {* Introduction *}
    34 
    34 
    35 text {*
    35 text {*
    36 
    36 
    37 Brzozowski \cite{Brzozowski1964} introduced the notion of the {\em
    37 In the last fifteen or so years, Brzozowski's derivatives of regular
    38 derivative} @{term "der c r"} of a regular expression \<open>r\<close> w.r.t.\
    38 expressions have sparked quite a bit of interest in the functional
    39 a character~\<open>c\<close>, and showed that it gave a simple solution to the
    39 programming and theorem prover communities.  The beauty of
    40 problem of matching a string @{term s} with a regular expression @{term
    40 Brzozowski's derivatives \cite{Brzozowski1964} is that they are neatly
    41 r}: if the derivative of @{term r} w.r.t.\ (in succession) all the
    41 expressible in any functional language, and easily definable and
    42 characters of the string matches the empty string, then @{term r}
    42 reasoned about in theorem provers---the definitions just consist of
    43 matches @{term s} (and {\em vice versa}). The derivative has the
    43 inductive datatypes and simple recursive functions. A mechanised
    44 property (which may almost be regarded as its specification) that, for
    44 correctness proof of Brzozowski's matcher in for example HOL4 has been
    45 every string @{term s} and regular expression @{term r} and character
    45 mentioned by Owens and Slind~\cite{Owens2008}. Another one in
    46 @{term c}, one has @{term "cs \<in> L(r)"} if and only if \mbox{@{term "s \<in> L(der c r)"}}. 
    46 Isabelle/HOL is part of the work by Krauss and Nipkow
    47 The beauty of Brzozowski's derivatives is that
    47 \cite{Krauss2011}.  And another one in Coq is given by Coquand and
    48 they are neatly expressible in any functional language, and easily
    48 Siles \cite{Coquand2012}.
    49 definable and reasoned about in theorem provers---the definitions just
    49 
    50 consist of inductive datatypes and simple recursive functions. A
    50 
    51 mechanised correctness proof of Brzozowski's matcher in for example HOL4
    51 The notion of derivatives
    52 has been mentioned by Owens and Slind~\cite{Owens2008}. Another one in
    52 \cite{Brzozowski1964}, written @{term "der c r"}, of a regular
    53 Isabelle/HOL is part of the work by Krauss and Nipkow \cite{Krauss2011}.
    53 expression give a simple solution to the problem of matching a string
    54 And another one in Coq is given by Coquand and Siles \cite{Coquand2012}.
    54 @{term s} with a regular expression @{term r}: if the derivative of
       
    55 @{term r} w.r.t.\ (in succession) all the characters of the string
       
    56 matches the empty string, then @{term r} matches @{term s} (and {\em
       
    57 vice versa}). The derivative has the property (which may almost be
       
    58 regarded as its specification) that, for every string @{term s} and
       
    59 regular expression @{term r} and character @{term c}, one has @{term
       
    60 "cs \<in> L(r)"} if and only if \mbox{@{term "s \<in> L(der c r)"}}.
       
    61 
    55 
    62 
    56 If a regular expression matches a string, then in general there is more
    63 If a regular expression matches a string, then in general there is more
    57 than one way of how the string is matched. There are two commonly used
    64 than one way of how the string is matched. There are two commonly used
    58 disambiguation strategies to generate a unique answer: one is called
    65 disambiguation strategies to generate a unique answer: one is called
    59 GREEDY matching \cite{Frisch2004} and the other is POSIX
    66 GREEDY matching \cite{Frisch2004} and the other is POSIX
   156 
   163 
   157 
   164 
   158 section {* Simplification *}
   165 section {* Simplification *}
   159 
   166 
   160 text {*
   167 text {*
       
   168      Sulzmann \& Lu apply simplification via a fixpoint operation; also does not use erase to filter out duplicates.
       
   169 
   161    not direct correspondence with PDERs, because of example
   170    not direct correspondence with PDERs, because of example
   162    problem with retrieve 
   171    problem with retrieve 
   163 
   172 
   164    correctness
   173    correctness
   165 
   174