lexing: comparison thys3/Paper.thy


-:57e33978e55d
+:c92a41d9c4da
 ONE ("\<^bold>1" 81) and
 CH ("_" [1000] 80) and
 ALT ("_ + _" [77,77] 78) and
 SEQ ("_ \<cdot> _" [77,77] 78) and
 STAR ("_\<^sup>*" [79] 78) and
+NTIMES ("_\<^sup>{\<^sup>_\<^sup>}" [79] 78) and
 val.Void ("Empty" 78) and
 val.Char ("Char _" [1000] 78) and
 val.Left ("Left _" [79] 78) and
 val.Right ("Right _" [1000] 78) and
 expressions have sparked quite a bit of interest in the functional
 programming and theorem prover communities.  The beauty of
 Brzozowski's derivatives \cite{Brzozowski1964} is that they are neatly
 expressible in any functional language, and easily definable and
 reasoned about in theorem provers---the definitions just consist of
-inductive datatypes and simple recursive functions.  Derivatives of a
+inductive datatypes and simple recursive functions. Another neat
+feature of derivatives is that they can be easily extended to bounded
+regular expressions, such as @{term "NTIMES r n"}, where numbers or
+intervals specify how many times a regular expression should be used
+during matching.
+Derivatives of a
 regular expression, written @{term "der c r"}, give a simple solution
 to the problem of matching a string @{term s} with a regular
 expression @{term r}: if the derivative of @{term r} w.r.t.\ (in
 succession) all the characters of the string matches the empty string,
 then @{term r} matches @{term s} (and {\em vice versa}).  We are aware
 @{const "ZERO"} $\mid$
 @{const "ONE"} $\mid$
 @{term "CH c"} $\mid$
 @{term "ALT r\<^sub>1 r\<^sub>2"} $\mid$
 @{term "SEQ r\<^sub>1 r\<^sub>2"} $\mid$
-@{term "STAR r"}
+@{term "STAR r"} $\mid$
+@{term "NTIMES r n"}
 \end{center}
 \noindent where @{const ZERO} stands for the regular expression that does
 not match any string, @{const ONE} for the regular expression that matches
 only the empty string and @{term c} for matching a character literal.
 The constructors $+$ and $\cdot$ represent alternatives and sequences, respectively.
 We sometimes omit the $\cdot$ in a sequence regular expression for brevity.
-The
+In this work we also add to the usual ``basic'' regular expressions
+the bounded regular expression @{term "NTIMES r n"}, where @{term n}
+specifies that @{term r} should match exactly @{term n} times. For
+brevity we omit the other bounded regular expressions
+@{text "r"}$^{\{..n\}}$, @{text "r"}$^{\{n..\}}$ and @{text "r"}$^{\{n..m\}}$,
+which specify an interval for how many times @{text r} should match. Our
+results also extend straightforwardly to them. The
 \emph{language} of a regular expression, written $L(r)$, is defined as usual
 and we omit its definition here (see for example \cite{AusafDyckhoffUrban2016}).
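
\noindent As a hedged illustration of the clause one would expect for the
bounded regular expression (a sketch, not quoted from the omitted
definition), assume that $A^n$ denotes the $n$-fold concatenation of a
language $A$ with itself, with $A^0$ containing only the empty string.
Under this assumption the language of $r^{\{n\}}$ and a small worked
instance would read
\begin{center}
\begin{tabular}{lcl}
$L(r^{\{n\}})$ & $=$ & $L(r)^n$\\
$L((a + b)^{\{2\}})$ & $=$ & $\{aa,\; ab,\; ba,\; bb\}$
\end{tabular}
\end{center}
\noindent so that, for example, $(a + b)^{\{2\}}$ matches neither $a$ nor $aba$.
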
 Central to Brzozowski's regular expression matcher are two functions
 called @{text nullable} and \emph{derivative}. The latter is written

changeset 563	c92a41d9c4da
parent 502	1ab693d6342f
child 569	5af61c89f51e