thys2/Paper/document/root.tex
changeset 397 e1b74d618f1b
parent 396 cc8e231529fb
child 398 dac6d27c99c6
--- a/thys2/Paper/document/root.tex	Tue Jan 25 13:12:50 2022 +0000
+++ b/thys2/Paper/document/root.tex	Thu Jan 27 23:25:26 2022 +0000
@@ -45,33 +45,50 @@
 \titlerunning{POSIX Lexing with Bitcoded Derivatives}
 \author{Chengsong Tan}{King's College London}{chengsong.tan@kcl.ac.uk}{}{}
 \author{Christian Urban}{King's College London}{christian.urban@kcl.ac.uk}{}{}
+\keywords{POSIX matching, Derivatives of Regular Expressions, Isabelle/HOL}
+\category{}
+\ccsdesc[100]{Design and analysis of algorithms}
+\ccsdesc[100]{Formal languages and automata theory}
+\Copyright{\mbox{}}
+\nolinenumbers
 
 
 \begin{document}
 \maketitle
 
 \begin{abstract}
-Brzozowski introduced the notion of derivatives for regular
-expressions. They can be used for a very simple regular expression
-matching algorithm.  Sulzmann and Lu cleverly extended this algorithm
-in order to deal with POSIX matching, which is the underlying
-disambiguation strategy for regular expressions needed in lexers.
-Their algorithm generates POSIX values which encode the information of
-\emph{how} a regular expression matches a string---that is, which part
-of the string is matched by which part of the regular expression.  In
-this paper we give our inductive definition of what a POSIX value is
-and show $(i)$ that such a value is unique (for given regular
-expression and string being matched) and $(ii)$ that Sulzmann and Lu's
-algorithm always generates such a value (provided that the regular
-expression matches the string). We show that $(iii)$ our inductive
-definition of a POSIX value is equivalent to an alternative definition
-by Okui and Suzuki which identifies POSIX values as least elements
-according to an ordering of values.  We also prove the correctness of
-Sulzmann's bitcoded version of the POSIX matching algorithm and extend the
-results to additional constructors for regular expressions.  \smallskip
-
-{\bf Keywords:} POSIX matching, Derivatives of Regular Expressions,
-Isabelle/HOL
+  Sulzmann and Lu described a lexing algorithm that calculates
+  Brzozowski derivatives using bit-sequences annotated to regular
+  expressions.  Their algorithm generates POSIX values which encode
+  the information of \emph{how} a regular expression matches a
+  string---that is, which part of the string is matched by which part
+  of the regular expression.  The purpose of the bit-sequences in
+  Sulzmann and Lu's algorithm is to keep the size of derivatives small
+  which is achieved by `aggressively' simplifying regular expressions.
+  In this paper we describe a slight variant of Sulzmann and Lu's
+  algorithm and \textit{(i)} prove that this algorithm generates
+  unique POSIX values; \textit{(ii)} we also establish a cubic bound
+  for the size of the derivatives---in earlier works, derivatives can
+  grow exponentially even after simplification.
+  
+%Brzozowski introduced the notion of derivatives for regular
+%expressions. They can be used for a very simple regular expression
+%matching algorithm.  Sulzmann and Lu cleverly extended this algorithm
+%in order to deal with POSIX matching, which is the underlying
+%disambiguation strategy for regular expressions needed in lexers.
+%Their algorithm generates POSIX values which encode the information of
+%\emph{how} a regular expression matches a string---that is, which part
+%of the string is matched by which part of the regular expression.  In
+%this paper we give our inductive definition of what a POSIX value is
+%and show $(i)$ that such a value is unique (for given regular
+%expression and string being matched) and $(ii)$ that Sulzmann and Lu's
+%algorithm always generates such a value (provided that the regular
+%expression matches the string). We show that $(iii)$ our inductive
+%definition of a POSIX value is equivalent to an alternative definition
+%by Okui and Suzuki which identifies POSIX values as least elements
+%according to an ordering of values.  We also prove the correctness of
+%Sulzmann's bitcoded version of the POSIX matching algorithm and extend the
+%results to additional constructors for regular expressions.  \smallskip
 \end{abstract}