--- a/thys2/Paper/document/root.tex Tue Feb 08 01:25:26 2022 +0000
+++ b/thys2/Paper/document/root.tex Tue Feb 08 14:29:41 2022 +0000
@@ -63,7 +63,7 @@
\author{Chengsong Tan}{King's College London}{chengsong.tan@kcl.ac.uk}{}{}
\author{Christian Urban}{King's College London}{christian.urban@kcl.ac.uk}{}{}
\authorrunning{C.~Tan and C.~Urban}
-\keywords{POSIX matching, Derivatives of Regular Expressions, Isabelle/HOL}
+\keywords{POSIX matching and lexing, Derivatives of Regular Expressions, Isabelle/HOL}
\category{}
\ccsdesc[100]{Design and analysis of algorithms}
\ccsdesc[100]{Formal languages and automata theory}
@@ -75,15 +75,17 @@
\maketitle
\begin{abstract}
- Sulzmann and Lu described a lexing algorithm that calculates
+ Sulzmann and Lu describe a lexing algorithm that calculates
Brzozowski derivatives using bitcodes annotated to regular
expressions. Their algorithm generates POSIX values which encode
the information of \emph{how} a regular expression matches a
string---that is, which part of the string is matched by which part
- of the regular expression. The purpose of the bitcodes is to generate POSIX values incrementally while
+ of the regular expression. This information is needed in the
+ context of lexing in order to extract and to classify tokens.
+ The purpose of the bitcodes is to generate POSIX values incrementally while
derivatives are calculated. They also help with designing
an ``aggressive'' simplification function that keeps the size of
- derivatives finite. Without simplification the size derivatives can grow
+ derivatives finite. Without simplification the size of some derivatives can grow
arbitrarily big resulting in an extremely slow lexing algorithm. In this
paper we describe a variant of Sulzmann and Lu's algorithm: Our
algorithm is a recursive functional program, whereas Sulzmann