diff -r 8bbe2468fedc -r 1612f2a77bf6 thys2/Paper/document/root.tex
--- a/thys2/Paper/document/root.tex	Sun Jan 30 01:03:26 2022 +0000
+++ b/thys2/Paper/document/root.tex	Sun Jan 30 21:21:24 2022 +0000
@@ -63,20 +63,19 @@
   expressions. Their algorithm generates POSIX values which encode
   the information of \emph{how} a regular expression matches a
   string---that is, which part of the string is matched by which part
-  of the regular expression. The purpose of the bitcodes in Sulzmann
-  and Lu's algorithm is to generate POSIX values incrementally while
-  derivatives are calculated. However they also help with designing
-  `aggressive' simplification functions that keep the size of
+  of the regular expression. The purpose of the bitcodes is to generate POSIX values incrementally while
+  derivatives are calculated. They also help with designing
+  an `aggressive' simplification function that keeps the size of
   derivatives small. Without simplification derivatives can grow
   exponentially resulting in an extremely slow lexing algorithm. In this
   paper we describe a variant of Sulzmann and Lu's algorithm: Our
-  algorithm is a small, recursive functional program, whereas Sulzmann
+  algorithm is a recursive functional program, whereas Sulzmann
   and Lu's version involves a fixpoint construction. We
   \textit{(i)} prove in Isabelle/HOL that our program is correct and generates
   unique POSIX values; we also \textit{(ii)} establish a polynomial
   bound for the size of the derivatives. The size can be seen as a
-  proxy measure for the efficiency of the lexing algorithm---that means
-  because of the polynomial bound our algorithm does not suffer from
+  proxy measure for the efficiency of the lexing algorithm: because of
+  the polynomial bound our algorithm does not suffer from
   the exponential blowup in earlier works.
   %
   Brzozowski introduced the notion of derivatives for regular
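
As context for the abstract's claim that derivatives grow without simplification, the following is a minimal Scala sketch of plain (unbitcoded) Brzozowski derivatives together with a simple simplification step. It is not the paper's Isabelle/HOL formalisation nor Sulzmann and Lu's bitcoded algorithm; the names Rexp, der, simp and matches are illustrative only.

// Regular expressions (illustrative sketch, Scala 3 top-level definitions)
abstract class Rexp
case object ZERO extends Rexp                    // matches nothing
case object ONE extends Rexp                     // matches the empty string
case class CHR(c: Char) extends Rexp             // matches a single character
case class ALT(r1: Rexp, r2: Rexp) extends Rexp  // alternative
case class SEQ(r1: Rexp, r2: Rexp) extends Rexp  // sequence
case class STAR(r: Rexp) extends Rexp            // Kleene star

// nullable(r): does r match the empty string?
def nullable(r: Rexp): Boolean = r match {
  case ZERO        => false
  case ONE         => true
  case CHR(_)      => false
  case ALT(r1, r2) => nullable(r1) || nullable(r2)
  case SEQ(r1, r2) => nullable(r1) && nullable(r2)
  case STAR(_)     => true
}

// Brzozowski derivative of r with respect to the character c
def der(c: Char, r: Rexp): Rexp = r match {
  case ZERO        => ZERO
  case ONE         => ZERO
  case CHR(d)      => if (c == d) ONE else ZERO
  case ALT(r1, r2) => ALT(der(c, r1), der(c, r2))
  case SEQ(r1, r2) =>
    if (nullable(r1)) ALT(SEQ(der(c, r1), r2), der(c, r2))
    else SEQ(der(c, r1), r2)
  case STAR(r1)    => SEQ(der(c, r1), STAR(r1))
}

// A mild simplification step; without some such step the nested
// ALTs and SEQs produced by der accumulate as more characters are consumed
def simp(r: Rexp): Rexp = r match {
  case ALT(r1, r2) => (simp(r1), simp(r2)) match {
    case (ZERO, s2) => s2
    case (s1, ZERO) => s1
    case (s1, s2)   => if (s1 == s2) s1 else ALT(s1, s2)
  }
  case SEQ(r1, r2) => (simp(r1), simp(r2)) match {
    case (ZERO, _)  => ZERO
    case (_, ZERO)  => ZERO
    case (ONE, s2)  => s2
    case (s1, ONE)  => s1
    case (s1, s2)   => SEQ(s1, s2)
  }
  case _ => r
}

// matcher: take the derivative for each character, simplifying as we go
def matches(r: Rexp, s: String): Boolean =
  nullable(s.foldLeft(r)((r1, c) => simp(der(c, r1))))

For example, deriving STAR(STAR(CHR('a'))) repeatedly against a long string of a's produces ever more deeply nested ALT/SEQ terms if simp is dropped from the fold above; interleaving a simplification step keeps the intermediate expressions small, which is the role the revised abstract text assigns to the `aggressive' simplification function.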