ChengsongTanPhdThesis/main.tex
changeset 653 bc5571c38d1f
parent 646 56057198e4f5
child 664 ba44144875b1
--- a/ChengsongTanPhdThesis/main.tex	Tue Jun 27 16:26:48 2023 +0100
+++ b/ChengsongTanPhdThesis/main.tex	Thu Jun 29 04:17:48 2023 +0100
@@ -291,10 +291,9 @@
 \begin{abstract}
 \addchaptertocentry{\abstractname} % Add the abstract to the table of contents
 %\addchap{Abstract} 
-\textbf{Problem: not like an abstract, more like a summary}
-\textbf{Goal for new abstract: more high-level and abstract, tell the problem and solution in a concise way.}
-
-New abstract:\\
+\marginpar{\em Reviewer feedback: the current abstract reads more like a summary than an abstract.\\
+Goal for the new abstract: be more high-level, and state the problem and solution concisely.
+}
 %Modern computer systems rely on lexing for essential applications such as
 %compilers, IDEs, file systems, and network intrusion detection systems.
 %These applications require correctness with respect to
@@ -316,26 +315,26 @@
 
 
 Old abstract:
-This thesis is about regular expressions and derivatives. It combines functional algorithms and their formal verification in the Isabelle/HOL theorem prover. 
-Classic results say that regular expression matching should be 
-linear with respect to the input. The size of the regular expressions
-are often treated as a constant factor.
-However with some regular expressions and inputs, existing implementations 
-often suffer from non-linear or even exponential running time, 
-giving rise to ReDoS (regular expression denial-of-service ) attacks. 
-To avoid these attacks, regular expression matchers and lexers with 
-formalised correctness and running time related
-properties are of interest because the guarantees apply to all inputs, not just a finite
-number of empirical test cases.
-
-Sulzmann and Lu describe a lexing algorithm that calculates Brzozowski derivatives using bitcodes annotated to regular expressions. Their algorithm generates POSIX values which encode the information of how a regular expression matches a string—that is, which part of the string is matched by which part of the regular expression. This information is needed in the context of lexing in order to extract and to classify tokens. The purpose of the bitcodes is to generate POSIX values incrementally while derivatives are calculated. They also help with designing an “aggressive” simplification function that keeps the size of derivatives finitely bounded. 
-Our simplification function is more aggressive than the one by Sulzmann and Lu.
-We also fix a problem in Sulzmann and Lu's simplification to do with their use of
-the $\textit{nub}$ function which does not remove non-trivial duplicates.
-Without simplification the size of some derivatives can grow arbitrarily big resulting in an extremely slow lexing algorithm.
-In this thesis we describe a variant of Sulzmann and Lu’s algorithm: Our variant is a recursive functional program, whereas Sulzmann and Lu’s version involves a fixpoint construction. We (i) prove in Isabelle/HOL that our algorithm is correct and generates unique POSIX values; we also (ii) establish a finite bound for the size of the derivatives for every input string; we also
-(iii) give a program and a conjecture that the finite 
-bound can be improved to be cubic if stronger simplification rules are applied. 
+%This thesis is about regular expressions and derivatives. It combines functional algorithms and their formal verification in the Isabelle/HOL theorem prover. 
+%Classic results say that regular expression matching should be 
+%linear with respect to the input. The size of the regular expressions
+%is often treated as a constant factor.
+%However, with some regular expressions and inputs, existing implementations 
+%often suffer from non-linear or even exponential running time, 
+%giving rise to ReDoS (regular expression denial-of-service) attacks. 
+%To avoid these attacks, regular expression matchers and lexers with 
+%formalised correctness and running-time-related
+%properties are of interest because the guarantees apply to all inputs, not just a finite
+%number of empirical test cases.
+%
+%Sulzmann and Lu describe a lexing algorithm that calculates Brzozowski derivatives using bitcodes annotated to regular expressions. Their algorithm generates POSIX values which encode the information of how a regular expression matches a string—that is, which part of the string is matched by which part of the regular expression. This information is needed in the context of lexing in order to extract and to classify tokens. The purpose of the bitcodes is to generate POSIX values incrementally while derivatives are calculated. They also help with designing an “aggressive” simplification function that keeps the size of derivatives finitely bounded. 
+%Our simplification function is more aggressive than the one by Sulzmann and Lu.
+%We also fix a problem in Sulzmann and Lu's simplification to do with their use of
+%the $\textit{nub}$ function which does not remove non-trivial duplicates.
+%Without simplification the size of some derivatives can grow arbitrarily big resulting in an extremely slow lexing algorithm.
+%In this thesis we describe a variant of Sulzmann and Lu’s algorithm: our variant is a recursive functional program, whereas Sulzmann and Lu’s version involves a fixpoint construction. We (i) prove in Isabelle/HOL that our algorithm is correct and generates unique POSIX values; (ii) establish a finite bound on the size of the derivatives for every input string; and
+%(iii) give a program and a conjecture that this finite
+%bound can be improved to cubic if stronger simplification rules are applied.
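
For context on the derivative-based lexing that the commented-out abstract refers to, the following is a minimal sketch in Scala of the classic (unannotated, unsimplified) Brzozowski derivative matcher. It is not the bitcoded, formally verified algorithm developed in the thesis; all names (Rexp, nullable, der, matches) are chosen here purely for illustration.

object BrzozowskiSketch {

  // Regular expressions as an algebraic data type.
  sealed trait Rexp
  case object ZERO extends Rexp                    // matches no string
  case object ONE extends Rexp                     // matches only the empty string
  case class CHR(c: Char) extends Rexp             // matches the single character c
  case class ALT(r1: Rexp, r2: Rexp) extends Rexp  // alternative r1 + r2
  case class SEQ(r1: Rexp, r2: Rexp) extends Rexp  // sequence r1 r2
  case class STAR(r: Rexp) extends Rexp            // Kleene star r*

  // nullable(r): can r match the empty string?
  def nullable(r: Rexp): Boolean = r match {
    case ZERO        => false
    case ONE         => true
    case CHR(_)      => false
    case ALT(r1, r2) => nullable(r1) || nullable(r2)
    case SEQ(r1, r2) => nullable(r1) && nullable(r2)
    case STAR(_)     => true
  }

  // der(c, r): the Brzozowski derivative of r with respect to the character c,
  // that is, a regular expression for the strings s such that c followed by s is matched by r.
  def der(c: Char, r: Rexp): Rexp = r match {
    case ZERO        => ZERO
    case ONE         => ZERO
    case CHR(d)      => if (c == d) ONE else ZERO
    case ALT(r1, r2) => ALT(der(c, r1), der(c, r2))
    case SEQ(r1, r2) =>
      if (nullable(r1)) ALT(SEQ(der(c, r1), r2), der(c, r2))
      else SEQ(der(c, r1), r2)
    case STAR(r1)    => SEQ(der(c, r1), STAR(r1))
  }

  // Matching: take derivatives by every character of the input, then test nullability.
  // Without simplification the intermediate derivatives can grow very large,
  // which is the size problem that the simplification discussed in the abstract addresses.
  def matches(r: Rexp, s: String): Boolean =
    nullable(s.foldLeft(r)((r1, c) => der(c, r1)))

  // Example: (a + b)* followed by c, matched against "abbac"
  // matches(SEQ(STAR(ALT(CHR('a'), CHR('b'))), CHR('c')), "abbac") == true
}

The SEQ and STAR cases of der can duplicate subexpressions on every step, which is why, as the abstract notes, derivatives can grow arbitrarily large without simplification and why the thesis's aggressive simplification and size bound matter.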