ChengsongTanPhdThesis/main.tex
changeset 526 cb702fb4227f
parent 519 856d025dbc15
child 527 2c907b118f78
--- a/ChengsongTanPhdThesis/main.tex	Sat May 28 01:04:56 2022 +0100
+++ b/ChengsongTanPhdThesis/main.tex	Sat May 28 16:29:32 2022 +0100
@@ -211,24 +211,23 @@
 Theoretical results say that regular expression matching
 should be linear with respect to the input.
 Under a certain class of regular expressions and inputs though,
-practical implementations  suffer from non-linear or even 
+practical implementations often suffer from non-linear or even 
 exponential running time,
-allowing a ReDoS (regular expression denial-of-service ) attack.
-
+allowing ReDoS (regular expression denial-of-service ) attacks.
+The reason behind this is that regex libraries in popular programming
+languages often want to support richer constructs
+than the usual basic regular expressions. Unfortunately, this means that even simple cases
+are "infected" with atrocious running time.
+ 
 
-The reason behind is that regex libraries in popular language engines
- often want to support richer constructs
-than  the most basic regular expressions, and lexing rather than matching
-is needed for sub-match extraction.
-
-This work aims to address the above vulnerability by the combination
+This work aims to address the above vulnerability by a combination
 of Brzozowski's derivatives and interactive theorem proving. We give an 
 improved version of  Sulzmann and Lu's bit-coded algorithm using 
-derivatives, which come with a formal guarantee in terms of correctness and 
-running time as an Isabelle/HOL proof.
-Then we improve the algorithm with an even stronger version of 
+derivatives, and give a formal guarantee in terms of correctness and 
+size of derivatives, which we formalised in Isabelle/HOL.
+Then we improve the algorithm with a stronger version of 
 simplification, and prove a time bound linear to input and
-cubic to regular expression size using a technique by
+cubic to regular expression size using a technique introduced by
 Antimirov.
 The result is an algorithm with provable guarantees on 
 correctness and running time. We believe this is the first