# HG changeset patch
# User Chengsong
# Date 1670458796 0
# Node ID bdb3ffdd39f893532fefa73b2cd24d9353846a67
# Parent  d50a309a0645afbfbd0ed4c74876c6f0fdd1982f
more

diff -r d50a309a0645 -r bdb3ffdd39f8 ChengsongTanPhdThesis/Chapters/Introduction.tex
--- a/ChengsongTanPhdThesis/Chapters/Introduction.tex	Mon Dec 05 17:39:10 2022 +0000
+++ b/ChengsongTanPhdThesis/Chapters/Introduction.tex	Thu Dec 08 00:19:56 2022 +0000
@@ -1245,8 +1245,15 @@
 \noindent
 With a formal finiteness bound in place,
 we can greatly reduce the attack surface of servers in terms of ReDoS attacks.
+The Isabelle/HOL code for our formalisation can be
+found at
+\begin{center}
+\url{https://github.com/hellotommmy/posix}
+\end{center}
 Further improvements to the algorithm with an even stronger version of
-simplification can be made.
+simplification can be made. We conjecture that the resulting size of derivatives
+can be bounded by a cubic function of the size of the regular expression.
+
@@ -1267,10 +1274,11 @@
 its correctness proof .
 In chapter \ref{Finite} we give the second guarantee
 of our bitcoded algorithm, that is a finite bound on the size of any
-regular expression's derivatives.
+regular expression's derivatives.
+We also show how one can extend the
+algorithm to include bounded repetitions.
 In chapter \ref{Cubic} we discuss stronger simplification rules which
-improve the finite bound to a polynomial bound, and also show how one can extend the
-algorithm to include bounded repetitions. %and the NOT regular expression.
+improve the finite bound to a cubic bound. %and the NOT regular expression.

diff -r d50a309a0645 -r bdb3ffdd39f8 ChengsongTanPhdThesis/Chapters/RelatedWork.tex
--- a/ChengsongTanPhdThesis/Chapters/RelatedWork.tex	Mon Dec 05 17:39:10 2022 +0000
+++ b/ChengsongTanPhdThesis/Chapters/RelatedWork.tex	Thu Dec 08 00:19:56 2022 +0000
@@ -120,11 +120,16 @@
 causes the transitivity of their ordering to not go through.
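(Editorial note for the chapter text: the added lines above mention extending the algorithm to bounded repetitions. The standard, unbitcoded derivative clauses for stars and bounded repetitions can be sketched as below; this is the textbook Brzozowski form, not necessarily the exact bitcoded definitions of the formalisation, and the notation $\mathbf{0}$ for the empty language is ours.)

```latex
% Standard Brzozowski-style derivative clauses (illustrative sketch;
% the bitcoded definitions in the formalisation carry extra annotations).
\[
\textit{der}\;c\;(r^*) \;=\; (\textit{der}\;c\;r)\cdot r^*
\qquad
\textit{der}\;c\;(r^{\{n\}}) \;=\;
\begin{cases}
  \mathbf{0} & \text{if } n = 0\\[2pt]
  (\textit{der}\;c\;r)\cdot r^{\{n-1\}} & \text{otherwise}
\end{cases}
\]
```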
-\subsection{Static Analysis of Evil Regex Patterns}
-When a regular expression does not behave as intended,
-people usually try to rewrite the regex to some equivalent form
+When a regular expression engine does not behave as intended (for example,
+getting excruciatingly slow on
+a relatively simple regex pattern $(a^*)^*b$
+and a string $a\ldots a$ that is obviously
+not matched by that regex)
+programmers usually try to rewrite the regex to some equivalent form
+%
 or they try to avoid the possibly problematic patterns completely,
 for which many false positives exist\parencite{Davis18}.
+\subsection{Static Analysis of Evil Regex Patterns}
 Animated tools to "debug" regular expressions
 such as \parencite{regexploit2021}
 \parencite{regex101} are also popular.
 We are also aware of static analysis work on regular expressions that
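(Editorial note: the "evil" pattern $(a^*)^*b$ mentioned in the added lines above can be made concrete with a minimal derivative-based matcher. The sketch below uses plain Brzozowski derivatives with basic simplification rules, including idempotence of alternatives; it is an illustration in our own notation, not the bitcoded POSIX algorithm of the thesis, and all class and function names are ours. With simplification, matching $(a^*)^*b$ against long strings of $a$s stays fast, whereas a backtracking engine degrades exponentially.)

```python
# A minimal sketch of Brzozowski-derivative matching with simplification.
# Illustrative only: this is NOT the bitcoded POSIX algorithm from the
# thesis; names and structure are our own.
from dataclasses import dataclass


class Re:
    """Base class for regular expressions."""


@dataclass(frozen=True)
class Zero(Re):      # the empty language
    pass


@dataclass(frozen=True)
class One(Re):       # the empty string
    pass


@dataclass(frozen=True)
class Chr(Re):
    c: str


@dataclass(frozen=True)
class Alt(Re):
    r1: Re
    r2: Re


@dataclass(frozen=True)
class Seq(Re):
    r1: Re
    r2: Re


@dataclass(frozen=True)
class Star(Re):
    r: Re


def nullable(r: Re) -> bool:
    """Does r match the empty string?"""
    if isinstance(r, (One, Star)):
        return True
    if isinstance(r, Alt):
        return nullable(r.r1) or nullable(r.r2)
    if isinstance(r, Seq):
        return nullable(r.r1) and nullable(r.r2)
    return False  # Zero, Chr


def simp(r: Re) -> Re:
    """Basic simplification: 0/1 laws plus idempotence of +."""
    if isinstance(r, Alt):
        r1, r2 = simp(r.r1), simp(r.r2)
        if isinstance(r1, Zero):
            return r2
        if isinstance(r2, Zero):
            return r1
        if r1 == r2:          # r + r = r, crucial for keeping sizes small
            return r1
        return Alt(r1, r2)
    if isinstance(r, Seq):
        r1, r2 = simp(r.r1), simp(r.r2)
        if isinstance(r1, Zero) or isinstance(r2, Zero):
            return Zero()
        if isinstance(r1, One):
            return r2
        if isinstance(r2, One):
            return r1
        return Seq(r1, r2)
    return r


def der(c: str, r: Re) -> Re:
    """Brzozowski derivative of r with respect to character c."""
    if isinstance(r, Chr):
        return One() if r.c == c else Zero()
    if isinstance(r, Alt):
        return Alt(der(c, r.r1), der(c, r.r2))
    if isinstance(r, Seq):
        if nullable(r.r1):
            return Alt(Seq(der(c, r.r1), r.r2), der(c, r.r2))
        return Seq(der(c, r.r1), r.r2)
    if isinstance(r, Star):
        return Seq(der(c, r.r), r)
    return Zero()  # Zero, One


def matches(r: Re, s: str) -> bool:
    """Match by repeated derivatives, simplifying after each step."""
    for c in s:
        r = simp(der(c, r))
    return nullable(r)


# The "evil" pattern (a*)*b: derivative matching reaches a fixed point
# after one 'a' and stays fast, where a backtracking engine blows up.
evil = Seq(Star(Star(Chr('a'))), Chr('b'))
```

Note how the idempotence rule `r + r = r` is what keeps the derivatives of $(a^*)^*b$ from growing: after the first `a` the simplified derivative no longer changes, so the loop runs in time linear in the input length.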