--- a/thys2/Paper/Paper.thy Mon Mar 21 10:44:58 2022 +0000
+++ b/thys2/Paper/Paper.thy Tue Mar 22 09:40:24 2022 +0000
@@ -1421,11 +1421,22 @@
Given the growth of the
derivatives in some cases even after aggressive simplification, this
is a hard-to-believe fact. A similar claim about a theoretical runtime
- of @{text "O(n\<^sup>2)"} is made for the Verbatim lexer, which calculates POSIX matches and is based on
- derivatives \cite{verbatim}. In this case derivatives are not simplified.
+ of @{text "O(n\<^sup>2)"} is made for the Verbatim lexer, which calculates
+ POSIX matches and is based on
+ derivatives \cite{verbatim}. In this case derivatives are not even simplified.
+ They write: ``For a specific list of lexical rules, Verbatim has quadratic
+ theoretical time complexity with respect to the length of the input string.''
+ While their correctness proof for Verbatim is formalised in Coq, the claim about
+ the runtime complexity is only supported by empirical evidence.
+ When we
+ tried out their extracted OCaml code with our example
+ \mbox{@{text "(a + aa)\<^sup>*"}}, it took around 5 minutes to tokenise a
+ string of 40 $a$'s and that increased to approximately 19 minutes when the
+ string was 50 $a$'s long. Given that derivatives are not simplified in
+ the work on Verbatim, such numbers are not surprising.
Clearly our result of having finite
- derivatives is rather weak in this context but we think such effeciency claims
- require further scrutiny.\medskip
+ derivatives might sound rather weak in this context but we think such efficiency claims
+ really require further scrutiny.\medskip
\noindent
Our Isabelle/HOL code is available under \url{https://github.com/urbanchr/posix}.
Binary file thys2/paper.pdf has changed
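The growth phenomenon mentioned in the added paragraph can be reproduced with a small sketch. The following Python snippet (an illustrative reimplementation, not the Verbatim authors' extracted OCaml code) computes unsimplified Brzozowski derivatives of \mbox{@{text "(a + aa)\<^sup>*"}} with respect to successive @{text a}'s and prints the size of each derivative term; without simplification the terms grow monotonically, which is consistent with the observed slowdown.

```python
# Illustrative sketch: Brzozowski derivatives WITHOUT simplification
# for the regex (a + aa)*, showing how derivative terms grow as more
# a's are consumed. Regexes are tuples:
#   ('0',) empty language, ('1',) empty string, ('c', ch) literal,
#   ('+', r, s) alternative, ('.', r, s) sequence, ('*', r) star.

def nullable(r):
    tag = r[0]
    if tag in ('1', '*'):
        return True
    if tag == '+':
        return nullable(r[1]) or nullable(r[2])
    if tag == '.':
        return nullable(r[1]) and nullable(r[2])
    return False  # '0' and 'c'

def deriv(r, ch):
    tag = r[0]
    if tag in ('0', '1'):
        return ('0',)
    if tag == 'c':
        return ('1',) if r[1] == ch else ('0',)
    if tag == '+':
        return ('+', deriv(r[1], ch), deriv(r[2], ch))
    if tag == '.':
        left = ('.', deriv(r[1], ch), r[2])
        return ('+', left, deriv(r[2], ch)) if nullable(r[1]) else left
    # star: d(r*) = (d r) . r*
    return ('.', deriv(r[1], ch), r)

def size(r):
    # number of constructors in the regex term
    return 1 + sum(size(s) for s in r[1:] if isinstance(s, tuple))

# (a + aa)*
r = ('*', ('+', ('c', 'a'), ('.', ('c', 'a'), ('c', 'a'))))
sizes, d = [], r
for _ in range(10):
    d = deriv(d, 'a')
    sizes.append(size(d))
print(sizes)  # strictly increasing term sizes
```

Each derivative here contains the previous one as a subterm plus additional material, so the sizes grow without bound; a simplification phase (as used in our development) is what keeps such terms finite.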