lexing: comparison thys2/Paper/Paper.thy

equal deleted inserted replaced

-:30c91ea7095b
+:484403cf0c9d
 bitcoded parse trees in linear time in the size of the input''
 \cite[Page 14]{Sulzmann2014}.
 Given the growth of the
 derivatives in some cases even after aggressive simplification, this
 is a hard to believe fact. A similar claim about a theoretical runtime
-of @{text "O(n\<^sup>2)"} is made for the Verbatim lexer, which calculates POSIX matches and is based on
+of @{text "O(n\<^sup>2)"} is made for the Verbatim lexer, which calculates
-derivatives \cite{verbatim}. In this case derivatives are not simplified.
+POSIX matches and is based on
+derivatives \cite{verbatim}. In this case derivatives are not even simplified.
+They write: ``For a specific list of lexical rules, Verbatim has quadratic
+theoretical time complexity with respect to the length of the input string.''
+While their correctness proof for Verbatim is formalised in Coq, the claim about
+the runtime complexity is only supported by emperical evidence.
+When we
+tried out their extracted OCaml code with our example
+\mbox{@{text "(a + aa)\<^sup>*"}}, it took around 5 minutes to tokenise a
+string of 40 $a$'s and that increased to approximately 19 minutes when the
+string was 50 $a$'s long. Given that derivatives are not simplified in
+the work on Verbatim, such numbers are not surprising.
 Clearly our result of having finite
-derivatives is rather weak in this context but we think such effeciency claims
+derivatives might sound rather weak in this context but we think such effeciency claims
-require further scrutiny.\medskip
+really require further scrutiny.\medskip
 \noindent
 Our Isabelle/HOL code is available under \url{https://github.com/urbanchr/posix}.

changeset 459	484403cf0c9d
parent 458	30c91ea7095b
child 460	6e269f557fc5