# HG changeset patch
# User Christian Urban
# Date 1647942024 0
# Node ID 484403cf0c9d1e8a202eeda06e252225c6d49e74
# Parent 30c91ea7095b64b19818d0a4f1cf2a145657cd1c
updated

diff -r 30c91ea7095b -r 484403cf0c9d thys2/Paper/Paper.thy
--- a/thys2/Paper/Paper.thy	Mon Mar 21 10:44:58 2022 +0000
+++ b/thys2/Paper/Paper.thy	Tue Mar 22 09:40:24 2022 +0000
@@ -1421,11 +1421,22 @@
 Given the growth of the derivatives in some cases even after aggressive
 simplification, this is a hard to believe fact. A similar claim about a theoretical runtime
-of @{text "O(n\<^sup>2)"} is made for the Verbatim lexer, which calculates POSIX matches and is based on
-derivatives \cite{verbatim}. In this case derivatives are not simplified.
+of @{text "O(n\<^sup>2)"} is made for the Verbatim lexer, which calculates
+POSIX matches and is based on
+derivatives \cite{verbatim}. In this case derivatives are not even simplified.
+They write: ``For a specific list of lexical rules, Verbatim has quadratic
+theoretical time complexity with respect to the length of the input string.''
+While their correctness proof for Verbatim is formalised in Coq, the claim about
+the runtime complexity is only supported by empirical evidence.
+When we
+tried out their extracted OCaml code with our example
+\mbox{@{text "(a + aa)\<^sup>*"}}, it took around 5 minutes to tokenise a
+string of 40 $a$'s and that increased to approximately 19 minutes when the
+string was 50 $a$'s long. Given that derivatives are not simplified in
+the work on Verbatim, such numbers are not surprising.
 Clearly our result of having finite
-derivatives is rather weak in this context but we think such effeciency claims
-require further scrutiny.\medskip
+derivatives might sound rather weak in this context but we think such efficiency claims
+really require further scrutiny.\medskip

 \noindent Our Isabelle/HOL code is available under
 \url{https://github.com/urbanchr/posix}.
diff -r 30c91ea7095b -r 484403cf0c9d thys2/paper.pdf
Binary file thys2/paper.pdf has changed
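The added paragraph argues that the slow timings are unsurprising because Verbatim does not simplify derivatives. The following is a minimal, self-contained sketch (in Python, not the authors' extracted OCaml code) of plain Brzozowski derivatives without any simplification; the tuple encoding of regular expressions and the `size` measure are illustrative choices, not taken from either paper. It shows how the derivative of \mbox{@{text "(a + aa)\<^sup>*"}} grows strictly with each consumed character, which is the blow-up the paragraph alludes to.

```python
# Sketch: Brzozowski derivatives WITHOUT simplification, to illustrate
# why repeatedly deriving (a + aa)* by 'a' makes the regex grow.
# Encoding (illustrative, not from the paper):
#   ('zero',) | ('one',) | ('chr', c) | ('alt', r, s) | ('seq', r, s) | ('star', r)

def nullable(r):
    tag = r[0]
    if tag == 'zero': return False
    if tag == 'one':  return True
    if tag == 'chr':  return False
    if tag == 'alt':  return nullable(r[1]) or nullable(r[2])
    if tag == 'seq':  return nullable(r[1]) and nullable(r[2])
    if tag == 'star': return True

def deriv(c, r):
    tag = r[0]
    if tag in ('zero', 'one'):
        return ('zero',)
    if tag == 'chr':
        return ('one',) if r[1] == c else ('zero',)
    if tag == 'alt':
        return ('alt', deriv(c, r[1]), deriv(c, r[2]))
    if tag == 'seq':
        left = ('seq', deriv(c, r[1]), r[2])
        # the nullable case duplicates structure -- a source of growth
        return ('alt', left, deriv(c, r[2])) if nullable(r[1]) else left
    if tag == 'star':
        return ('seq', deriv(c, r[1]), r)

def size(r):
    # number of constructors in the regex tree
    return 1 + sum(size(x) for x in r[1:] if isinstance(x, tuple))

# (a + aa)*
a  = ('chr', 'a')
re = ('star', ('alt', a, ('seq', a, a)))

r, sizes = re, []
for _ in range(12):          # consume twelve a's
    r = deriv('a', r)
    sizes.append(size(r))
print(sizes)                 # sizes grow monotonically without simplification
```

With simplification rules (e.g. @{text "r + 0 = r"}, @{text "1 · r = r"}) the derivatives would collapse back to a small set, which is why the finiteness result in the patched paragraph depends on aggressive simplification.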