1420 \cite[Page 14]{Sulzmann2014}. |
1420 \cite[Page 14]{Sulzmann2014}. |
1421 Given the growth of the |
1421 Given the growth of the |
1422 derivatives in some cases even after aggressive simplification, this |
1422 derivatives in some cases even after aggressive simplification, this |
1423 is a hard to believe fact. A similar claim about a theoretical runtime |
1423 is a hard to believe fact. A similar claim about a theoretical runtime |
1424 of @{text "O(n\<^sup>2)"} is made for the Verbatim lexer, which calculates |
1424 of @{text "O(n\<^sup>2)"} is made for the Verbatim lexer, which calculates |
1425 POSIX matches and is based on |
1425 tokens according to POSIX rules \cite{verbatim}. For this it uses Brzozowski's |
1426 derivatives \cite{verbatim}. In this case derivatives are not even simplified. |
1426 derivatives . |
1427 They write: ``For a specific list of lexical rules, Verbatim has quadratic |
1427 They write: ``The results of our empirical tests [..] confirm that Verbatim has |
1428 theoretical time complexity with respect to the length of the input string.'' |
1428 $O(n^2)$ time complexity.'' \cite[Section~VII]{verbatim}. |
1429 While their correctness proof for Verbatim is formalised in Coq, the claim about |
1429 While their correctness proof for Verbatim is formalised in Coq, the claim about |
1430 the runtime complexity is only supported by emperical evidence. |
1430 the runtime complexity is only supported by some emperical evidence. |
1431 When we |
1431 In the context of our observation with the ``growth problem'' of derivatives, |
1432 tried out their extracted OCaml code with our example |
1432 we |
1433 \mbox{@{text "(a + aa)\<^sup>*"}}, it took around 5 minutes to tokenise a |
1433 tried out their extracted OCaml code with the example |
|
1434 \mbox{@{text "(a + aa)\<^sup>*"}} as a single lexing rule, and it took for us around 5 minutes to tokenise a |
1434 string of 40 $a$'s and that increased to approximately 19 minutes when the |
1435 string of 40 $a$'s and that increased to approximately 19 minutes when the |
1435 string was 50 $a$'s long. Given that derivatives are not simplified in |
1436 string is 50 $a$'s long. Given that derivatives are not simplified in the Verbatim |
1436 the work on Verbatim, such numbers are not surprising. |
1437 lexer, such numbers are not surprising. |
1437 Clearly our result of having finite |
1438 Clearly our result of having finite |
1438 derivatives might sound rather weak in this context but we think such effeciency claims |
1439 derivatives might sound rather weak in this context but we think such effeciency claims |
1439 really require further scrutiny.\medskip |
1440 really require further scrutiny.\medskip |
1440 |
1441 |
1441 \noindent |
1442 \noindent |