diff -r 6e269f557fc5 -r c4b6906068a9 thys2/Paper/Paper.thy --- a/thys2/Paper/Paper.thy Tue Mar 22 10:29:07 2022 +0000 +++ b/thys2/Paper/Paper.thy Tue Mar 22 11:08:43 2022 +0000 @@ -229,6 +229,7 @@ not match any string, @{const ONE} for the regular expression that matches only the empty string and @{term c} for matching a character literal. The constructors $+$ and $\cdot$ represent alternatives and sequences, respectively. + We sometimes omit the $\cdot$ in a sequence regular expression for brevity. The \emph{language} of a regular expression, written $L$, is defined as usual and we omit giving the definition here (see for example \cite{AusafDyckhoffUrban2016}). @@ -1049,7 +1050,7 @@ add @{text "f x"} as another ``seen'' element to @{text acc}. We will use @{term distinctBy} where @{text f} is the erase function, @{term "erase (DUMMY)"}, that deletes bitsequences from bitcoded regular expressions. - This is clearly a computationally more expensive operation, than @{text nub}, + This is clearly a computationally more expensive operation than @{text nub}, but is needed in order to make the removal of unnecessary copies to work properly. @@ -1308,9 +1309,9 @@ \end{center} \caption{The rewrite rules that generate simplified regular expressions in small steps: @{term "rrewrite r\<^sub>1 r\<^sub>2"} is for bitcoded regular - expressions and @{term "rrewrites rs\<^sub>1 rs\<^sub>2"} for \emph{lists} of bitcoded + expressions and @{term "srewrite rs\<^sub>1 rs\<^sub>2"} for \emph{lists} of bitcoded regular expressions. Interesting is the $LD$ rule that allows copies of regular - expressions be removed provided a regular expression earlier in the list can + expressions to be removed provided a regular expression earlier in the list can match the same strings.}\label{SimpRewrites} \end{figure} *} @@ -1422,12 +1423,13 @@ derivatives in some cases even after aggressive simplification, this is a hard to believe fact. A similar claim about a theoretical runtime of @{text "O(n\<^sup>2)"} is made for the Verbatim lexer, which calculates - tokens according to POSIX rules \cite{verbatim}. For this it uses Brzozowski's - derivatives . + tokens according to POSIX rules~\cite{verbatim}. For this it uses Brzozowski's + derivatives. They write: ``The results of our empirical tests [..] confirm that Verbatim has - $O(n^2)$ time complexity.'' \cite[Section~VII]{verbatim}. + @{text "O(n\<^sup>2)"} time complexity.'' \cite[Section~VII]{verbatim}. While their correctness proof for Verbatim is formalised in Coq, the claim about - the runtime complexity is only supported by some emperical evidence. + the runtime complexity is only supported by some emperical evidence obtained + by using the code extraction facilities of Coq. In the context of our observation with the ``growth problem'' of derivatives, we tried out their extracted OCaml code with the example