--- a/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex Tue Aug 23 14:21:13 2022 +0100
+++ b/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex Tue Aug 23 22:59:49 2022 +0100
@@ -99,6 +99,39 @@
\end{tabular}
\end{center}
\noindent
+They suggested that the $\simp$ function should be
+applied repeatedly until a fixpoint is reached.
+We implemented the algorithm as required in Scala,
+and found that final derivative regular expression
+size grows exponentially fast:
+\begin{center}
+\begin{figure}[H]
+\begin{tikzpicture}
+\begin{axis}[
+ xlabel={$n$},
+ ylabel={time in secs},
+ ymode = log,
+ legend entries={Sulzmann and Lu's $\simp$},
+ legend pos=north west,
+ legend cell align=left]
+\addplot[red,mark=*, mark options={fill=white}] table {SulzmannLuLexer.data};
+\end{axis}
+\end{tikzpicture}
+\caption{Matching the regular expression $(a^*a^*)^*$ against strings of the form
+$\protect\underbrace{aa\ldots a}_\text{n \textit{a}s}
+$ using Sulzmann and Lu's lexer algorithm}\label{SulzmannLuLexer}
+\end{figure}
+\end{center}
+\noindent
+At $n= 20$ we already get an out of memory error with Scala's normal
+JVM heap size settings,
+which seems a counterexample for their linear complexity claim:
+\begin{quote}\it
+Linear-Time Complexity Claim It is easy to see that each call of one of the functions/operations\b,simp,fuse,mkEpsBCandisPhi leadstosubcallswhose number is bound by the size of the regular expression involved. We claim that thanks to aggressively applying simp this size remains finite. Hence, we can argue that the above mentioned functions/operations have constant time complexity which implies that we can incrementally compute bit-coded parse trees in linear time in the size of the input. We yet need to work out detailed estimates regarding the space complexity of our algorithm.
+\end{quote}
+
+
+
Despite that we have already implemented the simple-minded simplifications
such as throwing away useless $\ONE$s and $\ZERO$s.
The simplification rule $r + r \rightarrow $ cannot make a difference either
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/ChengsongTanPhdThesis/SulzmannLuLexer.data Tue Aug 23 22:59:49 2022 +0100
@@ -0,0 +1,19 @@
+1 15
+2 34
+3 72
+4 148
+5 300
+6 604
+7 1212
+8 2428
+9 4860
+10 9724
+11 19452
+12 38908
+13 77820
+14 155644
+15 311292
+16 622588
+17 1245180
+18 2490364
+19 4980732
\ No newline at end of file