--- a/ChengsongTanPhdThesis/Chapters/Bitcoded1.tex Mon Aug 15 17:26:08 2022 +0200
+++ b/ChengsongTanPhdThesis/Chapters/Bitcoded1.tex Wed Aug 17 01:09:13 2022 +0100
@@ -11,11 +11,299 @@
In this chapter, we are going to describe the bit-coded algorithm
introduced by Sulzmann and Lu \parencite{Sulzmann2014} to address the growth problem of
regular expressions.
+We have implemented their algorithm in Scala, and found out inefficiencies
+in their algorithm such as de-duplication not working properly and redundant
+fixpoint construction. Their algorithm is improved and verified with the help of
+formal proofs.
\section{Bit-coded Algorithm}
-The lexer algorithm in Chapter \ref{Inj}, as shown in \ref{InjFigure},
+We first do a recap of what was going on
+in the lexer algorithm in Chapter \ref{Inj},
+\begin{center}
+\begin{tabular}{lcl}
+ $\lexer \; r \; [] $ & $=$ & $\textit{if} \; (\nullable \; r)\; \textit{then}\; \Some(\mkeps \; r) \; \textit{else} \; \None$\\
+ $\lexer \; r \;c::s$ & $=$ & $\textit{case}\; (\lexer \; (r\backslash c) \; s) \;\textit{of}\; $\\
+ & & $\quad \phantom{\mid}\; \None \implies \None$\\
+ & & $\quad \mid \Some(v) \implies \Some(\inj \; r\; c\; v)$
+\end{tabular}
+\end{center}
+\noindent
+The algorithm recursively calls $\lexer$ on
+each new character input,
+and before starting a child call
+it stores information of previous lexing steps
+on a stack, in the form of regular expressions
+and characters: $r_0$, $c_0$, $r_1$, $c_1$, etc.
+Each descent into deeper recursive calls in $\lexer$
+causes a new pair of $r_i, c_i$ to be pushed to the call stack.
+\begin{figure}
+\begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,thick]
+%\draw (-6,-6) grid (6,6);
+\node [ circle ] (r) at (-6, 5) {$r$};
+
+%\node (-4, 6) (c1) circle [radius = 0.3] {$c_1$};
+\node [circle, minimum size = 0.1, draw] (c1) at (-4, 5.4) {$c_1$};
+%
+%\node (-2, 5) (r1) circle [radius = 0.5] {$r_1$};
+\node [minimum size = 0.5, circle, draw] (r1) at (-2, 5) {$r_1$};
+
+\node [minimum width = 2cm, rectangle, draw] (stack) at (0, 3) {Stack};
+
+\path
+ (r)
+ edge [->, >=stealth',shorten >=1pt] node[left] {} (r1);
+
+\path (r1)
+ edge [bend right, dashed] node {saved} (stack);
+\path (c1)
+ edge [bend right, dashed] node {} (stack);
+
+
+\end{tikzpicture}
+\caption{First Derivative Taken}
+\end{figure}
+
+
+
+\begin{figure}
+\begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,thick]
+%\draw (-6,-6) grid (6,6);
+\node [ circle ] (r) at (-6, 5) {$r$};
+
+%\node (-4, 6) (c1) circle [radius = 0.3] {$c_1$};
+\node [circle, minimum size = 0.1, ] (c1) at (-4, 5.4) {$c_1$};
+%
+%\node (-2, 5) (r1) circle [radius = 0.5] {$r_1$};
+\node [minimum size = 0.5, circle, ] (r1) at (-2, 5) {$r_1$};
+
+
+\node [circle, minimum size = 0.1, draw] (c2) at (0, 5.4) {$c_2$};
+%
+%\node (2, 5) (r2) circle [radius = 0.5] {$r_2$};
+\node [circle, draw] (r2) at (2, 5) {$r_2$};
+\node [minimum width = 3cm, minimum height = 1cm, rectangle, draw] (stack) at (0, 2) {\large Stack};
+
+\path
+ (r)
+ edge [->, >=stealth',shorten >=1pt] node[left] {} (r1);
+
+\path (r2)
+ edge [bend right, dashed] node {} (stack);
+\path (c2)
+ edge [bend right, dashed] node {} (stack);
+
+\path (r1)
+ edge [] node {} (r2);
+
+\end{tikzpicture}
+\caption{Second Derivative Taken}
+\end{figure}
+\noindent
+As the number of derivative steps increase,
+the stack would increase:
+
+\begin{figure}
+\begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,thick]
+%\draw (-6,-6) grid (6,6);
+\node [ circle ] (r) at (-6, 5) {$r$};
+
+%\node (-4, 6) (c1) circle [radius = 0.3] {$c_1$};
+\node [circle, minimum size = 0.1, ] (c1) at (-4, 5.4) {$c_1$};
+%
+%\node (-2, 5) (r1) circle [radius = 0.5] {$r_1$};
+\node [minimum size = 0.5, circle, ] (r1) at (-2, 5) {$r_1$};
+
+
+\node [circle, minimum size = 0.1, ] (c2) at (0, 5.4) {$c_2$};
+%
+%\node (2, 5) (r2) circle [radius = 0.5] {$r_2$};
+\node [circle, ] (r2) at (2, 5) {$r_2$};
+\node [minimum width = 4cm, minimum height = 2.5cm, rectangle, draw] (stack) at (0, 1) { \large Stack};
+
+\node [] (ldots) at (3.5, 5) {};
+%\node (6, 5) (rn) circle [radius = 0.5] {$r_n$};
+
+\node [minimum size = 0.5, circle, ] (rn) at (6, 5) {};
+
+\node (rldots) at ($(ldots)!.4!(rn)$) {\ldots};
+
+\path
+ (r)
+ edge [->, >=stealth',shorten >=1pt] node[left] {} (r1);
+
+\path (rldots)
+ edge [bend left, dashed] node {} (stack);
+
+\path (r1)
+ edge [] node {} (r2);
+
+\path (r2)
+ edge [] node {} (ldots);
+\path (ldots)
+ edge [bend left, dashed] node {} (stack);
+\path (5.03, 4.9)
+ edge [bend left, dashed] node {} (stack);
+\end{tikzpicture}
+\caption{More Derivatives Taken}
+\end{figure}
+\noindent
+After all derivatives have been taken, the stack grows to a maximum size
+and the pair of regular expressions and characters $r_i, c_{i+1}$
+are then popped out and used in the injection phase:
+
+\begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,thick]
+%\draw (-6,-6) grid (6,6);
+\node [radius = 0.5, circle, draw] (r) at (-6, 5) {$r$};
+
+%\node (-4, 6) (c1) circle [radius = 0.3] {$c_1$};
+\node [circle, minimum size = 0.1, draw] (c1) at (-4, 5.4) {$c_1$};
+%
+%\node (-2, 5) (r1) circle [radius = 0.5] {$r_1$};
+\node [minimum size = 0.5, circle, draw] (r1) at (-2, 5) {$r_1$};
+%
+%\node (0, 6) (c2) circle [radius = 0.3] {$c_2$};
+\node [circle, minimum size = 0.1, draw] (c2) at (0, 5.4) {$c_2$};
+%
+%\node (2, 5) (r2) circle [radius = 0.5] {$r_2$};
+\node [circle, draw] (r2) at (2, 5) {$r_2$};
+%
+%
+\node [] (ldots) at (4.5, 5) {};
+%\node (6, 5) (rn) circle [radius = 0.5] {$r_n$};
+
+\node [minimum size = 0.5, circle, draw] (rn) at (6, 5) {$r_n$};
+
+\node at ($(ldots)!.4!(rn)$) {\ldots};
+
+\node [minimum size = 0.5, circle, draw] (vn) at (6, -5) {$v_n$};
+
+\node [] (ldots2) at (3.5, -5) {};
+
+\node [minimum size = 0.5, circle, draw] (v2) at (2, -5) {$v_2$};
+
+\node at ($(ldots2)!.4!(v2)$) {\ldots};
+
+
+\node [circle, draw] (v1) at (-2, -5) {$v_1$};
+
+\node [radius = 0.5, circle, draw] (v) at (-6, -5) {$v$};
+
+\node [minimum size = 6cm, rectangle, draw] (stack) at (0, 0) {\Huge Stack};
+
+\path
+ (r)
+ edge [->, >=stealth',shorten >=1pt] node[left] {} (r1);
+\path
+ (r1)
+ edge [] node {} (r2);
+\path (r2)
+ edge [] node {} (ldots);
+\path (rn)
+ edge [] node {$\mkeps$} (vn);
+\path (vn)
+ edge [] node {} (ldots2);
+\path (v2)
+ edge [] node {$\inj \; r_1 \; c_2\;v_2$} (v1);
+
+\path (v1)
+ edge [] node {$\inj \; r \; c_1 \; v_1$} (v);
+\path (r)
+ edge [dashed, bend right] node {} (stack);
+
+\path (r1)
+ edge [dashed, ] node {} (stack);
+
+\path (c1)
+ edge [dashed, bend right] node {} (stack);
+
+\path (c2)
+ edge [dashed] node {} (stack);
+\path (4.5, 5)
+ edge [dashed, bend left] node {} (stack);
+\path (4.9, 5)
+ edge [dashed, bend left] node {} (stack);
+\path (5.3, 5)
+ edge [dashed, bend left] node {} (stack);
+\path (r2)
+ edge [dashed, ] node {} (stack);
+\path (rn)
+ edge [dashed, bend left] node {} (stack);
+\end{tikzpicture}
+%\begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,thick]
+%%\draw (-6,-6) grid (6,6);
+%\node [radius = 0.5, circle, draw] (r) at (-6, 5) {$r$};
+%
+%%\node (-4, 6) (c1) circle [radius = 0.3] {$c_1$};
+%\node [circle, minimum size = 0.1] (c1) at (-4, 5.4) {$c_1$};
+%%
+%%\node (-2, 5) (r1) circle [radius = 0.5] {$r_1$};
+%\node [minimum size = 0.5, circle, draw] (r1) at (-2, 5) {$r_1$};
+%%
+%%\node (0, 6) (c2) circle [radius = 0.3] {$c_2$};
+%\node [circle, minimum size = 0.1] (c2) at (0, 5.4) {$c_2$};
+%%
+%%\node (2, 5) (r2) circle [radius = 0.5] {$r_2$};
+%\node [circle, draw] (r2) at (2, 5) {$r_2$};
+%%
+%%
+%\node [] (ldots) at (4.5, 5) {};
+%%\node (6, 5) (rn) circle [radius = 0.5] {$r_n$};
+%
+%\node [minimum size = 0.5, circle, draw] (rn) at (6, 5) {$r_n$};
+%
+%\node at ($(ldots)!.4!(rn)$) {\ldots};
+%
+%\node [minimum size = 0.5, circle, draw] (vn) at (6, -5) {$v_n$};
+%
+%\node [] (ldots2) at (3.5, -5) {};
+%
+%\node [minimum size = 0.5, circle, draw] (v2) at (2, -5) {$v_2$};
+%
+%\node at ($(ldots2)!.4!(v2)$) {\ldots};
+%
+%
+%\node [circle, draw] (v1) at (-2, -5) {$v_1$};
+%
+%\node [radius = 0.5, circle, draw] (v) at (-6, -5) {$v$};
+%
+%\node [minimum size = 6cm, rectangle, draw] (stack) at (0, 0) {\Huge Stack};
+%
+%\path
+% (r)
+% edge [->, >=stealth',shorten >=1pt] node[left] {} (r1);
+%\path
+% (r1)
+% edge [] node {} (r2);
+%\path (r2)
+% edge [] node {} (ldots);
+%\path (rn)
+% edge [] node {$\mkeps$} (vn);
+%\path (vn)
+% edge [] node {} (ldots2);
+%\path (v2)
+% edge [] node {} (v1);
+%
+%\path (v1)
+% edge [] node {} (v);
+%\path (r)
+% edge [] node {saved} (stack);
+%
+%\path (r1)
+% edge [] node {saved} (stack);
+%\end{tikzpicture}
+\noindent
+The information stored in characters and regular expressions
+make the algorithm work in an elegant way, at the expense of being
+storing quite a bit of verbose information.
+
+
+
+The lexer algorithm in Chapter \ref{Inj},
stores information of previous lexing steps
on a stack, in the form of regular expressions
and characters: $r_0$, $c_0$, $r_1$, $c_1$, etc.
+The red part represents what we already know during the first
+derivative phase,
+and the blue part represents the unknown part of input.
\begin{ceqn}
\begin{equation}%\label{graph:injLexer}
\begin{tikzcd}[ampersand replacement=\&, execute at end picture={
@@ -39,22 +327,19 @@
\end{equation}
\end{ceqn}
\noindent
-The red part represents what we already know during the first
-derivative phase,
-and the blue part represents the unknown part of input.
The red area expands as we move towards $r_n$,
-indicating an increasing stack size during lexing.
-Despite having some partial lexing information during
+indicating more is known about the lexing result.
+Despite knowing this partial lexing information during
the forward derivative phase, we choose to store them
-temporarily, only to convert the information to lexical
-values at a later stage. In essence we are repeating work we
-have already done.
-This is both inefficient and prone to stack overflow.
-A natural question arises as to whether we can store lexing
+all the way until $r_n$ is reached.
+Then we reconstruct the value character by character
+values at a later stage, using information in a Last-In-First-Out
+manner. Although the algorithm is elegant and natural,
+it can be inefficient and prone to stack overflow.\\
+It turns out we can store lexing
information on the fly, while still using regular expression
derivatives.
-
-If we remove the details of the individual
+If we remove the individual
lexing steps, and use red and blue areas as before
to indicate consumed (seen) input and constructed
partial value (before recovering the rest of the stack),