--- a/ChengsongTanPhdThesis/Chapters/Bitcoded1.tex	Mon Aug 15 17:26:08 2022 +0200
+++ b/ChengsongTanPhdThesis/Chapters/Bitcoded1.tex	Wed Aug 17 01:09:13 2022 +0100
@@ -11,11 +11,299 @@
 In this chapter, we are going to describe the bit-coded algorithm
 introduced by Sulzmann and Lu \parencite{Sulzmann2014} to address the growth problem of 
 regular expressions. 
+We have implemented their algorithm in Scala, and found out inefficiencies
+in their algorithm such as de-duplication not working properly and redundant
+fixpoint construction. Their algorithm is improved and verified with the help of
+formal proofs.
 \section{Bit-coded Algorithm}
-The lexer algorithm in Chapter \ref{Inj}, as shown in \ref{InjFigure},
+We first do a recap of what was going on 
+in the lexer algorithm in Chapter \ref{Inj},
+\begin{center}
+\begin{tabular}{lcl}
+	$\lexer \; r \; [] $ & $=$ & $\textit{if} \; (\nullable \; r)\; \textit{then}\;  \Some(\mkeps \; r) \; \textit{else} \; \None$\\
+	$\lexer \; r \;c::s$ & $=$ & $\textit{case}\; (\lexer \; (r\backslash c) \; s) \;\textit{of}\; $\\
+	& & $\quad \phantom{\mid}\; \None \implies \None$\\
+	& & $\quad \mid           \Some(v) \implies \Some(\inj \; r\; c\; v)$
+\end{tabular}
+\end{center}
+\noindent
+The algorithm recursively calls $\lexer$ on 
+each new character input,
+and before starting a child call
+it stores information of previous lexing steps
+on a stack, in the form of regular expressions
+and characters: $r_0$, $c_0$, $r_1$, $c_1$, etc.
+Each descent into deeper recursive calls in $\lexer$
+causes a new pair of $r_i, c_i$ to be pushed to the call stack.
+\begin{figure}
+\begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,thick] 
+%\draw (-6,-6) grid (6,6);
+\node  [ circle ] (r) at (-6, 5) {$r$};
+
+%\node (-4, 6) (c1) circle [radius = 0.3] {$c_1$};
+\node  [circle, minimum size = 0.1, draw] (c1) at (-4, 5.4) {$c_1$};
+%
+%\node (-2, 5) (r1) circle [radius = 0.5] {$r_1$};
+\node  [minimum size = 0.5, circle, draw] (r1) at (-2, 5) {$r_1$};
+
+\node [minimum width = 2cm, rectangle, draw] (stack) at (0, 3) {Stack};
+
+\path
+	(r)
+        edge [->, >=stealth',shorten >=1pt] node[left] {} (r1);
+
+\path   (r1)
+	edge [bend right, dashed] node {saved} (stack);
+\path   (c1)
+	edge [bend right, dashed] node {} (stack);
+
+
+\end{tikzpicture}
+\caption{First Derivative Taken}
+\end{figure}
+
+
+
+\begin{figure}
+\begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,thick] 
+%\draw (-6,-6) grid (6,6);
+\node  [ circle ] (r) at (-6, 5) {$r$};
+
+%\node (-4, 6) (c1) circle [radius = 0.3] {$c_1$};
+\node  [circle, minimum size = 0.1, ] (c1) at (-4, 5.4) {$c_1$};
+%
+%\node (-2, 5) (r1) circle [radius = 0.5] {$r_1$};
+\node  [minimum size = 0.5, circle, ] (r1) at (-2, 5) {$r_1$};
+
+
+\node [circle, minimum size = 0.1, draw] (c2) at (0, 5.4) {$c_2$};
+%
+%\node (2, 5) (r2) circle [radius = 0.5] {$r_2$};
+\node [circle, draw] (r2) at (2, 5) {$r_2$};
+\node [minimum width = 3cm, minimum height = 1cm, rectangle, draw] (stack) at (0, 2) {\large Stack};
+
+\path
+	(r)
+        edge [->, >=stealth',shorten >=1pt] node[left] {} (r1);
+
+\path   (r2)
+	edge [bend right, dashed] node {} (stack);
+\path   (c2)
+	edge [bend right, dashed] node {} (stack);
+
+\path   (r1)
+	edge [] node {} (r2);
+
+\end{tikzpicture}
+\caption{Second Derivative Taken}
+\end{figure}
+\noindent
+As the number of derivative steps increase,
+the stack would increase:
+
+\begin{figure}
+\begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,thick] 
+%\draw (-6,-6) grid (6,6);
+\node  [ circle ] (r) at (-6, 5) {$r$};
+
+%\node (-4, 6) (c1) circle [radius = 0.3] {$c_1$};
+\node  [circle, minimum size = 0.1, ] (c1) at (-4, 5.4) {$c_1$};
+%
+%\node (-2, 5) (r1) circle [radius = 0.5] {$r_1$};
+\node  [minimum size = 0.5, circle, ] (r1) at (-2, 5) {$r_1$};
+
+
+\node [circle, minimum size = 0.1, ] (c2) at (0, 5.4) {$c_2$};
+%
+%\node (2, 5) (r2) circle [radius = 0.5] {$r_2$};
+\node [circle, ] (r2) at (2, 5) {$r_2$};
+\node [minimum width = 4cm, minimum height = 2.5cm, rectangle, draw] (stack) at (0, 1) { \large Stack};
+
+\node [] (ldots) at (3.5, 5) {};
+%\node (6, 5) (rn) circle [radius = 0.5] {$r_n$};
+
+\node [minimum size = 0.5, circle, ] (rn) at (6, 5) {};
+
+\node (rldots) at ($(ldots)!.4!(rn)$) {\ldots};
+
+\path
+	(r)
+        edge [->, >=stealth',shorten >=1pt] node[left] {} (r1);
+
+\path   (rldots)
+	edge [bend left, dashed] node {} (stack);
+
+\path   (r1)
+	edge [] node {} (r2);
+
+\path   (r2)
+	edge [] node {} (ldots);
+\path   (ldots)
+	edge [bend left, dashed] node {} (stack);
+\path   (5.03, 4.9)
+	edge [bend left, dashed] node {} (stack);
+\end{tikzpicture}
+\caption{More Derivatives Taken}
+\end{figure}
+\noindent
+After all derivatives have been taken, the stack grows to a maximum size
+and the pair of regular expressions and characters $r_i, c_{i+1}$ 
+are then popped out and used in the injection phase:
+
+\begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,thick] 
+%\draw (-6,-6) grid (6,6);
+\node  [radius = 0.5, circle, draw] (r) at (-6, 5) {$r$};
+
+%\node (-4, 6) (c1) circle [radius = 0.3] {$c_1$};
+\node  [circle, minimum size = 0.1, draw] (c1) at (-4, 5.4) {$c_1$};
+%
+%\node (-2, 5) (r1) circle [radius = 0.5] {$r_1$};
+\node  [minimum size = 0.5, circle, draw] (r1) at (-2, 5) {$r_1$};
+%
+%\node (0, 6)  (c2) circle [radius = 0.3] {$c_2$};
+\node [circle, minimum size = 0.1, draw] (c2) at (0, 5.4) {$c_2$};
+%
+%\node (2, 5) (r2) circle [radius = 0.5] {$r_2$};
+\node [circle, draw] (r2) at (2, 5) {$r_2$};
+%
+%
+\node [] (ldots) at (4.5, 5) {};
+%\node (6, 5) (rn) circle [radius = 0.5] {$r_n$};
+
+\node [minimum size = 0.5, circle, draw] (rn) at (6, 5) {$r_n$};
+
+\node at ($(ldots)!.4!(rn)$) {\ldots};
+
+\node [minimum size = 0.5, circle, draw] (vn) at (6, -5) {$v_n$};
+
+\node [] (ldots2) at (3.5, -5) {};
+
+\node  [minimum size = 0.5, circle, draw] (v2) at (2, -5) {$v_2$};
+
+\node at ($(ldots2)!.4!(v2)$) {\ldots};
+
+
+\node [circle, draw] (v1) at (-2, -5) {$v_1$};
+
+\node  [radius = 0.5, circle, draw] (v) at (-6, -5) {$v$};
+
+\node [minimum size = 6cm, rectangle, draw] (stack) at (0, 0) {\Huge Stack};
+
+\path
+	(r)
+        edge [->, >=stealth',shorten >=1pt] node[left] {} (r1);
+\path
+	(r1)
+        edge [] node {} (r2);
+\path   (r2)
+	edge [] node {} (ldots);
+\path   (rn)
+	edge [] node {$\mkeps$} (vn);
+\path   (vn) 
+	edge [] node {} (ldots2);
+\path   (v2)
+	edge [] node {$\inj \; r_1 \; c_2\;v_2$} (v1);
+
+\path   (v1)
+	edge [] node {$\inj \; r \; c_1 \; v_1$} (v);
+\path   (r)
+	edge [dashed, bend right] node {} (stack);
+
+\path   (r1)
+	edge [dashed, ] node {} (stack);
+
+\path   (c1)
+	edge [dashed, bend right] node {} (stack);
+
+\path   (c2)
+	edge [dashed] node {} (stack);
+\path   (4.5, 5)
+	edge [dashed, bend left] node {} (stack);
+\path   (4.9, 5)
+	edge [dashed, bend left] node {} (stack);
+\path   (5.3, 5)
+	edge [dashed, bend left] node {} (stack);
+\path (r2)
+	edge [dashed, ] node {} (stack);
+\path (rn)
+	edge [dashed, bend left] node {} (stack);
+\end{tikzpicture}
+%\begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,thick] 
+%%\draw (-6,-6) grid (6,6);
+%\node  [radius = 0.5, circle, draw] (r) at (-6, 5) {$r$};
+%
+%%\node (-4, 6) (c1) circle [radius = 0.3] {$c_1$};
+%\node  [circle, minimum size = 0.1] (c1) at (-4, 5.4) {$c_1$};
+%%
+%%\node (-2, 5) (r1) circle [radius = 0.5] {$r_1$};
+%\node  [minimum size = 0.5, circle, draw] (r1) at (-2, 5) {$r_1$};
+%%
+%%\node (0, 6)  (c2) circle [radius = 0.3] {$c_2$};
+%\node [circle, minimum size = 0.1] (c2) at (0, 5.4) {$c_2$};
+%%
+%%\node (2, 5) (r2) circle [radius = 0.5] {$r_2$};
+%\node [circle, draw] (r2) at (2, 5) {$r_2$};
+%%
+%%
+%\node [] (ldots) at (4.5, 5) {};
+%%\node (6, 5) (rn) circle [radius = 0.5] {$r_n$};
+%
+%\node [minimum size = 0.5, circle, draw] (rn) at (6, 5) {$r_n$};
+%
+%\node at ($(ldots)!.4!(rn)$) {\ldots};
+%
+%\node [minimum size = 0.5, circle, draw] (vn) at (6, -5) {$v_n$};
+%
+%\node [] (ldots2) at (3.5, -5) {};
+%
+%\node  [minimum size = 0.5, circle, draw] (v2) at (2, -5) {$v_2$};
+%
+%\node at ($(ldots2)!.4!(v2)$) {\ldots};
+%
+%
+%\node [circle, draw] (v1) at (-2, -5) {$v_1$};
+%
+%\node  [radius = 0.5, circle, draw] (v) at (-6, -5) {$v$};
+%
+%\node [minimum size = 6cm, rectangle, draw] (stack) at (0, 0) {\Huge Stack};
+%
+%\path
+%	(r)
+%        edge [->, >=stealth',shorten >=1pt] node[left] {} (r1);
+%\path
+%	(r1)
+%        edge [] node {} (r2);
+%\path   (r2)
+%	edge [] node {} (ldots);
+%\path   (rn)
+%	edge [] node {$\mkeps$} (vn);
+%\path   (vn) 
+%	edge [] node {} (ldots2);
+%\path   (v2)
+%	edge [] node {} (v1);
+%
+%\path   (v1)
+%	edge [] node {} (v);
+%\path   (r)
+%	edge [] node {saved} (stack);
+%
+%\path   (r1)
+%	edge [] node {saved} (stack);
+%\end{tikzpicture}
+\noindent
+The information stored in characters and regular expressions
+make the algorithm work in an elegant way, at the expense of being
+storing quite a bit of verbose information.
+
+
+
+The lexer algorithm in Chapter \ref{Inj},
 stores information of previous lexing steps
 on a stack, in the form of regular expressions
 and characters: $r_0$, $c_0$, $r_1$, $c_1$, etc.
+The red part represents what we already know during the first
+derivative phase,
+and the blue part represents the unknown part of input.
 \begin{ceqn}
 \begin{equation}%\label{graph:injLexer}
 	\begin{tikzcd}[ampersand replacement=\&, execute at end picture={
@@ -39,22 +327,19 @@
 \end{equation}
 \end{ceqn}
 \noindent
-The red part represents what we already know during the first
-derivative phase,
-and the blue part represents the unknown part of input.
 The red area expands as we move towards $r_n$, 
-indicating an increasing stack size during lexing.
-Despite having some partial lexing information during
+indicating more is known about the lexing result.
+Despite knowing this partial lexing information during
 the forward derivative phase, we choose to store them
-temporarily, only to convert the information to lexical
-values at a later stage. In essence we are repeating work we
-have already done.
-This is both inefficient and prone to stack overflow.
-A natural question arises as to whether we can store lexing
+all the way until $r_n$ is reached.
+Then we reconstruct the value character by character
+values at a later stage, using information in a Last-In-First-Out
+manner. Although the algorithm is elegant and natural,
+it can be inefficient and prone to stack overflow.\\
+It turns out we can store lexing
 information on the fly, while still using regular expression 
 derivatives.
-
-If we remove the details of the individual 
+If we remove the individual 
 lexing steps, and use red and blue areas as before
 to indicate consumed (seen) input and constructed
 partial value (before recovering the rest of the stack),
changeset 579	35df9cdd36ca
parent 576	3e1b699696b6
child 580	e0f0a81f907b