diff -r 692911c0b981 -r 3178f0e948ac ChengsongTanPhdThesis/Chapters/Bitcoded1.tex
--- a/ChengsongTanPhdThesis/Chapters/Bitcoded1.tex	Thu Jul 21 20:22:35 2022 +0100
+++ b/ChengsongTanPhdThesis/Chapters/Bitcoded1.tex	Tue Aug 02 22:11:22 2022 +0100
@@ -178,11 +178,6 @@
 
 Because of the lossiness, the process of decoding a bitlist requires additionally 
 a regular expression. The function $\decode$ is defined as:
-We define the reverse operation of $\code$, which is $\decode$.
-As expected, $\decode$ not only requires the bit-codes,
-but also a regular expression to guide the decoding and 
-fill the gaps of characters:
-
 
 %\begin{definition}[Bitdecoding of Values]\mbox{}
 \begin{center}
@@ -214,27 +209,35 @@
 %\end{definition}
 
 \noindent
-The function $\decode'$ returns a pair consisting of a partially decoded value and some leftover:
-$\decode'$ does most of the job while $\decode$ throws
-away leftover bit-codes and returns the value only.
+The function $\decode'$ returns a pair consisting of 
+a partially decoded value and some leftover bit list that cannot
+be decide yet.
+The function $\decode'$ succeeds if the left-over 
+bit-sequence is empty.
 $\decode$ is terminating as $\decode'$ is terminating.
-We have the property that $\decode$ and $\code$ are
+$\decode'$ is terminating 
+because at least one of $\decode'$'s parameters will go down in terms
+of size.
+Assuming we have a value $v$ and regular expression $r$
+with $\vdash v:r$,
+then we have the property that $\decode$ and $\code$ are
 reverse operations of one another:
 \begin{lemma}
-\[\vdash v : r \implies \decode \; (\code \; v) \; r = \textit{Some}(v) \]
+\[If \vdash v : r \; then \;\decode \; (\code \; v) \; r = \textit{Some}(v) \]
 \end{lemma}
 \begin{proof}
 By proving a more general version of the lemma, on $\decode'$:
 \[\vdash v : r \implies \decode' \; ((\code \; v) @ ds) \; r = (v, ds) \]
-Then setting $ds$ to be $[]$ and unfolding $\decode$ definition
-we get the lemma.
+Then setting $ds$ to be $[]$ and unfolding $\decode$ definition,
+we obtain the property.
 \end{proof}
 With the $\code$ and $\decode$ functions in hand, we know how to 
-switch between bit-codes and value--the two different representations of 
-lexing information. 
-The next step is to integrate this information into the working regular expression.
+switch between bit-codes and values. 
+The next step is to integrate this information into regular expression.
 Attaching bits to the front of regular expressions is the solution Sulzamann and Lu
-gave for storing partial values on the fly:
+gave for storing partial values in regular expressions. 
+Annotated regular expressions are therefore defined as the Isabelle
+datatype:
 
 \begin{center}
 \begin{tabular}{lcl}
@@ -249,16 +252,18 @@
 %(in \textit{ALTS})
 
 \noindent
-We call these regular expressions carrying bit-codes \emph{Annotated regular expressions}.
-$bs$ stands for bit-codes, $a$  for $\mathbf{a}$nnotated regular
-expressions and $as$ for a list of annotated regular expressions.
-The alternative constructor ($\sum$) has been generalised to 
-accept a list of annotated regular expressions rather than just 2.
+where $bs$ stands for bit-codes, $a$  for $\mathbf{a}$nnotated regular
+expressions and $as$ for lists of annotated regular expressions.
+The alternative constructor, written, $\sum$, has been generalised to 
+accept a list of annotated regular expressions rather than just two.
+Why is it generalised? This is because when we open up nested 
+alternatives, there could be more than two elements at the same level
+after de-duplication, which can no longer be stored in a binary
+constructor.
 
-
-The first thing we define related to bit-coded regular expressions
-is how we move bits, for instance pasting it at the front of an annotated regular expression.
-The operation $\fuse$ is just to attach bit-codes 
+The first operation we define related to bit-coded regular expressions
+is how we move bits to the inside of regular expressions.
+Called $\fuse$, this operation is attaches bit-codes 
 to the front of an annotated regular expression:
 \begin{center}
 \begin{tabular}{lcl}
@@ -277,14 +282,12 @@
 \end{center} 
 
 \noindent
-With that we are able to define $\internalise$.
+With \emph{fuse} we are able to define the $\internalise$ function
+that translates a ``standard'' regular expression into an
+annotated regular expression.
+This function will be applied before we start
+with the derivative phase of the algorithm.
 
-To do lexing using annotated regular expressions, we shall first
-transform the usual (un-annotated) regular expressions into annotated
-regular expressions. This operation is called \emph{internalisation} and
-defined as follows:
-
-%\begin{definition}
 \begin{center}
 \begin{tabular}{lcl}
   $(\ZERO)^\uparrow$ & $\dn$ & $\ZERO$\\
@@ -303,7 +306,7 @@
 
 \noindent
 We use an up arrow with postfix notation
-to denote the operation, 
+to denote this operation.
 for convenience. The $\textit{internalise} \; r$
 notation is more cumbersome.
 The opposite of $\textit{internalise}$ is
@@ -312,16 +315,17 @@
 regular expressions is transformed to the binary alternatives
 for plain regular expressions.
 \begin{center}
-	\begin{tabular}{lcr}
-		$\erase \; \ZERO$ & $\dn$ & $\ZERO$\\
-		$\erase \; _{bs}\ONE$ & $\dn$ & $\ONE$\\
-		$\erase \; _{bs}\mathbf{c}$ & $\dn$ & $\mathbf{c}$\\
-		$\erase \; _{bs} a_1 \cdot a_2$ & $\dn$ & $\erase \; r_1\cdot\erase\; r_2$\\
-		$\erase \; _{bs} [] $ & $\dn$ & $\ZERO$\\
-		$\erase \; _{bs} [a] $ & $\dn$ & $\erase \; a$\\
-		$\erase \; _{bs} \sum [a_1, \; a_2]$ & $\dn$ & $\erase \; a_1 +\erase \; a_2$\\
-		$\erase \; _{bs} \sum (a :: as)$ & $\dn$ & $\erase \; a + \erase \; _{[]} \sum as$\\
-		$\erase \; _{bs} a^*$ & $\dn$ & $(\erase \; a)^*$
+	\begin{tabular}{lcl}
+		$\ZERO_\downarrow$ & $\dn$ & $\ZERO$\\
+		$( _{bs}\ONE )_\downarrow$ & $\dn$ & $\ONE$\\
+		$( _{bs}\mathbf{c} )_\downarrow$ & $\dn$ & $\mathbf{c}$\\
+		$( _{bs} a_1 \cdot a_2 )_\downarrow$ & $\dn$ & 
+		$ (a_1) _\downarrow \cdot  (a_2) _\downarrow$\\
+		$( _{bs} [])_\downarrow $ & $\dn$ & $\ZERO $\\
+		$( _{bs} [a]  )_\downarrow$ & $\dn$ & $a_\downarrow$\\
+		$_{bs} \sum [a_1, \; a_2]$ & $\dn$ & $ (a_1) _\downarrow + ( a_2 ) _\downarrow $\\
+		$(_{bs} \sum (a :: as))_\downarrow$ & $\dn$ & $ a_\downarrow + \; (_{[]} \sum as)_\downarrow$\\
+		$( _{bs} a^* )_\downarrow$ & $\dn$ & $(a_\downarrow)^*$
 	\end{tabular}
 \end{center}
 \noindent
@@ -331,13 +335,12 @@
 testing whether they contain empty string in their lauguage requires
 a dedicated function $\bnullable$
 which simply calls $\erase$ first before testing whether it is $\nullable$.
-\begin{center}
-	\begin{tabular}{lcr}
-		$\bnullable \; a $ & $\dn$ & $\nullable \; (a_\downarrow)$
-	\end{tabular}
-\end{center}
+\begin{definition}
+		$\bnullable \; a \dn  \nullable \; (a_\downarrow)$
+\end{definition}
 The function for collecting the
-bitcodes of a $\bnullable$ annotated regular expression
+bitcodes at the end of the derivative 
+phase from a (b)nullable regular expression
 is a generalised version of the $\textit{mkeps}$ function
 for annotated regular expressions, called $\textit{bmkeps}$:
 
@@ -359,74 +362,105 @@
 %\end{definition}
 
 \noindent
-This function completes the value information by travelling along the
+$\bmkeps$ completes the value information by travelling along the
 path of the regular expression that corresponds to a POSIX value and
-collecting all the bitcodes, and using $S$ to indicate the end of star
+collecting all the bitcodes, and attaching $S$ to indicate the end of star
 iterations. 
 
-The most central question is how these partial lexing information
-represented as bit-codes is augmented and carried around 
-during a derivative is taken.
-This is done by adding bitcodes to the 
-derivatives, for example when one more star iteratoin is taken (we
-call the operation of derivatives on annotated regular expressions $\bder$
-because it is derivatives on regular expressiones with \emph{b}itcodes),
+Now we give out the central part of this lexing algorithm,
+the $\bder$ function (stands for \emph{b}itcoded-derivative).
+For most time we use the infix notation $(\_\backslash\_)$ 
+to mean $\bder$ for brevity when
+there is no danger of confusion with derivatives on plain regular expressions.
+For example, we write $( _{[]}r^* ) \backslash c$ instead of $\bder \;c \; _{[]}r^*$,
+as the bitcodes at the front of $r^*$ indicates that it is 
+a bit-coded regular expression, not a plain one.
+$\bder$ tells us how regular expressions can be recursively traversed,
+where the bitcodes are augmented and carried around 
+when a derivative is taken.
+\begin{center}
+  \begin{tabular}{@{}lcl@{}}
+  $(\ZERO)\,\backslash c$ & $\dn$ & $\ZERO$\\  
+  $(_{bs}\ONE)\,\backslash c$ & $\dn$ & $\ZERO$\\  
+  $(_{bs}{\bf d})\,\backslash c$ & $\dn$ &
+        $\textit{if}\;c=d\; \;\textit{then}\;
+         _{bs}\ONE\;\textit{else}\;\ZERO$\\  
+  $(_{bs}\sum \;\textit{as})\,\backslash c$ & $\dn$ &
+  $_{bs}\sum\;(\textit{map} \; (\_\backslash c) \; as )$\\
+  $(_{bs}\;a_1\cdot a_2)\,\backslash c$ & $\dn$ &
+     $\textit{if}\;\textit{bnullable}\,a_1$\\
+					       & &$\textit{then}\;_{bs}\sum\,[(_{[]}\,(a_1\,\backslash c)\cdot\,a_2),$\\
+					       & &$\phantom{\textit{then},\;_{bs}\sum\,}(\textit{fuse}\,(\textit{bmkeps}\,a_1)\,(a_2\,\backslash c))]$\\
+  & &$\textit{else}\;_{bs}\,(a_1\,\backslash c)\cdot a_2$\\
+  $(_{bs}a^*)\,\backslash c$ & $\dn$ &
+      $_{bs}(\textit{fuse}\, [Z] \; r\,\backslash c)\cdot
+       (_{[]}r^*))$
+\end{tabular}    
+\end{center}    
+\noindent
+We give the intuition behind some of the more involved cases in 
+$\bder$. For example,
+in the \emph{star} case,
+a derivative on $_{bs}a^*$ means 
+that one more star iteratoin needs to be taken.
 we need to unfold it into a sequence,
 and attach an additional bit $Z$ to the front of $r \backslash c$
-to indicate one more star iteration. 
-\begin{center}
-  \begin{tabular}{@{}lcl@{}}
-  $\bder \; c\; (_{bs}a^*) $ & $\dn$ &
-      $_{bs}(\textit{fuse}\, [Z] \; \bder \; c \; a)\cdot
-       (_{[]}a^*))$
-\end{tabular}    
-\end{center}    
+as a record to indicate one new star iteration is unfolded.
 
 \noindent
-For most time we use the infix notation $\backslash$ to mean $\bder$ for brevity when
-there is no danger of confusion with derivatives on plain regular expressions, 
-for example, the above can be expressed as
 \begin{center}
   \begin{tabular}{@{}lcl@{}}
   $(_{bs}a^*)\,\backslash c$ & $\dn$ &
-      $_{bs}(\textit{fuse}\, [Z] \; a\,\backslash c)\cdot
+  $_{bs}(\underbrace{\textit{fuse}\, [Z] \; a\,\backslash c}_{\text{One more iteration}})\cdot
        (_{[]}a^*))$
 \end{tabular}    
 \end{center}   
 
 \noindent
-Using the picture we used earlier to depict this, the transformation when 
-taking a derivative w.r.t a star is like below:
+This information will be recovered later by the $\decode$ function.
+The intuition is that the bit $Z$ will be decoded at the right location,
+because we accumulate bits from left to right (a rigorous proof will be given
+later).
 
-\begin{tabular}{@{}l@{\hspace{1mm}}l@{\hspace{0mm}}c@{}}
-\begin{tikzpicture}[every node/.append style={draw, rounded corners, inner sep=10pt}]
-    \node [rectangle split, rectangle split horizontal, rectangle split parts=2, rectangle split part fill={red!30,blue!20},]
+\begin{tikzpicture}[ > = stealth, % arrow head style
+        shorten > = 1pt, % don't touch arrow head to node
+        semithick % line style
+    ]
+
+    \tikzstyle{every state}=[
+        draw = black,
+        thin,
+        fill = cyan!29,
+        minimum size = 7mm
+    ]
+    \begin{scope}[node distance=1cm and 0cm, every node/.style=state]
+		\node (k) [rectangle split, rectangle split horizontal, rectangle split parts=2, rectangle split part fill={red!30,blue!20},]
         {$bs$
          \nodepart{two} $a^*$ };
-%\caption{term 1 \ref{term:1}'s matching configuration}
-\end{tikzpicture} 
-&
+	 \node (l) [below =of k, rectangle split, rectangle split horizontal, rectangle split parts=2, rectangle split part fill={red!30,blue!20},]
+        { $bs$ + [Z]
+         \nodepart{two}  $(a\backslash c )\cdot a^*$ };
+    \end{scope}
+    \path[->] 
+	      (k) edge (l);
+\end{tikzpicture}
+
+Pictorially the process looks like below.
+Like before, the red region denotes
+previous lexing information (stored as bitcodes in $bs$).
+
 \begin{tikzpicture}[every node/.append style={draw, rounded corners, inner sep=10pt}]
-    \node [rectangle split, rectangle split horizontal, rectangle split parts=2, rectangle split part fill={red!30,blue!20},]
-        {$v_{\text{previous iterations}}$
-         \nodepart{two} $a^*$};
-%\caption{term 1 \ref{term:1}'s matching configuration}
-\end{tikzpicture}
-\\
-\begin{tikzpicture}[every node/.append style={draw, rounded corners, inner sep=10pt}]
-    \node [rectangle split, rectangle split horizontal, rectangle split parts=2, rectangle split part fill={red!30,blue!20},]
+	\begin{scope}[node distance=1cm]   
+		\node (a) [rectangle split, rectangle split horizontal, rectangle split parts=2, rectangle split part fill={red!30,blue!20},]
+        {$bs$
+         \nodepart{two} $a^*$ };
+	 \node (b) [rectangle split, rectangle split horizontal, rectangle split parts=2, rectangle split part fill={red!30,blue!20},]
         { $bs$ + [Z]
          \nodepart{two}  $(a\backslash c )\cdot a^*$ };
 %\caption{term 1 \ref{term:1}'s matching configuration}
+ 	\end{scope}
 \end{tikzpicture}
-&
-\begin{tikzpicture}[every node/.append style={draw, rounded corners, inner sep=10pt}]
-    \node [rectangle split, rectangle split horizontal, rectangle split parts=2, rectangle split part fill={red!30,blue!20},]
-        {$v_{\text{previous iterations}}$ + 1 more iteration
-         \nodepart{two} $(a\backslash c )\cdot a^*$ };
-%\caption{term 1 \ref{term:1}'s matching configuration}
-\end{tikzpicture}
-\end{tabular}    
+
 \noindent
 Another place in the $\bder$ function where it differs
 from normal derivatives (on un-annotated regular expressions)
@@ -474,25 +508,6 @@
 The rest of the clauses of $\bder$ is rather similar to
 $\der$, and is put together here as a wholesome definition
 for $\bder$:
-\begin{center}
-  \begin{tabular}{@{}lcl@{}}
-  $(\ZERO)\,\backslash c$ & $\dn$ & $\ZERO$\\  
-  $(_{bs}\ONE)\,\backslash c$ & $\dn$ & $\ZERO$\\  
-  $(_{bs}{\bf d})\,\backslash c$ & $\dn$ &
-        $\textit{if}\;c=d\; \;\textit{then}\;
-         _{bs}\ONE\;\textit{else}\;\ZERO$\\  
-  $(_{bs}\sum \;\textit{as})\,\backslash c$ & $\dn$ &
-  $_{bs}\sum\;(\textit{map} \; (\_\backslash c) \; as )$\\
-  $(_{bs}\;a_1\cdot a_2)\,\backslash c$ & $\dn$ &
-     $\textit{if}\;\textit{bnullable}\,a_1$\\
-					       & &$\textit{then}\;_{bs}\sum\,[(_{[]}\,(a_1\,\backslash c)\cdot\,a_2),$\\
-					       & &$\phantom{\textit{then},\;_{bs}\sum\,}(\textit{fuse}\,(\textit{bmkeps}\,a_1)\,(a_2\,\backslash c))]$\\
-  & &$\textit{else}\;_{bs}\,(a_1\,\backslash c)\cdot a_2$\\
-  $(_{bs}a^*)\,\backslash c$ & $\dn$ &
-      $_{bs}(\textit{fuse}\, [Z] \; r\,\backslash c)\cdot
-       (_{[]}r^*))$
-\end{tabular}    
-\end{center}    
 \noindent
 Generalising the derivative operation with bitcodes to strings, we have 
 \begin{center}