--- a/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex Sat Jul 08 22:18:22 2023 +0100
+++ b/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex Sun Jul 09 00:29:02 2023 +0100
@@ -453,7 +453,7 @@
we add it as a phase after a derivative is taken.
\begin{center}
\begin{tabular}{lcl}
- $r \backslash_{bsimp} c$ & $\dn$ & $\textit{bsimp}(r \backslash c)$
+ $a \backslash_{bsimp} c$ & $\dn$ & $\textit{bsimp}(a \backslash c)$
\end{tabular}
\end{center}
%Following previous notations
@@ -463,8 +463,8 @@
We extend this from characters to strings:
\begin{center}
\begin{tabular}{lcl}
-$r \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(r \backslash_{bsimp}\, c) \backslash_{bsimps}\, s$ \\
-$r \backslash_{bsimps} [\,] $ & $\dn$ & $r$
+$a \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(a \backslash_{bsimp}\, c) \backslash_{bsimps}\, s$ \\
+$a \backslash_{bsimps} [\,] $ & $\dn$ & $a$
\end{tabular}
\end{center}
@@ -532,15 +532,27 @@
Given the size difference, it is not
surprising that our $\blexersimp$ significantly outperforms
$\textit{blexer\_SLSimp}$ by Sulzmann and Lu.
+Indeed $\blexersimp$ seems to be a correct algorithm that effectively
+bounds the size of intermediate representations.
+\marginpar{\em more connecting material to make narration more coherent.}
+As promised we will use formal proofs to show that our speculation
+based on these experimental results indeed hold.
+%intuitions indeed hold.
In the next section we are going to establish that our
simplification preserves the correctness of the algorithm.
-
+\section{Correctness of $\blexersimp$}
+A natural thought would be to directly re-use the formal
+proof of $\blexer$'s correctness, with some minor modifications
+but keeping the way the induction is done.
+However we were not able to find a simple way to re-factor
+proof of \ref{blexerCorrect} in chapter \ref{Bitcoded1}.
-\section{Why $\textit{Blexer}$'s Proof Does Not Work}
-The fundamental reason is we cannot extend the correctness proof of theorem 4
-because lemma 13 does not hold anymore when simplifications are involved.
+\subsection{Why $\textit{Blexer}$'s Proof Does Not Work}
+The fundamental reason is %we cannot extend the correctness proof of theorem 4
+because lemma \ref{retrieveStepwise} does not hold
+anymore when simplifications are involved.
\marginpar{\em rephrased things \\so why new \\proof makes sense.}
%The proof details are necessary materials for this thesis
%because it provides necessary context to explain why we need a
@@ -569,34 +581,132 @@
relies crucially on lemma \ref{retrieveStepwise} that says
any value can be retrieved in a stepwise manner, namely:
\begin{equation}\label{eq:stepwise}%eqref: this proposition needs to be referred
- \vdash v : (r\backslash c) \implies \retrieve \; (r \backslash c) \; v= \retrieve \; r \; (\inj \; r\; c\; v)
+ \vdash v : ((a_\downarrow) \backslash c) \implies \retrieve \; (a \backslash c) \; v= \retrieve \; a \; (\inj \; (a_\downarrow) \; c\; v)
\end{equation}
%This no longer holds once we introduce simplifications.
-Simplifications are necessary to control the size of derivatives,
-but they also destroy the structures of the regular expressions
-such that \ref{eq:stepwise} does not hold.
+The regular expressions $a$ and $a\backslash c$ correspond to the intermediate
+result before and after the derivative with respect to $c$, and similarly
+$\inj\; a_\downarrow \; c \; v$ and $v$ correspond to the value before and after the derivative.
+They go in lockstep pairs
+\[
+ (a, \; \inj\; a_\downarrow \; c \; v)\; \text{and} \; (a\backslash c,\; v)
+\]
+and the structure of annotated regular expression and
+value within a pair always align with each other.
-
-We want to prove the correctness of $\blexersimp$ which integrates
-$\textit{bsimp}$ by applying it after each call to the derivative:
+As $\blexersimp$ integrates
+$\textit{bsimp}$ by applying it after each call to the derivatives function,
+%Simplifications are necessary to control the size of derivatives,
+%but they also destroy the structures of the regular expressions
+%such that \ref{eq:stepwise} does not hold.
\begin{center}
\begin{tabular}{lcl}
- $r \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(\textit{bsimp} \; (r \backslash\, c)) \backslash_{bsimps}\, s$ \\
-$r \backslash_{bsimps} [\,] $ & $\dn$ & $r$
+ $a \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(\textit{bsimp} \; (a \backslash\, c)) \backslash_{bsimps}\, s$ \\
+%$r \backslash_{bsimps} [\,] $ & $\dn$ & $r$
\end{tabular}
-\begin{tabular}{lcl}
- $\textit{blexer\_simp}\;r\,s$ & $\dn$ &
- $\textit{let}\;a = (r^\uparrow)\backslash_{bsimp}\, s\;\textit{in}$\\
- & & $\;\;\textit{if}\; \textit{bnullable}(a)$\\
- & & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\
- & & $\;\;\textit{else}\;\textit{None}$
-\end{tabular}
+%\begin{tabular}{lcl}
+% $\textit{blexer\_simp}\;r\,s$ & $\dn$ &
+% $\textit{let}\;a = (r^\uparrow)\backslash_{bsimp}\, s\;\textit{in}$\\
+% & & $\;\;\textit{if}\; \textit{bnullable}(a)$\\
+% & & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\
+% & & $\;\;\textit{else}\;\textit{None}$
+%\end{tabular}
\end{center}
\noindent
+it becomes a problem to maintain a similar property as \ref{retrieveStepwise}.
Previously without $\textit{bsimp}$ the exact structure of each intermediate
-regular expression is preserved, allowing pairs of inhabitation relations
-in the form $\vdash v : r \backslash c $ and
-$\vdash \inj \; r\; c \; v : r $ to hold in \ref{eq:stepwise}.
+regular expression is preserved,
+%allowing pairs of inhabitation relations
+%in the form $\vdash v : r \backslash c $ and
+%$\vdash \inj \; r\; c \; v : r $ to hold in \ref{eq:stepwise}.
+We can illustrate this using the diagram \ref{fig:inj} in chapter \ref{Inj},
+by zooming in to the middle bit involving $r_i$, $r_{i+1}$, $v_i$ and $v_{i+1}$,
+and adding the bottom row to show how bitcodes encoding the lexing information
+can be extracted from every pair $(r_i, \; v_i)$:
+\begin{center}
+ \begin{tikzpicture}[->, >=stealth', shorten >= 1pt, auto, thick]
+ %\node [rectangle ] (1) at (-7, 2) {$\ldots$};
+ %\node [rectangle, draw] (2) at (-4, 2) {$r_i = _{bs'}(_Za+_Saa)^*$};
+ %\node [rectangle, draw] (3) at (4, 2) {$r_{i+1} = _{bs'}(_Z(_Z\ONE + _S(\ONE \cdot a)))\cdot(_Za+_Saa)^*$};
+ %\node [rectangle] (4) at (9, 2) {$\ldots$};
+ %\node [rectangle] (5) at (-7, -2) {$\ldots$};
+ %\node [rectangle, draw] (6) at (-4, -2) {$v_i = \Stars \; [\Left (a)]$};
+ %\node [rectangle, draw] (7) at ( 4, -2) {$v_{i+1} = \Seq (\Alt (\Left \; \Empty)) \; \Stars \, []$};
+ %\node [rectangle] (8) at ( 9, -2) {$\ldots$};
+ %\node [rectangle] (9) at (-7, -6) {$\ldots$};
+ %\node [rectangle, draw] (10) at (-4, -6) {$\textit{bits}_{i} = bs' @ ZZS$};
+ %\node [rectangle, draw] (11) at (4, -6) {$\textit{bits}_{i+1} = bs'@ ZZS$};
+ %\node [rectangle] (12) at (9, -6) {$\ldots$};
+
+ \node [rectangle ] (1) at (-8, 2) {$\ldots$};
+ \node [rectangle, draw] (2) at (-5, 2) {$r_i = a_\downarrow$};
+ \node [rectangle, draw] (3) at (3, 2) {$r_{i+1} = (a\backslash c)_\downarrow$};
+ \node [rectangle] (4) at (8, 2) {$\ldots$};
+ \node [rectangle] (5) at (-8, -2) {$\ldots$};
+ \node [rectangle, draw] (6) at (-5, -2) {$v_i = \inj\; r \; c \; v$};
+ \node [rectangle, draw] (7) at ( 3, -2) {$v_{i+1} = v$};
+ \node [rectangle] (8) at ( 8, -2) {$\ldots$};
+ \node [rectangle] (9) at (-8, -6) {$\ldots$};
+ \node [rectangle, draw] (10) at (-5, -6) {$\textit{bits}_{i} = bs' @ ZZS$};
+ \node [rectangle, draw] (11) at (3, -6) {$\textit{bits}_{i+1} = bs'@ ZZS$};
+ \node [rectangle] (12) at (8, -6) {$\ldots$};
+
+
+
+ \path (1) edge [] node {} (2);
+ \path (6) edge [] node {} (5);
+ \path (9) edge [] node {} (10);
+
+ \path (11) edge [] node {} (12);
+ \path (8) edge [] node {} (7);
+ \path (3) edge [] node {} (4);
+
+
+ \path (6) edge [dashed,bend right = 30] node {$\retrieve \; a_i \; v_i$} (10);
+ \path (2) edge [dashed,bend left = 48] node {} (10);
+
+ \path (7) edge [dashed,bend right = 30] node {$\retrieve \; a_{i+1} \; v_{i+1}$} (11);
+ \path (3) edge [dashed,bend left = 45] node {} (11);
+
+ \path (2) edge [] node {$\backslash c$} (3);
+ \path (2) edge [dashed, <->] node {$\vdash v_i : r_i$} (6);
+ \path (3) edge [dashed, <->] node {$\vdash v_{i+1} : r_{i+1}$} (7);
+ %\path (6) edge [] node {$\vdash v_i : r_i$} (10);
+ %\path (7) edge [dashed, <->] node {$\vdash v_i : r_i$} (11);
+
+ \path (10) edge [dashed, <->] node {$=$} (11);
+
+ \path (7) edge [] node {$\inj \; r_{i+1} \; a \; v_i$} (6);
+
+% \node [rectangle, draw] (r) at (-6, -1) {$(aa)^*(b+c)$};
+% \node [rectangle, draw] (a) at (-6, 4) {$(aa)^*(_{Z}b + _{S}c)$};
+% \path (r)
+% edge [] node {$\internalise$} (a);
+% \node [rectangle, draw] (a1) at (-3, 1) {$(_{Z}(\ONE \cdot a) \cdot (aa)^*) (_{Z}b + _Sc)$};
+% \path (a)
+% edge [] node {$\backslash a$} (a1);
+%
+% \node [rectangle, draw, three sided] (a21) at (-2.5, 4) {$(_{Z}\ONE \cdot (aa)^*)$};
+% \node [rectangle, draw, three sided1] (a22) at (-0.8, 4) {$(_{Z}b + _{S}c)$};
+% \path (a1)
+% edge [] node {$\backslash a$} (a21);
+% \node [rectangle, draw] (a3) at (0.5, 2) {$_{ZS}(_{Z}\ONE + \ZERO)$};
+% \path (a22)
+% edge [] node {$\backslash b$} (a3);
+% \path (a21)
+% edge [dashed, bend right] node {} (a3);
+% \node [rectangle, draw] (bs) at (2, 4) {$ZSZ$};
+% \path (a3)
+% edge [below] node {$\bmkeps$} (bs);
+% \node [rectangle, draw] (v) at (3, 0) {$\Seq \; (\Stars\; [\Seq \; a \; a]) \; (\Left \; b)$};
+% \path (bs)
+% edge [] node {$\decode$} (v);
+
+
+ \end{tikzpicture}
+ %\caption{$\blexer$ with the regular expression $(aa)^*(b+c)$ and $aab$}
+\end{center}
+
But $\blexersimp$ introduces simplification after the derivative,
making it difficult to align the structures of values and regular expressions.
If we change the form of property \ref{eq:stepwise} to
@@ -661,7 +771,7 @@
-\section{Why Lemma \ref{retrieveStepwise}'s Requirement is too Strong}
+\subsection{Why Lemma \ref{retrieveStepwise}'s Requirement is too Strong}
%From this chapter we start with the main contribution of this thesis, which
@@ -796,9 +906,6 @@
It does not require $v_i$ to be a POSIX value
-{\color{red} \rule{\linewidth}{0.5mm}}
-New content ends
-{\color{red} \rule{\linewidth}{0.5mm}}
@@ -825,7 +932,6 @@
%----------------------------------------------------------------------------------------
% SECTION rewrite relation
%----------------------------------------------------------------------------------------
-\section{Correctness of $\blexersimp$}
We first introduce the rewriting relation \emph{rrewrite}
($\rrewrite$) between two regular expressions,
which stands for an atomic