--- a/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex Sun Jun 18 17:54:52 2023 +0100
+++ b/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex Wed Jun 21 22:43:04 2023 +0100
@@ -16,6 +16,27 @@
and previous
chapters are essential background work for setting the scene of the formal proof we
are about to describe.
+The proof details are necessary materials for this thesis
+because it provides necessary context to explain why we need a
+new framework for the proof of $\blexersimp$, which involves
+simplifications that cause structural changes to the regular expression.
+a new formal proof of the correctness of $\blexersimp$, where the
+proof of $\blexer$
+is not applicatble in the sense that we cannot straightforwardly extend the
+proof of theorem \ref{blexerCorrect} because lemma \ref{retrieveStepwise} does
+not hold anymore.
+%This is because the structural induction on the stepwise correctness
+%of $\inj$ breaks due to each pair of $r_i$ and $v_i$ described
+%in chapter \ref{Inj} and \ref{Bitcoded1} no longer correspond to
+%each other.
+%In this chapter we introduce simplifications
+%for annotated regular expressions that can be applied to
+%each intermediate derivative result. This allows
+%us to make $\blexer$ much more efficient.
+%Sulzmann and Lu already introduced some simplifications for bitcoded regular expressions,
+%but their simplification functions could have been more efficient and in some cases needed fixing.
+
+
In particular, the correctness theorem
of the un-optimised bit-coded lexer $\blexer$ in
chapter \ref{Bitcoded1} formalised by Ausaf et al.
@@ -25,45 +46,16 @@
$\vdash v : (r\backslash c) \implies \retrieve \; (r \backslash c) \; v= \retrieve \; r \; (\inj \; r\; c\; v)$
\end{center}
This no longer holds once we introduce simplifications.
-To control the size of regular expressions during derivatives,
-one has to eliminate redundant sub-expression with some
-procedure we call $\textit{simp}$,
-and $\textit{simp}$ is defined as
-:
-
-
-
-Having defined the $\textit{bsimp}$ function,
-we add it as a phase after a derivative is taken.
-\begin{center}
- \begin{tabular}{lcl}
- $r \backslash_{bsimp} c$ & $\dn$ & $\textit{bsimp}(r \backslash c)$
- \end{tabular}
-\end{center}
-%Following previous notations
-%when extending from derivatives w.r.t.~character to derivative
-%w.r.t.~string, we define the derivative that nests simplifications
-%with derivatives:%\comment{simp in the [] case?}
-We extend this from characters to strings:
+Simplifications are necessary to control the size of regular expressions
+during derivatives by eliminating redundant
+sub-expression with some procedure we call $\textit{bsimp}$.
+We want to prove the correctness of $\blexersimp$ which integrates
+$\textit{bsimp}$ by applying it after each call to the derivative:
\begin{center}
\begin{tabular}{lcl}
-$r \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(r \backslash_{bsimp}\, c) \backslash_{bsimps}\, s$ \\
+ $r \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(\textit{bsimp} \; (r \backslash\, c)) \backslash_{bsimps}\, s$ \\
$r \backslash_{bsimps} [\,] $ & $\dn$ & $r$
\end{tabular}
-\end{center}
-
-\noindent
-The lexer that extracts bitcodes from the
-derivatives with simplifications from our $\simp$ function
-is called $\blexersimp$:
-
-\begin{center}
-
-\begin{center}
- \begin{tabular}{lcl}
- $r \backslash_{bsimp} s$ & $\dn$ & $\textit{bsimp}(r \backslash s)$
- \end{tabular}
-\end{center}
\begin{tabular}{lcl}
$\textit{blexer\_simp}\;r\,s$ & $\dn$ &
$\textit{let}\;a = (r^\uparrow)\backslash_{bsimp}\, s\;\textit{in}$\\
@@ -73,38 +65,24 @@
\end{tabular}
\end{center}
\noindent
-the redundant sub-expressions after each derivative operation
-allows the exact structure of each intermediate result to be preserved,
-so that pairs of inhabitation relations in the form $\vdash v : r_{c} $ and
-$\vdash v^{c} : r $ hold (if we allow the abbreviation $r_{c} \dn r\backslash c$
-and $v^{c} \dn \inj \;r \; c \; v$).
-
-
-Define the
-But $\blexersimp$ introduces simplification after the derivative
-to reduce redundancy,
-yielding $r \backslash c$
-This allows
+Previously without $\textit{bsimp}$ the exact structure of each intermediate
+regular expression is preserved, allowing pairs of inhabitation relations in the form $\vdash v : r_{c} $ and
+$\vdash v^{c} : r $ to hold in lemma \ref{retrieveStepwise}(if
+we use the convenient notation $r_{c} \dn r\backslash c$
+and $v_{r}^{c} \dn \inj \;r \; c \; v$),
+but $\blexersimp$ introduces simplification after the derivative,
+getting us trouble in aligning the pairs:
+\begin{center}
+ $\vdash v: \textit{bsimp} \; r_{c} \implies \retrieve \; \textit{bsimp} \; r_c \; v =\retrieve \; r \;(\mathord{?} v_{r}^{c}) $
+\end{center}
+\noindent
+It is quite clear that once we made
+$v$ to align with $\textit{bsimp} \; r_{c}$
+in the inhabitation relation, something different than $v_{r}^{c}$ needs to be plugged
+in for the above statement to hold.
+Ausaf et al. \cite{AusafUrbanDyckhoff2016}
+made some initial attempts on the un-annotated lexer (to be continued)
-The proof details are necessary materials for this thesis
-because it provides necessary context to explain why we need a
-new framework for the proof of $\blexersimp$, which involves
-simplifications that cause structural changes to the regular expression.
-a new formal proof of the correctness of $\blexersimp$, where the
-proof of $\blexer$
-is not applicatble in the sense that we cannot straightforwardly extend the
-proof of theorem \ref{blexerCorrect} because lemma \ref{flex_retrieve} does
-not hold anymore.
-This is because the structural induction on the stepwise correctness
-of $\inj$ breaks due to each pair of $r_i$ and $v_i$ described
-in chapter \ref{Inj} and \ref{Bitcoded1} no longer correspond to
-each other.
-In this chapter we introduce simplifications
-for annotated regular expressions that can be applied to
-each intermediate derivative result. This allows
-us to make $\blexer$ much more efficient.
-Sulzmann and Lu already introduced some simplifications for bitcoded regular expressions,
-but their simplification functions could have been more efficient and in some cases needed fixing.