lexing: comparison ChengsongTanPhdThesis/Chapters/Bitcoded2.tex

equal deleted inserted replaced

-:ef2b8abcbc55
+:a365d1364640
 is the point from which novel contributions of this PhD project are introduced
 in detail,
 and previous
 chapters are essential background work for setting the scene of the formal proof we
 are about to describe.
+The proof details are necessary materials for this thesis
+because it provides necessary context to explain why we need a
+new framework for the proof of $\blexersimp$, which involves
+simplifications that cause structural changes to the regular expression.
+a new formal proof of the correctness of $\blexersimp$, where the
+proof of $\blexer$
+is not applicatble in the sense that we cannot straightforwardly extend the
+proof of theorem \ref{blexerCorrect} because lemma \ref{retrieveStepwise} does
+not hold anymore.
+%This is because the structural induction on the stepwise correctness
+%of $\inj$ breaks due to each pair of $r_i$ and $v_i$ described
+%in chapter \ref{Inj} and \ref{Bitcoded1} no longer correspond to
+%each other.
+%In this chapter we introduce simplifications
+%for annotated regular expressions that can be applied to
+%each intermediate derivative result. This allows
+%us to make $\blexer$ much more efficient.
+%Sulzmann and Lu already introduced some simplifications for bitcoded regular expressions,
+%but their simplification functions could have been more efficient and in some cases needed fixing.
 In particular, the correctness theorem
 of the un-optimised bit-coded lexer $\blexer$ in
 chapter \ref{Bitcoded1} formalised by Ausaf et al.
 relies on lemma \ref{retrieveStepwise} that says
 any value can be retrieved in a stepwise manner:
 \begin{center}
 	$\vdash v : (r\backslash c) \implies \retrieve \; (r \backslash c)  \;  v= \retrieve \; r \; (\inj \; r\; c\; v)$
 \end{center}
 This no longer holds once we introduce simplifications.
-To control the size of regular expressions during derivatives,
+Simplifications are necessary to control the size of regular expressions
-one has to eliminate redundant sub-expression with some
+during derivatives by eliminating redundant
-procedure we call $\textit{simp}$,
+sub-expression with some procedure we call $\textit{bsimp}$.
-and $\textit{simp}$ is defined as
+We want to prove the correctness of $\blexersimp$ which integrates
-:
+$\textit{bsimp}$ by applying it after each call to the derivative:
-Having defined the $\textit{bsimp}$ function,
-we add it as a phase after a derivative is taken.
-\begin{center}
-	\begin{tabular}{lcl}
-		$r \backslash_{bsimp} c$ & $\dn$ & $\textit{bsimp}(r \backslash c)$
-	\end{tabular}
-\end{center}
-%Following previous notations
-%when extending from derivatives w.r.t.~character to derivative
-%w.r.t.~string, we define the derivative that nests simplifications
-%with derivatives:%\comment{simp in  the [] case?}
-We extend this from characters to strings:
 \begin{center}
 \begin{tabular}{lcl}
-$r \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(r \backslash_{bsimp}\, c) \backslash_{bsimps}\, s$ \\
+	$r \backslash_{bsimps} (c\!::\!s) $ & $\dn$ & $(\textit{bsimp} \; (r \backslash\, c)) \backslash_{bsimps}\, s$ \\
 $r \backslash_{bsimps} [\,] $ & $\dn$ & $r$
 \end{tabular}
-\end{center}
-\noindent
-The lexer that extracts bitcodes from the
-derivatives with simplifications from our $\simp$ function
-is called $\blexersimp$:
-\begin{center}
-\begin{center}
-	\begin{tabular}{lcl}
-		$r \backslash_{bsimp} s$ & $\dn$ & $\textit{bsimp}(r \backslash s)$
-	\end{tabular}
-\end{center}
 \begin{tabular}{lcl}
 $\textit{blexer\_simp}\;r\,s$ & $\dn$ &
 $\textit{let}\;a = (r^\uparrow)\backslash_{bsimp}\, s\;\textit{in}$\\
 & & $\;\;\textit{if}\; \textit{bnullable}(a)$\\
 & & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\
 & & $\;\;\textit{else}\;\textit{None}$
 \end{tabular}
 \end{center}
 \noindent
-the redundant sub-expressions after each derivative operation
+Previously without $\textit{bsimp}$ the exact structure of each intermediate
-allows the exact structure of each intermediate result to be preserved,
+regular expression is preserved, allowing pairs of inhabitation relations in the form $\vdash v : r_{c} $ and
-so that pairs of inhabitation relations in the form $\vdash v : r_{c} $ and
+$\vdash v^{c} : r $ to hold in lemma \ref{retrieveStepwise}(if
-$\vdash v^{c} : r $ hold (if we allow the abbreviation $r_{c} \dn r\backslash c$
+we use the convenient notation $r_{c} \dn r\backslash c$
-and $v^{c} \dn \inj \;r \; c \; v$).
+and $v_{r}^{c} \dn \inj \;r \; c \; v$),
+but $\blexersimp$ introduces simplification after the derivative,
+getting us trouble in aligning the pairs:
-Define the
+\begin{center}
-But $\blexersimp$ introduces simplification after the derivative
+	$\vdash v: \textit{bsimp} \; r_{c} \implies \retrieve \; \textit{bsimp} \; r_c \; v =\retrieve \; r  \;(\mathord{?} v_{r}^{c}) $
-to reduce redundancy,
+\end{center}
-yielding $r \backslash c$
+\noindent
-This allows
+It is quite clear that once we made
+$v$ to align with $\textit{bsimp} \; r_{c}$
-The proof details are necessary materials for this thesis
+in the inhabitation relation, something different than $v_{r}^{c}$ needs to be plugged
-because it provides necessary context to explain why we need a
+in for the above statement to hold.
-new framework for the proof of $\blexersimp$, which involves
+Ausaf et al. \cite{AusafUrbanDyckhoff2016}
-simplifications that cause structural changes to the regular expression.
+made some initial attempts on the un-annotated lexer (to be continued)
-a new formal proof of the correctness of $\blexersimp$, where the
-proof of $\blexer$
-is not applicatble in the sense that we cannot straightforwardly extend the
-proof of theorem \ref{blexerCorrect} because lemma \ref{flex_retrieve} does
-not hold anymore.
-This is because the structural induction on the stepwise correctness
-of $\inj$ breaks due to each pair of $r_i$ and $v_i$ described
-in chapter \ref{Inj} and \ref{Bitcoded1} no longer correspond to
-each other.
-In this chapter we introduce simplifications
-for annotated regular expressions that can be applied to
-each intermediate derivative result. This allows
-us to make $\blexer$ much more efficient.
-Sulzmann and Lu already introduced some simplifications for bitcoded regular expressions,
-but their simplification functions could have been more efficient and in some cases needed fixing.
 From this chapter we start with the main contribution of this thesis, which

changeset 650	a365d1364640
parent 649	ef2b8abcbc55
child 651	deb35fd780fe