lexing: comparison ChengsongTanPhdThesis/Chapters/Bitcoded1.tex

equal deleted inserted replaced

-:753a3b0ee02b
+:00171b627b8d
 \noindent
 As stated, $\retrieve$ collects the right bit-codes from the
 final derivative $a_n$, guided by $v_n$. This can be proved
 as follows:
 \begin{lemma}\label{bmkepsRetrieve}
-	$\bnullable \; a \implies \bmkeps \; a = \retrieve \; a \; (\mkeps \; (\erase \; a))$
+	\mbox{}
+	\begin{itemize}
+		\item
+If $\forall v \in vs. \vdash v : r$, and  $\code \; v = \retrieve \; (\internalise\; r) \; v$\\
+then $\code \; (\Stars \; vs) = \retrieve \; _{[]}((\internalise \;  r)^*) \; (\Stars \; vs)$
+		\item
+	$\vdash v : r \implies \retrieve \; (r)^\uparrow \; v = \code \; v$
+		\item
+	$\bnullable \; a \implies \bmkeps \; a = \retrieve \; a \; (\mkeps \; (a_\downarrow))$
+	\end{itemize}
+\begin{proof}
+	The first part is by induction on the value list $vs$.
+	The second part is by induction on $r$,
+	and the star case uses the first part.
+	The last part is by a routine induction on $a$.
+\end{proof}
+\noindent
 \end{lemma}
 \begin{proof}
 	By a routine induction on $a$.
 \end{proof}
 \noindent
 %The design of $\retrieve$ enables us to extract bitcodes
 %from both annotated regular expressions and values.
 %$\retrieve$ always ``sucks up'' all the information.
-When more information is stored in the value, we would be able to
+%When more information is stored in the value, we would be able to
-extract more from the value, and vice versa.
+%extract more from the value, and vice versa.
-For example in star iterations, $\retrieve$ will be able to extract from $\Stars \; vs$
+%For example in star iterations, $\retrieve$ will be able to extract from $\Stars \; vs$
-exactly the same bitcodes as $\code \; (\Stars \; vs)$:
+%exactly the same bitcodes as $\code \; (\Stars \; vs)$:
-\begin{lemma}\label{retrieveEncodeSTARS}
+%\begin{lemma}\label{retrieveEncodeSTARS}
-If $\forall v \in vs. \vdash v : r$, and  $\code \; v = \retrieve \; (\internalise\; r) \; v$\\
+%  If $\forall v \in vs. \vdash v : r$, and  $\code \; v = \retrieve \; (\internalise\; r) \; v$\\
-then $\code \; (\Stars \; vs) = \retrieve \; _{[]}((\internalise \;  r)^*) \; (\Stars \; vs)$
+%  then $\code \; (\Stars \; vs) = \retrieve \; _{[]}((\internalise \;  r)^*) \; (\Stars \; vs)$
-\end{lemma}
+%\end{lemma}
-\begin{proof}
+%\begin{proof}
-	By induction on the value list $vs$.
+%	By induction on the value list $vs$.
-\end{proof}
+%\end{proof}
-\noindent
+%\noindent
-Similarly, when the value list does not hold information,
+%Similarly, when the value list does not hold information,
-only the bitcodes plus some delimiter will be recorded:
+%only the bitcodes plus some delimiter will be recorded:
-\begin{center}
+%\begin{center}
-$\retrieve \; _{bs}((\internalise \;  r)^*) \; (\Stars \; [] )  = bs @ [S]$.
+%  $\retrieve \; _{bs}((\internalise \;  r)^*) \; (\Stars \; [] )  = bs @ [S]$.
-\end{center}
+%\end{center}
-In general, if we have a regular expression that is just internalised
+%In general, if we have a regular expression that is just internalised
-and the final lexing result value, we can $\retrieve$ the bitcodes
+%and the final lexing result value, we can $\retrieve$ the bitcodes
-that look as if we have en$\code$-ed the value as bitcodes:
+%that look as if we have en$\code$-ed the value as bitcodes:
-\begin{lemma}
+%\begin{lemma}
-	$\vdash v : r \implies \retrieve \; (r)^\uparrow \; v = \code \; v$
+%	$\vdash v : r \implies \retrieve \; (r)^\uparrow \; v = \code \; v$
-\end{lemma}
+%\end{lemma}
-\begin{proof}
+%\begin{proof}
-	By induction on $r$.
+%	By induction on $r$.
-	The star case uses lemma \ref{retrieveEncodeSTARS}.
+%	The star case uses lemma \ref{retrieveEncodeSTARS}.
-\end{proof}
+%\end{proof}
-\noindent
+%\noindent
 The following property of $\retrieve$ is crucial, as
 it provides a "bridge" between $a_0$ and $a_n $ in the
 lexing diagram by linking adjacent regular expressions $a_i$ and
 $a_{i+1}$.
 The property says that one
 can retrieve the same bits from a derivative
 regular expression as those from
 before the derivative took place,
 provided that the corresponding values are used respectively:
 \begin{lemma}\label{retrieveStepwise}
-	$\vdash v : (r\backslash c) \implies \retrieve \; (r \backslash c)  \;  v= \retrieve \; r \; (\inj \; r\; c\; v)$
+	$\vdash v : (a\backslash c)_\downarrow \implies \retrieve \; (a \backslash c)  \;  v=
+							\retrieve \; a \; (\inj \; a_\downarrow\; c\; v)$
 \end{lemma}
 \begin{proof}
 	We do an induction on $r$, generalising over $v$.
 	The induction principle in our formalisation
 	is function $\erase$'s cases.
 information being preserved and recoverable throughout each lexing step.
 We shall see in the next chapter that this no longer
 holds with simplifications which takes away certain sub-expressions
 corresponding to non-POSIX lexical values.
-Before we move on to the next
+%Before we move on to the next
-helper function, we rewrite $\blexer$ in
+%helper function, we rewrite $\blexer$ in
-the following way which makes later proofs more convenient:
+%the following way which makes later proofs more convenient:
-\begin{lemma}\label{blexer_retrieve}
+%\begin{lemma}\label{blexer_retrieve}
-$\blexer \; r \; s = \decode  \; (\retrieve \; (\internalise \; r) \; (\mkeps \; (r \backslash s) )) \; r$
+%$\blexer \; r \; s = \decode  \; (\retrieve \; (\internalise \; r) \; (\mkeps \; (r \backslash s) )) \; r$
-\end{lemma}
+%\end{lemma}
-\begin{proof}
+%\begin{proof}
-	Using lemma \ref{bmkepsRetrieve}.
+%	Using lemma \ref{bmkepsRetrieve}.
-\end{proof}
+%\end{proof}
-\noindent
+%\noindent
 The function $\retrieve$ allows connecting
 between the intermediate
 results $a_i$ and $a_{i+1}$ in $\blexer$.
 For plain regular expressions something similar is required.
 and finally give the correctness of $\blexer$--it outputs the same result as $\lexer$:
 \begin{theorem}\label{blexerCorrect}
 	$\blexer\; r \; s = \lexer \; r \; s$
 \end{theorem}
 \begin{proof}
-	From \ref{flex_retrieve} we have that
+	We can rewrite the expression $\blexer \; r \; s$ by using lemma \ref{bmkepsRetrieve}:
-	$\textit{flex} \; r \; \textit{id} \; s \; \mkeps(r \backslash s)  = \blexer \; r \; s$,
+\[
-	which immediately implies this theorem.
+	\blexer \; r \; s = \decode  \; (\retrieve \; (\; r^\uparrow) \; (\mkeps \; (r \backslash s) )) \; r
+\]
+	From \ref{flex_retrieve} we have that the left-hand-side is equal to
+\[
+	\textit{flex} \; r \; \textit{id} \; s \; \mkeps \; (r \backslash s),
+\]
+which in turn equals
+\[
+	\lexer \; r \; s.
+\]
+	%which immediately implies this theorem.
 \end{proof}
 \noindent
-To piece things together and spell out the correctness
+To piece things together
-result of the bitcoded lexer more explicitly,
+%and spell out the correctness
+%result of the bitcoded lexer more explicitly,
+and give the correctness result explicitly,
 we use the fact from the previous chapter that
 \[
 	 (r, s) \rightarrow v \;\; \textit{iff} \;\; \lexer \; r \; s =\Some \;  v
 	 \quad \quad
 	 \nexists v. (r, s) \rightarrow v\;\; \textit{iff} \;\; \lexer \;r\;s = None
 \]
 to obtain this corollary:
 \begin{corollary}\label{blexerPOSIX}
-	The $\blexer$ function correctly implements a POSIX lexer, namely\\
+	The $\blexer$ function correctly implements a POSIX lexer, namely
-	$(r, s) \rightarrow v \;\; \textit{iff} \;\; \blexer \; r \; s = \Some \; v$\\
+	\begin{itemize}
+	\item
+	$(r, s) \rightarrow v \;\; \textit{iff} \;\; \blexer \; r \; s = \Some \; v$
+	\item
 	$\nexists v. (r, s) \rightarrow v\;\; \textit{iff} \;\; \blexer \;r\;s = None$
+	\end{itemize}
 \end{corollary}
 Our main reason for analysing the bit-coded algorithm over
 the injection-based one is that it allows us to define
 more aggressive simplifications.
 We will elaborate on this in the next chapter.

changeset 657	00171b627b8d
parent 653	bc5571c38d1f
child 667	660cf698eb26