lexing: comparison ChengsongTanPhdThesis/Chapters/Inj.tex

equal deleted inserted replaced

-:6da4516ea87d
+:660cf698eb26
 a regular expression is defined recursively as
 a set of strings:
 %TODO: FILL in the other defs
 \begin{center}
 \begin{tabular}{lcl}
-$L \; \ZERO$ & $\dn$ & $\phi$\\
+$L \; \ZERO$ & $\dn$ & $\varnothing$\\
 $L \; \ONE$ & $\dn$ & $\{[]\}$\\
 $L \; c$ & $\dn$ & $\{[c]\}$\\
 $L \; (r_1 + r_2)$ & $\dn$ & $ L \; r_1 \cup L \; r_2$\\
 $L \; (r_1 \cdot r_2)$ & $\dn$ & $ L \; r_1 @ L \; r_2$\\
 $L \; (r^*)$ & $\dn$ & $ (L\;r)*$
 to inject the character back into.
 Once the character is
 injected back to that sub-value;
 $\inj$ assembles all parts
 to form a new value.
-\subsection{An Example of how Injection Works}
-We will provide a few examples on why
-\begin{center}
-	\begin{tabular}{lcl}
-		$\inj \; (a\cdot b)\cdot c \; \Seq \; \ONE \; b$ & $=$ & $(a+e)$\\
-		$A$ & $B$ & $C$
-	\end{tabular}
-\end{center}
 For instance, the last clause is an
 injection into a sequence value $v_{i+1}$
 whose second child
 value is a star and the shape of the
 regular expression $r_i$ before injection
 Finally,
 $\inj \; r \;c \; v$ is prepended
 to the previous list of iterations and then
 wrapped under the $\Stars$
 constructor, giving us $\Stars \; ((\inj \; r \; c \; v) ::vs)$.
+Putting all together, Sulzmann and Lu obtained the following algorithm
+outlined in the injection-based lexing diagram \ref{graph:inj}:
+\begin{center}
+\begin{tabular}{lcl}
+	$\lexer \; r \; [] $ & $=$ & $\textit{if} \; (\nullable \; r)\; \textit{then}\;  \Some(\mkeps \; r) \; \textit{else} \; \None$\\
+	$\lexer \; r \;c::s$ & $=$ & $\textit{case}\; (\lexer \; (r\backslash c) \; s) \;\textit{of}\; $\\
+	& & $\quad \phantom{\mid}\; \None \implies \None$\\
+	& & $\quad \mid           \Some(v) \implies \Some(\inj \; r\; c\; v)$
+\end{tabular}
+\end{center}
+\noindent
+\subsection{Examples on How Injection and Lexer Works}
+We will provide a few examples on how $\inj$ and $\lexer$
+works by showing their values in each recursive call on some
+concrete examples.
+We start with the call $\lexer \; (a+aa)^*\cdot c\; aac$
+on the lexer, note that the value's character constructor
+$\Char \;c$ is abbreviated as $c$ for readability.
+Similarly the last derivative's sub-expression
+is abbreviated as $r_{=0}$\footnote{which is equal to
+$((\ZERO + (\ZERO a+ \ZERO))\cdot(a+aa)^* + (\ZERO + \ZERO a)\cdot (a+aa)^*)+
+((\ZERO+\ZERO a)\cdot(a+aa)^* + (\ZERO+\ZERO a)\cdot(a+aa)^*),
+$.}
+whose language interpretation is equivalent to that of
+$\ZERO$ and therefore not crucial to be displayed fully expanded,
+as they will not be injected into.%our example run.
+\begin{center}
+	\begin{tabular}{lcl}
+		$(a+\textcolor{magenta}{a}\textcolor{blue}{a})^* \cdot \textcolor{red}{c}$ & $\stackrel{\backslash \textcolor{magenta}{a}}{\rightarrow}$ & $((\ONE + \textcolor{magenta}{\ONE} \textcolor{blue}{a})\cdot (a+aa)^*)\cdot \textcolor{red}{c}+\ZERO$\\
+				   %& $\stackrel{\rightarrow}{\backslash a}$ & $((\ONE + \ONE a)\cdot (a+aa)^*)\cdot c + \ZERO$\\
+				    & $\stackrel{\backslash \textcolor{blue}{a}}{\rightarrow}$  &
+				    $(((\ZERO+(\ZERO a+ \textcolor{blue}{\ONE}))\cdot (a+aa)^* + (\ONE+\ONE a)\cdot (a+aa)^* )\cdot \textcolor{red}{\mathbf{c}} + \ZERO)+\ZERO$\\
+				    & $\stackrel{\backslash \textcolor{red}{c}}{\rightarrow}$  &
+				    $((r_{=0}\cdot c + \textcolor{red}{\ONE})+\ZERO)+\ZERO$\\
+				    & $\stackrel{\mkeps}{\rightarrow}$ & $\Left (\Left \; (\Right \; \textcolor{red}{\Empty}))$ \\
+				   & $\stackrel{\inj \;\textcolor{red}{c} }{\rightarrow}$ &
+				   $\Left \; (\Left \; (\Seq \;(\Left \; (\Seq \; (\Right \; (\Right\; \textcolor{blue}{\Empty})) \; \Stars \, [])) \; \textcolor{red}{c}))$\\
+				   & $\stackrel{\inj \;\textcolor{blue}{a}}{\rightarrow}$ &
+				   $\Left\; (\Seq \; (\Seq\; (\Right \; (\Seq \; \textcolor{magenta}{\Empty }\; \textcolor{blue}{ \mathbf{a}}) )
+				   \;\Stars \,[]) \; \textcolor{red}{c})$\\
+				   & $\stackrel{\inj \;\textcolor{magenta}{a}}{\rightarrow}$ & $\Seq \; (\Stars \; [\Right \; (\Seq \; \mathbf{\textcolor{magenta}{a}} \; \textcolor{blue}{a})]) \; \textcolor{red}{ c}$
+				   %$\inj \; r \; c\A$
+		%$\inj \; (a\cdot b)\cdot c \; \Seq \; \ONE \; b$ & $=$ & $(a+e)$\\
+	\end{tabular}
+\end{center}
+\noindent
+We have assigned different colours for each character,
+as well as their corresponding locations in values and regular expressions.
+The most recently injected character is marked with a bold font.
+%Their corresponding derivative steps have been marked with the same
+%colour. We also mark the specific location where the coloured character
+%was injected into with the same colour, before and after the injection occurred.
+%TODO: can be also used to motivate injection->blexer in Bitcoded1
+To show the details of how $\inj$ works,
+we zoom in to the second injection above,
+illustrating the recursive calls involved:
+\begin{center}
+	\begin{tabular}{lcl}
+$\inj \;\quad ((\ONE + {\ONE}
+	\textcolor{blue}{a})\cdot (a+aa)^*)\cdot
+	c + \ZERO \; \quad \textcolor{blue}{a} \; $ & &\\
+$\quad \Left \;
+(\Left \; (\Seq \;(\Left \; (\Seq \; (\Right \; (\Right\;
+	\textcolor{blue}{\Empty})) \; \Stars \, [])) \; c))$ & \\$=$\\
+	$\Left \; (\inj \; ((\ONE + \ONE
+	\textcolor{blue}{a})\cdot (a+aa)^*)\cdot
+	c \quad  \textcolor{blue}{a} \quad (\Left \; (\Seq\; ( \Left \; (\Seq \; (\Right \; (\Right\;
+	\textcolor{blue}{\Empty})) \; \Stars \, []) ) \; c ) )\;\;)$ &\\ $=$\\
+	$\Left \; (\Seq \; (\inj \; (\ONE + \ONE \textcolor{blue}{a})\cdot(a+aa)^* \quad \textcolor{blue}{a} \quad
+	(\Left \; (\Seq \; (\Right \; (\Right\;\textcolor{blue}{\Empty})))) ) \; c ) $ & \\$=$\\
+	$\Left \; (\Seq \; (\Seq \; (\inj \quad (\ONE + \ONE \textcolor{blue}{a}) \quad \textcolor{blue}{a}\quad \Right \;(\Right \; \textcolor{blue}{\Empty}) ) \; \Stars \,[])\; c)$ &\\ $=$ \\
+	$\Left \; (\Seq \; (\Seq \; (\Right \; (\inj \; \ONE \textcolor{blue}{a} \quad \textcolor{blue}{a}\quad (\Right \;\textcolor{blue}{\Empty})))) \; \Stars \, [])$
+																		   & \\ $=$\\
+																		   $\Left \; (\Seq \; (\Seq \; (\Right \; (\Seq \; (\mkeps \; \ONE)\;(\inj \;\textcolor{blue}{a} \; \textcolor{blue}{a} \; \textcolor{blue}{\Empty})) ) ) \; \Stars \, [] )$
+																		   & \\ $=$\\
+																		   $\Left \; (\Seq \; (\Seq \; (\Right \; (\Seq \; \Empty \; \textcolor{blue}{a}) ) ) \; \Stars \, [] )$
+	\end{tabular}
+\end{center}
+\noindent
+We will now introduce the proofs related to $\inj$ and
+$\lexer$. These proofs have originally been found by Ausaf et al.
+in their 2016 work \cite{AusafDyckhoffUrban2016}.
+Proofs to some of these lemmas have also been discussed in more detail in
+Ausaf's thesis \cite{Ausaf}.
+Nevertheless, we still introduce these proofs, to make this thesis
+self-contained as we show inductive variants used in these proofs break
+after more simplifications are introduced.
+%Also note that lemmas like \ref{}described in this thesis is a more faithful
+%representation of the actual accompanying Isabelle formalisation, and
+%therefore
 Recall that lemma
 \ref{mePosix} tells us that
 $\mkeps$ always generates the POSIX value.
 The function $\inj$ preserves the POSIXness, provided
 			as the POSIX value for $a\cdot b$.
 	\end{itemize}
 	The star case can be proven similarly.
 \end{proof}
 \noindent
-Putting all together, Sulzmann and Lu obtained the following algorithm
-outlined in the diagram \ref{graph:inj}:
-\begin{center}
-\begin{tabular}{lcl}
-	$\lexer \; r \; [] $ & $=$ & $\textit{if} \; (\nullable \; r)\; \textit{then}\;  \Some(\mkeps \; r) \; \textit{else} \; \None$\\
-	$\lexer \; r \;c::s$ & $=$ & $\textit{case}\; (\lexer \; (r\backslash c) \; s) \;\textit{of}\; $\\
-	& & $\quad \phantom{\mid}\; \None \implies \None$\\
-	& & $\quad \mid           \Some(v) \implies \Some(\inj \; r\; c\; v)$
+%TODO: Cut from previous lexer def, need to make coherent
-\end{tabular}
-\end{center}
-\noindent
 The central property of the $\lexer$ is that it gives the correct result
 according to
 POSIX rules.
 \begin{theorem}\label{lexerCorrectness}
 	The $\lexer$ based on derivatives and injections is correct:

changeset 667	660cf698eb26
parent 666	6da4516ea87d
child 668	3831621d7b14