diff -r 6da4516ea87d -r 660cf698eb26 ChengsongTanPhdThesis/Chapters/Inj.tex --- a/ChengsongTanPhdThesis/Chapters/Inj.tex Mon Jul 24 11:09:48 2023 +0100 +++ b/ChengsongTanPhdThesis/Chapters/Inj.tex Tue Jul 25 17:28:29 2023 +0100 @@ -1332,7 +1332,7 @@ %TODO: FILL in the other defs \begin{center} \begin{tabular}{lcl} -$L \; \ZERO$ & $\dn$ & $\phi$\\ +$L \; \ZERO$ & $\dn$ & $\varnothing$\\ $L \; \ONE$ & $\dn$ & $\{[]\}$\\ $L \; c$ & $\dn$ & $\{[c]\}$\\ $L \; (r_1 + r_2)$ & $\dn$ & $ L \; r_1 \cup L \; r_2$\\ @@ -2072,15 +2072,7 @@ injected back to that sub-value; $\inj$ assembles all parts to form a new value. -\subsection{An Example of how Injection Works} -We will provide a few examples on why -\begin{center} - \begin{tabular}{lcl} - $\inj \; (a\cdot b)\cdot c \; \Seq \; \ONE \; b$ & $=$ & $(a+e)$\\ - $A$ & $B$ & $C$ - \end{tabular} -\end{center} For instance, the last clause is an injection into a sequence value $v_{i+1}$ whose second child @@ -2111,6 +2103,101 @@ wrapped under the $\Stars$ constructor, giving us $\Stars \; ((\inj \; r \; c \; v) ::vs)$. +Putting all together, Sulzmann and Lu obtained the following algorithm +outlined in the injection-based lexing diagram \ref{graph:inj}: +\begin{center} +\begin{tabular}{lcl} + $\lexer \; r \; [] $ & $=$ & $\textit{if} \; (\nullable \; r)\; \textit{then}\; \Some(\mkeps \; r) \; \textit{else} \; \None$\\ + $\lexer \; r \;c::s$ & $=$ & $\textit{case}\; (\lexer \; (r\backslash c) \; s) \;\textit{of}\; $\\ + & & $\quad \phantom{\mid}\; \None \implies \None$\\ + & & $\quad \mid \Some(v) \implies \Some(\inj \; r\; c\; v)$ +\end{tabular} +\end{center} +\noindent + +\subsection{Examples on How Injection and Lexer Works} +We will provide a few examples on how $\inj$ and $\lexer$ +works by showing their values in each recursive call on some +concrete examples. +We start with the call $\lexer \; (a+aa)^*\cdot c\; aac$ +on the lexer, note that the value's character constructor +$\Char \;c$ is abbreviated as $c$ for readability. +Similarly the last derivative's sub-expression +is abbreviated as $r_{=0}$\footnote{which is equal to +$((\ZERO + (\ZERO a+ \ZERO))\cdot(a+aa)^* + (\ZERO + \ZERO a)\cdot (a+aa)^*)+ +((\ZERO+\ZERO a)\cdot(a+aa)^* + (\ZERO+\ZERO a)\cdot(a+aa)^*), +$.} +whose language interpretation is equivalent to that of +$\ZERO$ and therefore not crucial to be displayed fully expanded, +as they will not be injected into.%our example run. +\begin{center} + \begin{tabular}{lcl} + $(a+\textcolor{magenta}{a}\textcolor{blue}{a})^* \cdot \textcolor{red}{c}$ & $\stackrel{\backslash \textcolor{magenta}{a}}{\rightarrow}$ & $((\ONE + \textcolor{magenta}{\ONE} \textcolor{blue}{a})\cdot (a+aa)^*)\cdot \textcolor{red}{c}+\ZERO$\\ + %& $\stackrel{\rightarrow}{\backslash a}$ & $((\ONE + \ONE a)\cdot (a+aa)^*)\cdot c + \ZERO$\\ + & $\stackrel{\backslash \textcolor{blue}{a}}{\rightarrow}$ & + $(((\ZERO+(\ZERO a+ \textcolor{blue}{\ONE}))\cdot (a+aa)^* + (\ONE+\ONE a)\cdot (a+aa)^* )\cdot \textcolor{red}{\mathbf{c}} + \ZERO)+\ZERO$\\ + & $\stackrel{\backslash \textcolor{red}{c}}{\rightarrow}$ & + $((r_{=0}\cdot c + \textcolor{red}{\ONE})+\ZERO)+\ZERO$\\ + & $\stackrel{\mkeps}{\rightarrow}$ & $\Left (\Left \; (\Right \; \textcolor{red}{\Empty}))$ \\ + & $\stackrel{\inj \;\textcolor{red}{c} }{\rightarrow}$ & + $\Left \; (\Left \; (\Seq \;(\Left \; (\Seq \; (\Right \; (\Right\; \textcolor{blue}{\Empty})) \; \Stars \, [])) \; \textcolor{red}{c}))$\\ + & $\stackrel{\inj \;\textcolor{blue}{a}}{\rightarrow}$ & + $\Left\; (\Seq \; (\Seq\; (\Right \; (\Seq \; \textcolor{magenta}{\Empty }\; \textcolor{blue}{ \mathbf{a}}) ) + \;\Stars \,[]) \; \textcolor{red}{c})$\\ + & $\stackrel{\inj \;\textcolor{magenta}{a}}{\rightarrow}$ & $\Seq \; (\Stars \; [\Right \; (\Seq \; \mathbf{\textcolor{magenta}{a}} \; \textcolor{blue}{a})]) \; \textcolor{red}{ c}$ + + %$\inj \; r \; c\A$ + %$\inj \; (a\cdot b)\cdot c \; \Seq \; \ONE \; b$ & $=$ & $(a+e)$\\ + \end{tabular} +\end{center} +\noindent +We have assigned different colours for each character, +as well as their corresponding locations in values and regular expressions. +The most recently injected character is marked with a bold font. +%Their corresponding derivative steps have been marked with the same +%colour. We also mark the specific location where the coloured character +%was injected into with the same colour, before and after the injection occurred. +%TODO: can be also used to motivate injection->blexer in Bitcoded1 +To show the details of how $\inj$ works, +we zoom in to the second injection above, +illustrating the recursive calls involved: +\begin{center} + \begin{tabular}{lcl} +$\inj \;\quad ((\ONE + {\ONE} + \textcolor{blue}{a})\cdot (a+aa)^*)\cdot + c + \ZERO \; \quad \textcolor{blue}{a} \; $ & &\\ +$\quad \Left \; +(\Left \; (\Seq \;(\Left \; (\Seq \; (\Right \; (\Right\; + \textcolor{blue}{\Empty})) \; \Stars \, [])) \; c))$ & \\$=$\\ + $\Left \; (\inj \; ((\ONE + \ONE + \textcolor{blue}{a})\cdot (a+aa)^*)\cdot + c \quad \textcolor{blue}{a} \quad (\Left \; (\Seq\; ( \Left \; (\Seq \; (\Right \; (\Right\; + \textcolor{blue}{\Empty})) \; \Stars \, []) ) \; c ) )\;\;)$ &\\ $=$\\ + $\Left \; (\Seq \; (\inj \; (\ONE + \ONE \textcolor{blue}{a})\cdot(a+aa)^* \quad \textcolor{blue}{a} \quad + (\Left \; (\Seq \; (\Right \; (\Right\;\textcolor{blue}{\Empty})))) ) \; c ) $ & \\$=$\\ + $\Left \; (\Seq \; (\Seq \; (\inj \quad (\ONE + \ONE \textcolor{blue}{a}) \quad \textcolor{blue}{a}\quad \Right \;(\Right \; \textcolor{blue}{\Empty}) ) \; \Stars \,[])\; c)$ &\\ $=$ \\ + $\Left \; (\Seq \; (\Seq \; (\Right \; (\inj \; \ONE \textcolor{blue}{a} \quad \textcolor{blue}{a}\quad (\Right \;\textcolor{blue}{\Empty})))) \; \Stars \, [])$ + & \\ $=$\\ + $\Left \; (\Seq \; (\Seq \; (\Right \; (\Seq \; (\mkeps \; \ONE)\;(\inj \;\textcolor{blue}{a} \; \textcolor{blue}{a} \; \textcolor{blue}{\Empty})) ) ) \; \Stars \, [] )$ + & \\ $=$\\ + $\Left \; (\Seq \; (\Seq \; (\Right \; (\Seq \; \Empty \; \textcolor{blue}{a}) ) ) \; \Stars \, [] )$ + \end{tabular} +\end{center} +\noindent +We will now introduce the proofs related to $\inj$ and +$\lexer$. These proofs have originally been found by Ausaf et al. +in their 2016 work \cite{AusafDyckhoffUrban2016}. +Proofs to some of these lemmas have also been discussed in more detail in +Ausaf's thesis \cite{Ausaf}. +Nevertheless, we still introduce these proofs, to make this thesis +self-contained as we show inductive variants used in these proofs break +after more simplifications are introduced. +%Also note that lemmas like \ref{}described in this thesis is a more faithful +%representation of the actual accompanying Isabelle formalisation, and +%therefore + + + Recall that lemma \ref{mePosix} tells us that $\mkeps$ always generates the POSIX value. @@ -2212,17 +2299,14 @@ The star case can be proven similarly. \end{proof} \noindent -Putting all together, Sulzmann and Lu obtained the following algorithm -outlined in the diagram \ref{graph:inj}: -\begin{center} -\begin{tabular}{lcl} - $\lexer \; r \; [] $ & $=$ & $\textit{if} \; (\nullable \; r)\; \textit{then}\; \Some(\mkeps \; r) \; \textit{else} \; \None$\\ - $\lexer \; r \;c::s$ & $=$ & $\textit{case}\; (\lexer \; (r\backslash c) \; s) \;\textit{of}\; $\\ - & & $\quad \phantom{\mid}\; \None \implies \None$\\ - & & $\quad \mid \Some(v) \implies \Some(\inj \; r\; c\; v)$ -\end{tabular} -\end{center} -\noindent + + + + + + + +%TODO: Cut from previous lexer def, need to make coherent The central property of the $\lexer$ is that it gives the correct result according to POSIX rules.