lexing: comparison ChengsongTanPhdThesis/Chapters/Cubic.tex

equal deleted inserted replaced

-:bc1df466150a
+:8c234a1bc7e0
 $(r_1 + r_2) \cdot r_3 \longrightarrow (r_1 \cdot r_3) + (r_2 \cdot r_3)$,
-\section{The NTIMES Constructor, and the Size Bound and Correctness Proof for it}
-The NTIMES construct has the following closed form:
-\begin{verbatim}
-"rders_simp (RNTIMES r0 (Suc n)) (c#s) =
-rsimp ( RALTS ( (map (optermsimp r0 ) (nupdates s r0 [Some ([c], n)]) ) ))"
-\end{verbatim}
-%----------------------------------------------------------------------------------------
-%	SECTION 1
-%----------------------------------------------------------------------------------------
-\section{Adding Support for the Negation Construct, and its Correctness Proof}
-We now add support for the negation regular expression:
-\[			r ::=   \ZERO \mid  \ONE
-			 \mid  c
-			 \mid  r_1 \cdot r_2
-			 \mid  r_1 + r_2
-			 \mid r^*
-			 \mid \sim r
-\]
-The $\textit{nullable}$ function's clause for it would be
-\[
-\textit{nullable}(~r) = \neg \nullable(r)
-\]
-The derivative would be
-\[
-~r \backslash c = ~ (r \backslash c)
-\]
-The most tricky part of lexing for the $~r$ regular expression
-is creating a value for it.
-For other regular expressions, the value aligns with the
-structure of the regular expression:
-\[
-\vdash \Seq(\Char(a), \Char(b)) : a \cdot b
-\]
-But for the $~r$ regular expression, $s$ is a member of it if and only if
-$s$ does not belong to $L(r)$.
-That means when there
-is a match for the not regular expression, it is not possible to generate how the string $s$ matched
-with $r$.
-What we can do is preserve the information of how $s$ was not matched by $r$,
-and there are a number of options to do this.
-We could give a partial value when there is a partial match for the regular expression inside
-the $\mathbf{not}$ construct.
-For example, the string $ab$ is not in the language of $(a\cdot b) \cdot c$,
-A value for it could be
-\[
-\vdash \textit{Not}(\Seq(\Char(a), \Char(b))) : ~((a \cdot b ) \cdot c)
-\]
-The above example demonstrates what value to construct
-when the string $s$ is at most a real prefix
-of the strings in $L(r)$. When $s$ instead is not a prefix of any strings
-in $L(r)$, it becomes unclear what to return as a value inside the $\textit{Not}$
-constructor.
-Another option would be to either store the string $s$ that resulted in
-a mis-match for $r$ or a dummy value as a placeholder:
-\[
-\vdash \textit{Not}(abcd) : ~( r_1 )
-\]
-or
-\[
-\vdash \textit{Not}(\textit{Dummy}) : ~( r_1 )
-\]
-We choose to implement this as it is most straightforward:
-\[
-\mkeps(~(r))  = \textit{if}(\nullable(r)) \; \textit{Error} \; \textit{else} \; \textit{Not}(\textit{Dummy})
-\]
 %----------------------------------------------------------------------------------------
 %	SECTION 2
 %----------------------------------------------------------------------------------------
 \section{Bounded Repetitions}
-We define for the $r^{n}$ constructor something similar to $\starupdate$
+We have promised in chapter \ref{Introduction}
+that our lexing algorithm can potentially be extended
+to handle bounded repetitions
+in natural and elegant ways.
+Now we fulfill our promise by adding support for
+the ``exactly-$n$-times'' bounded regular expression $r^{\{n\}}$.
+We add clauses in our derivatives-based lexing algorithms (with simplifications)
+introduced in chapter \ref{Bitcoded2}.
+\subsection{Augmented Definitions}
+There are a number of definitions that needs to be augmented.
+The most notable one would be the POSIX rules for $r^{\{n\}}$:
+\begin{mathpar}
+	\inferrule{v \in vs}{\textit{Stars} place holder}
+\end{mathpar}
+\subsection{Proofs for the Augmented Lexing Algorithm}
+We need to maintain two proofs with the additional $r^{\{n\}}$
+construct: the
+correctness proof in chapter \ref{Bitcoded2},
+and the finiteness proof in chapter \ref{Finite}.
+\subsubsection{Correctness Proof Augmentation}
+The correctness of $\textit{lexer}$ and $\textit{blexer}$ with bounded repetitions
+have been proven by Ausaf and Urban\cite{AusafDyckhoffUrban2016}.
+As they have commented, once the definitions are in place,
+the proofs given for the basic regular expressions will extend to
+bounded regular expressions, and there are no ``surprises''.
+We confirm this point because the correctness theorem would also
+extend without surprise to $\blexersimp$.
+The rewrite rules such as $\rightsquigarrow$, $\stackrel{s}{\rightsquigarrow}$ and so on
+do not need to be changed,
+and only a few lemmas such as lemma \ref{fltsPreserves} need to be adjusted to
+add one more line which can be solved by sledgehammer
+to solve the $r^{\{n\}}$ inductive case.
+\subsubsection{Finiteness Proof Augmentation}
+The bounded repetitions can be handled
+using techniques similar to stars.
+The closed form for them looks like:
+%\begin{center}
+%	\begin{tabular}{llrclll}
+%		$r^{\{n+1\}}$ & $ \backslash_{rsimps}$ & $(c::s)$ & $=$ & & \\
+%		$\textit{rsimp}$ & $($ & $
+%		\sum \; ( $ & $\map$ & $(\textit{optermsimp}\;r)$ & $($\\
+%			 & & & & $\textit{nupdates} \;$ &
+%			 $ s \; r_0 \; [ \textit{Some} \; ([c], n)]$\\
+%			 & & & & $)$ &\\
+%			 & &  $)$ & & &\\
+%			 & $)$ & & & &\\
+%	\end{tabular}
+%\end{center}
+\begin{center}
+	\begin{tabular}{llrcllrllll}
+		$r^{\{n+1\}}$ & $ \backslash_{rsimps}$ & $(c::s)$ & $=$ & & &&&&\\
+			      &&&&$\textit{rsimp}$ & $($ & $
+			      \sum \; ( $ & $\map$ & $(\textit{optermsimp}\;r)$ & $($\\
+					  &&&& & & & & $\;\; \textit{nupdates} \;$ &
+			 		  $ s \; r_0 \; [ \textit{Some} \; ([c], n)]$\\
+					  &&&& & & & & $)$ &\\
+					  &&&& & &  $)$ & & &\\
+					  &&&& & $)$ & & & &\\
+	\end{tabular}
+\end{center}
+We define for the $r^{\{n\}}$ constructor something similar to $\starupdate$
 and $\starupdates$:
 \begin{center}
 	\begin{tabular}{lcl}
 		$\starupdate \; c \; r \; [] $ & $\dn$ & $[]$\\
 		$\starupdate \; c \; r \; (s :: Ss)$ & $\dn$ & \\
 		\starupdate \; c \; r \; Ss)$
 	\end{tabular}
 \end{center}
 \noindent
+%----------------------------------------------------------------------------------------
+%	SECTION 1
+%----------------------------------------------------------------------------------------
+%\section{Adding Support for the Negation Construct, and its Correctness Proof}
+%We now add support for the negation regular expression:
+%\[			r ::=   \ZERO \mid  \ONE
+%			 \mid  c
+%			 \mid  r_1 \cdot r_2
+%			 \mid  r_1 + r_2
+%			 \mid r^*
+%			 \mid \sim r
+%\]
+%The $\textit{nullable}$ function's clause for it would be
+%\[
+%\textit{nullable}(~r) = \neg \nullable(r)
+%\]
+%The derivative would be
+%\[
+%~r \backslash c = ~ (r \backslash c)
+%\]
+%
+%The most tricky part of lexing for the $~r$ regular expression
+% is creating a value for it.
+% For other regular expressions, the value aligns with the
+% structure of the regular expression:
+% \[
+% \vdash \Seq(\Char(a), \Char(b)) : a \cdot b
+% \]
+%But for the $~r$ regular expression, $s$ is a member of it if and only if
+%$s$ does not belong to $L(r)$.
+%That means when there
+%is a match for the not regular expression, it is not possible to generate how the string $s$ matched
+%with $r$.
+%What we can do is preserve the information of how $s$ was not matched by $r$,
+%and there are a number of options to do this.
+%
+%We could give a partial value when there is a partial match for the regular expression inside
+%the $\mathbf{not}$ construct.
+%For example, the string $ab$ is not in the language of $(a\cdot b) \cdot c$,
+%A value for it could be
+% \[
+% \vdash \textit{Not}(\Seq(\Char(a), \Char(b))) : ~((a \cdot b ) \cdot c)
+% \]
+% The above example demonstrates what value to construct
+% when the string $s$ is at most a real prefix
+% of the strings in $L(r)$. When $s$ instead is not a prefix of any strings
+% in $L(r)$, it becomes unclear what to return as a value inside the $\textit{Not}$
+% constructor.
+%
+% Another option would be to either store the string $s$ that resulted in
+% a mis-match for $r$ or a dummy value as a placeholder:
+% \[
+% \vdash \textit{Not}(abcd) : ~( r_1 )
+% \]
+%or
+% \[
+% \vdash \textit{Not}(\textit{Dummy}) : ~( r_1 )
+% \]
+% We choose to implement this as it is most straightforward:
+% \[
+% \mkeps(~(r))  = \textit{if}(\nullable(r)) \; \textit{Error} \; \textit{else} \; \textit{Not}(\textit{Dummy})
+% \]
+%
+%

changeset 612	8c234a1bc7e0
parent 596	b306628a0eab
child 613	b0f0d884a547