--- a/ChengsongTanPhdThesis/Chapters/Cubic.tex Thu Dec 01 08:49:19 2022 +0000
+++ b/ChengsongTanPhdThesis/Chapters/Cubic.tex Mon Dec 05 17:39:10 2022 +0000
@@ -1,6 +1,18 @@
% Chapter Template
-\chapter{A Better Bound and Other Extensions} % Main chapter title
+%We also present the idempotency property proof
+%of $\bsimp$, which leverages the idempotency proof of $\rsimp$.
+%This reinforces our claim that the fixpoint construction
+%originally required by Sulzmann and Lu can be removed in $\blexersimp$.
+%Last but not least, we present our efforts and challenges we met
+%in further improving the algorithm by data
+%structures such as zippers.
+%----------------------------------------------------------------------------------------
+% SECTION strongsimp
+%----------------------------------------------------------------------------------------
+%TODO: search for isabelle proofs of algorithms that check equivalence
+
+\chapter{A Better Size Bound for Derivatives} % Main chapter title
\label{Cubic} %In Chapter 5\ref{Chapter5} we discuss stronger simplifications to improve the finite bound
%in Chapter 4 to a polynomial one, and demonstrate how one can extend the
@@ -11,10 +23,17 @@
This chapter is a ``work-in-progress''
chapter which records
extensions to our $\blexersimp$.
-We intend to formalise this part, which
-we have not been able to finish due to time constraints of the PhD.
+We make a conjecture that the finite
+size bound from the previous chapter can be
+improved to a cubic bound.
+We have implemented the improved algorithm in Scala.
+We intend to formalise this part in Isabelle/HOL at a
+later stage.
+%we have not been able to finish due to time constraints of the PhD.
Nevertheless, we outline the ideas we intend to use for the proof.
+\section{A Stronger Version of Simplification}
+
We present further improvements
for our lexer algorithm $\blexersimp$.
We devise a stronger simplification algorithm,
@@ -24,41 +43,27 @@
even if these regular expressions are not exactly the same.
We call the lexer that uses this stronger simplification function
$\blexerStrong$.
-Unfortunately we did not have time to
-work out the proofs, like in
-the previous chapters.
+%Unfortunately we did not have time to
+%work out the proofs, like in
+%the previous chapters.
We conjecture that both
\begin{center}
$\blexerStrong \;r \; s = \blexer\; r\;s$
\end{center}
-and
+and
\begin{center}
$\llbracket \bdersStrong{a}{s} \rrbracket = O(\llbracket a \rrbracket^3)$
\end{center}
-hold, but a formalisation
-is still future work.
+hold.
+%but a formalisation
+%is still future work.
We give an informal justification
why the correctness and cubic size bound proofs
can be achieved
by exploring the connection between the internal
data structure of our $\blexerStrong$ and
-Animirov's partial derivatives.\\
-%We also present the idempotency property proof
-%of $\bsimp$, which leverages the idempotency proof of $\rsimp$.
-%This reinforces our claim that the fixpoint construction
-%originally required by Sulzmann and Lu can be removed in $\blexersimp$.
+Antimirov's partial derivatives.
-%Last but not least, we present our efforts and challenges we met
-%in further improving the algorithm by data
-%structures such as zippers.
-
-
-
-%----------------------------------------------------------------------------------------
-% SECTION strongsimp
-%----------------------------------------------------------------------------------------
-\section{A Stronger Version of Simplification}
-%TODO: search for isabelle proofs of algorithms that check equivalence
In our bitcoded lexing algorithm, (sub)terms represent (sub)matches.
For example, the regular expression
\[
@@ -76,32 +81,35 @@
$rs_a@[a_1]@rs_b@[a_2]@rs_c$
\end{center}
is that
+the two erased regular expressions are equal
\begin{center}
$\rerase{a_1} = \rerase{a_2}$.
\end{center}
-It is characterised as the $LD$
-rewrite rule in figure \ref{rrewriteRules}.\\
-The problem , however, is that identical components
-in two slightly different regular expressions cannot be removed.
-Consider the simplification
+This is characterised as the $LD$
+rewrite rule in figure \ref{rrewriteRules}.
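+In the implementation this de-duplication is performed by a
+$\textit{distinctBy}$-style function keyed on the erased regular
+expressions. A minimal sketch in Scala is given below (our own
+reconstruction for illustration; the definition used in the earlier
+chapters may differ in details):
+\begin{lstlisting}
+// keep only the first occurrence of each element, where two elements
+// count as "the same" if the projection f maps them to the same value
+def distinctBy[B, C](xs: List[B], f: B => C,
+                     acc: Set[C] = Set()) : List[B] = xs match {
+  case Nil => Nil
+  case x::rest =>
+    val res = f(x)
+    if (acc.contains(res)) distinctBy(rest, f, acc)
+    else x::distinctBy(rest, f, acc + res)
+}
+// used on a list of annotated regular expressions as distinctBy(rs, erase)
+\end{lstlisting}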
+The problem, however, is that identical components
+in two slightly different regular expressions cannot be removed
+by the $LD$ rule.
+Consider the stronger simplification
\begin{equation}
\label{eqn:partialDedup}
(a+b+d) \cdot r_1 + (a+c+e) \cdot r_1 \stackrel{?}{\rightsquigarrow} (a+b+d) \cdot r_1 + (c+e) \cdot r_1
\end{equation}
-where the $(a+\ldots)\cdot r_1$ is deleted in the right alternative.
+where the $a$ in the right alternative $(\underline{a}+c+e)\cdot r_1$
+is deleted.
This is permissible because we have $(a+\ldots)\cdot r_1$ in the left
alternative.
The difficulty is that such ``buried''
alternatives-sequences are not easily recognised.
But simplification like this actually
-cannot be omitted,
-as without it the size of derivatives can still
+cannot be omitted if we want to obtain a better bound.
+For example, the size of derivatives can still
blow up even with our $\textit{bsimp}$
function:
consider again the example
$\protect((a^* + (aa)^* + \ldots + (\underbrace{a\ldots a}_{n a's})^* )^*)^*$,
-and set $n$ to a relatively small number,
-we get exponential growth:
+and set $n$ to a relatively small number, say $n=5$; then we get the following
+exponential growth:
\begin{figure}[H]
\centering
\begin{tikzpicture}
@@ -114,7 +122,7 @@
\addplot[blue,mark=*, mark options={fill=white}] table {bsimpExponential.data};
\end{axis}
\end{tikzpicture}
-\caption{Size of derivatives of $\blexersimp$ for matching
+\caption{Size of derivatives of $\blexersimp$ from chapter 5 for matching
$\protect((a^* + (aa)^* + \ldots + (aaaaa)^* )^*)^*$
with strings
of the form $\protect\underbrace{aa..a}_{n}$.}\label{blexerExp}
@@ -126,10 +134,11 @@
(a+b+d) \cdot r_1 \longrightarrow a \cdot r_1 + b \cdot r_1 + d \cdot r_1
\]
\noindent
-in our $\simp$ function,
-so that it makes the simplification in \eqref{eqn:partialDedup} possible.
-Translating the rule into our $\textit{bsimp}$ function simply
-involves adding a new clause to the $\textit{bsimp}_{ASEQ}$ function:
+which pushes the sequence into the alternatives
+in our $\simp$ function.
+This would then make the simplification shown in \eqref{eqn:partialDedup} possible.
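+In plain Scala such a rule would correspond to a clause along the
+following lines (only a sketch, assuming an $\textit{ALTS}$ constructor
+that takes a list of children, analogous to $\textit{AALTS}$ in the
+annotated case):
+\begin{lstlisting}
+// distributes a sequence over an alternative on the left:
+// (a+b+d) r1  ~~>  a r1 + b r1 + d r1
+def distribute(r: Rexp) : Rexp = r match {
+  case SEQ(ALTS(rs), r2) => ALTS(rs.map(r1 => SEQ(r1, r2)))
+  case r => r
+}
+\end{lstlisting}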
+Translating this rule into our $\textit{bsimp}$ function would simply
+involve adding a new clause to the $\textit{bsimp}_{ASEQ}$ function:
\begin{center}
\begin{tabular}{@{}lcl@{}}
$\textit{bsimp}_{ASEQ} \; bs\; a \; b$ & $\dn$ & $ (a,\; b) \textit{match}$\\
@@ -141,7 +150,7 @@
\end{center}
\noindent
Unfortunately,
-if we introduce them in our
+if we introduce this clause in our
setting we would lose the POSIX property of our calculated values.
For example given the regular expression
\begin{center}
@@ -156,47 +165,53 @@
Essentially it matches the string with the longer Right-alternative
in the first sequence (and
then the 'rest' with the character regular expression $c$ from the second sequence).
-If we add the simplification above, then we obtain the following value
+If we add the simplification above, however,
+then we would obtain the following value
\begin{center}
$\Left \; (\Seq \; a \; (\Left \; bc))$
\end{center}
where the $\Left$-alternatives get priority.
-However this violates the POSIX rules.
+This violates the POSIX rules.
The reason for getting this undesired value
is that the new rule splits this regular expression up into
+a topmost alternative
\begin{center}
$a\cdot(b c + c) + ab \cdot (bc + c)$,
\end{center}
-which becomes a regular expression with a
-quite different structure--the original
-was a sequence, and now it becomes an alternative.
-With an alternative the maximal munch rule no longer works.\\
-A method to reconcile this is to do the
+which is a regular expression with a
+quite different structure: the original regular expression
+is a sequence, but the simplified regular expression is an alternative.
+With an alternative the maximal munch rule no longer works.
+
+
+A method to remedy this problem is to do the
transformation in \eqref{eqn:partialDedup} ``non-invasively'',
meaning that we traverse the list of regular expressions
-\begin{center}
- $rs_a@[a]@rs_c$
-\end{center}
-in the alternative
+%\begin{center}
+% $rs_a@[a]@rs_c$
+%\end{center}
+inside alternatives
\begin{center}
$\sum ( rs_a@[a]@rs_c)$
\end{center}
using a function similar to $\distinctBy$,
but this time
-we allow a more general list rewrite:
-\begin{figure}[H]
-\begin{mathpar}
+we allow the following more general rewrite rule:
+\begin{equation}
+\label{eqn:cubicRule}
+%\mbox{
+%\begin{mathpar}
\inferrule * [Right = cubicRule]{\vspace{0mm} }{rs_a@[a]@rs_c
\stackrel{s}{\rightsquigarrow }
rs_a@[\textit{prune} \; a \; rs_a]@rs_c }
-\end{mathpar}
-\caption{The rule capturing the pruning simplification needed to achieve
-a cubic bound}
-\label{fig:cubicRule}
-\end{figure}
+%\end{mathpar}
+%\caption{The rule capturing the pruning simplification needed to achieve
+%a cubic bound}
+\end{equation}
+\noindent
%L \; a_1' = L \; a_1 \setminus (\cup_{a \in rs_a} L \; a)
where $\textit{prune} \;a \; acc$ traverses $a$
-without altering the structure of $a$, removing components in $a$
+without altering the structure of $a$, while removing components of $a$
that have appeared in the accumulator $acc$.
For example
\begin{center}
@@ -208,14 +223,20 @@
\end{center}
because $r_gr_d$ and
$r_hr_d$ are the only terms
-that have not appeared in the accumulator list
+that do not appear in the accumulator list
\begin{center}
$[(r_a+r_b+r_c)r_d, (r_e+r_f)r_d]$.
\end{center}
-We implemented
-function $\textit{prune}$ in Scala:
-\begin{figure}[H]
-\begin{lstlisting}
+We implemented the
+function $\textit{prune}$ in Scala (see figure \ref{fig:pruneFunc}).
+The function $\textit{prune}$
+is a stronger version of $\textit{distinctBy}$:
+it does not just walk through a list looking for exact duplicates,
+but prunes sub-expressions recursively.
+It maintains the proper context with the helper functions
+$\textit{removeSeqTail}$, $\textit{isOne}$ and $\textit{atMostEmpty}$.
+\begin{figure}%[H]
+\begin{lstlisting}[numbers=left]
def prune(r: ARexp, acc: Set[Rexp]) : ARexp = r match{
case AALTS(bs, rs) => rs.map(r => prune(r, acc)).filter(_ != AZERO) match
{
@@ -239,17 +260,16 @@
case r => if(acc(erase(r))) AZERO else r
}
\end{lstlisting}
-\caption{The function $\textit{prune}$ }
+\caption{The function $\textit{prune}$ is called recursively in the
+alternative case (line 2) and in the sequence case (line 12).
+In the alternative case the accumulator is passed on unchanged, but
+in the sequence case we make the necessary changes to the accumulator
+to allow correct de-duplication.}\label{fig:pruneFunc}
\end{figure}
\noindent
-The function $\textit{prune}$
-is a stronger version of $\textit{distinctBy}$.
-It does not just walk through a list looking for exact duplicates,
-but prunes sub-expressions recursively.
-It manages proper contexts by the helper functions
-$\textit{removeSeqTail}$, $\textit{isOne}$ and $\textit{atMostEmpty}$.
-\begin{figure}[H]
-\begin{lstlisting}
+
+\begin{figure}
+\begin{lstlisting}[numbers=left]
def atMostEmpty(r: Rexp) : Boolean = r match {
case ZERO => true
case ONE => true
@@ -274,17 +294,14 @@
else {
r match {
case SEQ(r1, r2) =>
- if(r2 == tail)
- r1
- else
- ZERO
+ if(r2 == tail) r1 else ZERO
case r => ZERO
}
}
\end{lstlisting}
-\caption{The helper functions of $\textit{prune}$}
+\caption{The helper functions of $\textit{prune}$.}
\end{figure}
\noindent
Suppose we feed
@@ -295,7 +312,7 @@
\begin{center}
$acc = \{a\cdot(d\cdot e),f\cdot (g \cdot (a \cdot (d \cdot e))) \}$
\end{center}
-as the input for $\textit{prune}$.
+as the input to $\textit{prune}$.
The end result will be
\[
b\cdot(g\cdot(a\cdot(d\cdot e)))
@@ -322,17 +339,17 @@
\textit{removeSeqTail} \quad \;\;
f\cdot(g\cdot (a \cdot(d\cdot e)))\quad \;\; a \cdot(d\cdot e).
\]
+
The idea behind $\textit{removeSeqTail}$ is that
when pruning recursively, we need to ``zoom in''
to sub-expressions, and this ``zoom in'' needs to be performed
on the
-accumulators as well, otherwise we will be comparing
-apples with oranges.
+accumulators as well; otherwise the deletion will not work.
The sub-call
$\textit{prune} \;\; (\ONE+(f+b)\cdot g)\;\; \{\ONE, f\cdot g \}$
is simpler, which will trigger the alternative clause, causing
a pruning on each element in $(\ONE+(f+b)\cdot g)$,
-leaving us $b\cdot g$ only.
+leaving us with $b\cdot g$ only.
Our new lexer with stronger simplification
uses $\textit{prune}$ by making it
@@ -342,7 +359,7 @@
parts of a regular expression are pruned away.
\begin{figure}[H]
-\begin{lstlisting}
+\begin{lstlisting}[numbers=left]
def turnIntoTerms(r: Rexp): List[Rexp] = r match {
case SEQ(r1, r2) =>
turnIntoTerms(r1).flatMap(r11 => furtherSEQ(r11, r2))
@@ -370,7 +387,7 @@
}
\end{lstlisting}
-\caption{A Stronger Version of $\textit{distinctBy}$}
+\caption{A Stronger Version of $\textit{distinctBy}$.}
\end{figure}
\noindent
Once a regular expression has been pruned,
@@ -506,11 +523,16 @@
\end{figure}
\noindent
We call this lexer $\blexerStrong$.
-This version is able to drastically reduce the
-internal data structure size which
-otherwise could
-trigger exponential behaviours in
+This version is able to reduce the
+size of the derivatives which
+otherwise would trigger exponential behaviour in
$\blexersimp$.
+Consider again the runtime for matching
+$\protect((a^* + (aa)^* + \ldots + (aaaaa)^* )^*)^*$ with strings
+of the form $\protect\underbrace{aa..a}_{n}$.
+The results are shown in the following graphs ($\blexerStrong$ on the left-hand side and
+$\blexersimp$ on the right-hand side).
\begin{figure}[H]
\centering
\begin{tabular}{@{}c@{\hspace{0mm}}c@{\hspace{0mm}}c@{}}
@@ -541,13 +563,14 @@
\end{tikzpicture}\\
\multicolumn{2}{l}{}
\end{tabular}
-\caption{Runtime for matching
- $\protect((a^* + (aa)^* + \ldots + (aaaaa)^* )^*)^*$ with strings
- of the form $\protect\underbrace{aa..a}_{n}$.}\label{fig:aaaaaStarStar}
+\caption{Runtime of $\blexerStrong$ (left) and $\blexersimp$ (right) for matching
+$\protect((a^* + (aa)^* + \ldots + (aaaaa)^* )^*)^*$ with strings
+of the form $\protect\underbrace{aa..a}_{n}$.}\label{fig:aaaaaStarStar}
\end{figure}
\noindent
-We would like to preserve the correctness like the one
-we had for $\blexersimp$:
+We expect that the correctness we had for $\blexersimp$ is preserved for $\blexerStrong$.
+%We would like to preserve the correctness like the one
+%we had for $\blexersimp$:
+The proof idea is to extend the key lemmas of chapter \ref{Bitcoded2}
+with the new rewrite rule \eqref{eqn:cubicRule}.
\begin{conjecture}\label{cubicConjecture}
$\blexerStrong \;r \; s = \blexer\; r\;s$
\end{conjecture}
@@ -556,7 +579,7 @@
chapter \ref{Bitcoded2} like
$r \stackrel{*}{\rightsquigarrow} \textit{bsimp} \; r$
with the new rewriting rule
-shown in figure \ref{fig:cubicRule} .
+shown in \eqref{eqn:cubicRule}.
In the next sub-section,
we will describe why we
@@ -571,18 +594,18 @@
\subsection{Antimirov's partial derivatives}
Partial derivatives were first introduced by
Antimirov \cite{Antimirov95}.
-Partial derivatives are very similar
+They are very similar
to Brzozowski derivatives,
-but splits children of alternative regular expressions into
+but split children of alternative regular expressions into
multiple independent terms. This means the output of
-partial derivatives become a
-set of regular expressions:
+partial derivatives is a
+set of regular expressions, defined as follows:
\begin{center}
- \begin{tabular}{lcl}
+ \begin{tabular}{lcl@{\hspace{-5mm}}l}
$\partial_x \; (r_1 \cdot r_2)$ &
$\dn$ & $(\partial_x \; r_1) \cdot r_2 \cup
- \partial_x \; r_2 \; \textit{if} \; \; \nullable\; r_1$\\
- & & $(\partial_x \; r_1)\cdot r_2 \quad\quad
+ \partial_x \; r_2 \;$ & $ \textit{if} \; \; \nullable\; r_1$\\
+ & & $(\partial_x \; r_1)\cdot r_2 \quad\quad$ & $
\textit{otherwise}$\\
$\partial_x \; r^*$ & $\dn$ & $(\partial_x \; r) \cdot r^*$\\
$\partial_x \; c $ & $\dn$ & $\textit{if} \; x = c \;
@@ -594,7 +617,7 @@
\end{tabular}
\end{center}
\noindent
-The $\cdot$ between for example
+The $\cdot$ in the example
$(\partial_x \; r_1) \cdot r_2 $
is a shorthand notation for the cartesian product
$(\partial_x \; r_1) \times \{ r_2\}$.
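+To make this more concrete, the following is a minimal Scala sketch of
+partial derivatives for single characters (our own illustration; we assume
+the unannotated constructors $\textit{ZERO}$, $\textit{ONE}$,
+$\textit{CHAR}$, $\textit{ALT}$, $\textit{SEQ}$ and $\textit{STAR}$,
+together with the usual $\textit{nullable}$ function):
+\begin{lstlisting}
+def pder(x: Char, r: Rexp) : Set[Rexp] = r match {
+  case ZERO => Set()
+  case ONE => Set()
+  case CHAR(c) => if (c == x) Set(ONE) else Set()
+  case ALT(r1, r2) => pder(x, r1) ++ pder(x, r2)
+  case SEQ(r1, r2) =>
+    // the "cartesian product": pair every element of pder(x, r1) with r2
+    val ts: Set[Rexp] = pder(x, r1).map(r1p => SEQ(r1p, r2))
+    if (nullable(r1)) ts ++ pder(x, r2) else ts
+  case STAR(r1) => pder(x, r1).map(r1p => SEQ(r1p, STAR(r1)): Rexp)
+}
+\end{lstlisting}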
@@ -604,9 +627,10 @@
Rather than joining the calculated derivatives $\partial_x r_1$ and $\partial_x r_2$ together
using the $\sum$ constructor, Antimirov put them into
a set. This means many subterms will be de-duplicated
-because they are sets,
-For example, to compute what regular expression $x^*(xx + y)^*$'s
-derivative w.r.t. $x$ is, one can compute a partial derivative
+because they are sets.
+For example, to compute the derivative of the regular expression
+$x^*(xx + y)^*$
+w.r.t.~$x$, one can compute the partial derivatives
and get two singleton sets $\{x^* \cdot (xx + y)^*\}$ and $\{x \cdot (xx + y) ^* \}$
from $\partial_x(x^*) \cdot (xx + y) ^*$ and $\partial_x((xx + y)^*)$.
@@ -615,47 +639,51 @@
\partial_{c::cs} r \dn \bigcup_{r'\in (\partial_c r)}
\partial_{cs} r'
\]
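+In the Scala sketch from above, this lift to strings corresponds to the
+following function (again only for illustration):
+\begin{lstlisting}
+def pders(s: List[Char], r: Rexp) : Set[Rexp] = s match {
+  case Nil => Set(r)
+  case c::cs => pder(c, r).flatMap(r1 => pders(cs, r1))
+}
+\end{lstlisting}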
-Given an alphabet $\Sigma$, we denote the set of all possible strings
-from this alphabet as $\Sigma^*$.
-The set of all possible partial derivatives is defined
-as the union of derivatives w.r.t all the strings in the universe:
+Given an alphabet $\Sigma$, we use $\Sigma^*$ for the set of all possible strings
+over this alphabet.
+The set of all possible partial derivatives is then defined
+as the union of the partial derivatives w.r.t.~all strings in $\Sigma^*$:
\begin{center}
\begin{tabular}{lcl}
$\textit{PDER}_{\Sigma^*} \; r $ & $\dn $ & $\bigcup_{w \in \Sigma^*}\partial_w \; r$
\end{tabular}
\end{center}
\noindent
-Consider now again our pathological case where the derivatives
-grow with a rather aggressive simplification
+Consider now again the pathological case for which we introduced the more
+aggressive
+simplification
\begin{center}
$((a^* + (aa)^* + \ldots + (\underbrace{a\ldots a}_{n a's})^* )^*)^*$
\end{center}
-example, if we denote this regular expression as $r$,
-we have that
+Let us abbreviate this regular expression with $r$;
+then we have that
\begin{center}
$\textit{PDER}_{\Sigma^*} \; r =
\bigcup_{i=1}^{n}\bigcup_{j=0}^{i-1} \{
(\underbrace{a \ldots a}_{\text{j a's}}\cdot
(\underbrace{a \ldots a}_{\text{i a's}})^*)\cdot r \}$,
\end{center}
-with exactly $n * (n + 1) / 2$ terms.
-This is in line with our speculation that only $n*(n+1)/2$ terms are
-needed. We conjecture that $\bsimpStrong$ is also able to achieve this
+The union on the right-hand side has exactly $n * (n + 1) / 2$ terms.
+This leads us to believe that the maximum number of terms needed
+in our derivative is also only $n * (n + 1) / 2$.
+Therefore
+we conjecture that $\bsimpStrong$ is also able to achieve this
upper limit in general
\begin{conjecture}\label{bsimpStrongInclusionPder}
- Using a suitable transformation $f$, we have
+ Using a suitable transformation $f$, we have that
\begin{center}
$\forall s.\; f \; (r \bdersStrong \; s) \subseteq
\textit{PDER}_{\Sigma^*} \; r$
\end{center}
+ holds.
\end{conjecture}
\noindent
-because our \ref{fig:cubicRule} will keep only one copy of each term,
+The reason is that our rule \eqref{eqn:cubicRule} will keep only one copy of each term,
where the function $\textit{prune}$ takes care of maintaining
a set like structure similar to partial derivatives.
-We might need to adjust $\textit{prune}$
-slightly to make sure all duplicate terms are eliminated,
-which should be doable.
+%We might need to adjust $\textit{prune}$
+%slightly to make sure all duplicate terms are eliminated,
+%which should be doable.
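+To give an idea of what the transformation $f$ could look like, one
+candidate (an assumption on our part, not something fixed by the
+conjecture) is to erase the bitcodes, split the result into its
+sequence-terms with $\textit{turnIntoTerms}$ and collect these terms
+into a set:
+\begin{lstlisting}
+def f(a: ARexp) : Set[Rexp] = turnIntoTerms(erase(a)).toSet
+\end{lstlisting}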
Antimirov had proven that the sum of all the partial derivative
terms' sizes is bounded by the cubic of the size of that regular
@@ -663,10 +691,11 @@
\begin{property}\label{pderBound}
$\llbracket \textit{PDER}_{\Sigma^*} \; r \rrbracket \leq O(\llbracket r \rrbracket^3)$
\end{property}
+\noindent
This property was formalised by Wu et al. \cite{Wu2014}, and the
-details can be found in the archive of formal proofs. \footnote{https://www.isa-afp.org/entries/Myhill-Nerode.html}
+details can be found in the Archive of Formal Proofs\footnote{https://www.isa-afp.org/entries/Myhill-Nerode.html}.
Once conjecture \ref{bsimpStrongInclusionPder} is proven, then property \ref{pderBound}
-would yield us also a cubic bound for our $\blexerStrong$ algorithm:
+would provide us with a cubic bound for our $\blexerStrong$ algorithm:
\begin{conjecture}\label{strongCubic}
$\llbracket r \bdersStrong\; s \rrbracket \leq \llbracket r \rrbracket^3$
\end{conjecture}
--- a/ChengsongTanPhdThesis/Chapters/Introduction.tex Thu Dec 01 08:49:19 2022 +0000
+++ b/ChengsongTanPhdThesis/Chapters/Introduction.tex Mon Dec 05 17:39:10 2022 +0000
@@ -265,6 +265,7 @@
requests. This made the whole site become unavailable.
\begin{figure}[p]
+\begin{center}
\begin{tabular}{@{}c@{\hspace{0mm}}c@{}}
\begin{tikzpicture}
\begin{axis}[
@@ -393,6 +394,7 @@
\end{tikzpicture}\\
\multicolumn{2}{c}{Graphs}
\end{tabular}
+\end{center}
\caption{Graphs showing runtime for matching $(a^*)^*\,b$ with strings
of the form $\protect\underbrace{aa..a}_{n}$ in various existing regular expression libraries.
The reason for their superlinear behaviour is that they do a depth-first-search
@@ -444,7 +446,7 @@
w.r.t the input.
This assumes that the regular expression
$r$ was pre-processed and turned into a
-deterministic finite automaton (DFA) before matching\cite{Sakarovitch2009}.
+deterministic finite automaton (DFA) before matching~\cite{Sakarovitch2009}.
By basic we mean textbook definitions such as the one
below, involving only regular expressions for characters, alternatives,
sequences, and Kleene stars:
@@ -533,7 +535,7 @@
algorithms or algorithms that consume large amounts of memory.
Implementations using $\DFA$s will
either become excruciatingly slow
-(for example Verbatim++\cite{Verbatimpp}) or get
+(for example Verbatim++~\cite{Verbatimpp}) or get
out of memory errors (for example $\mathit{LEX}$ and
$\mathit{JFLEX}$\footnote{which are lexer generators
in C and JAVA that generate $\mathit{DFA}$-based
@@ -633,7 +635,7 @@
can be above 1000, and in the built-in Rust regular expression library
expressions such as $a^{\{1000\}\{100\}\{5\}}$ give an error message
for being too big.
-As Becchi and Crawley\cite{Becchi08} have pointed out,
+As Becchi and Crawley~\cite{Becchi08} have pointed out,
the reason for these restrictions
is that they simulate a non-deterministic finite
automata (NFA) with a breadth-first search.
@@ -815,7 +817,7 @@
Back-references is a regex construct
that programmers find quite useful.
-According to Becchi and Crawley\cite{Becchi08},
+According to Becchi and Crawley~\cite{Becchi08},
6\% of Snort rules (up until 2008) use them.
The most common use of back-references
is to express well-formed html files,
@@ -1202,11 +1204,11 @@
One of the most recent work in the context of lexing
%with this issue
-is the Verbatim lexer by Egolf, Lasser and Fisher\cite{Verbatim}.
+is the Verbatim lexer by Egolf, Lasser and Fisher~\cite{Verbatim}.
This is relevant work for us and we will compare it later with
our derivative-based matcher we are going to present.
There is also some newer work called
-Verbatim++\cite{Verbatimpp}, which does not use derivatives,
+Verbatim++~\cite{Verbatimpp}, which does not use derivatives,
but deterministic finite automaton instead.
%An example that gives problem to automaton approaches would be
%the regular expression $(a|b)^*a(a|b)^{\{n\}}$.