lexing: comparison ChengsongTanPhdThesis/Chapters/Cubic.tex


-:94db2636a296
+:7af4e2420a8c
 We intend to formalise this part, which
 we have not been able to finish due to time constraints of the PhD.
 Nevertheless, we outline the ideas we intend to use for the proof.
 We present further improvements
-made to our lexer algorithm $\blexersimp$.
+for our lexer algorithm $\blexersimp$.
 We devise a stronger simplification algorithm,
 called $\bsimpStrong$, which can prune away
 similar components in two regular expressions at the same
 alternative level,
 even if these regular expressions are not exactly the same.
 We call the lexer that uses this stronger simplification function
 $\blexerStrong$.
+Unfortunately, we did not have time to
+work out the proofs as we did in
+the previous chapters.
 We conjecture that both
 \begin{center}
 	$\blexerStrong \;r \; s = \blexer\; r\;s$
 \end{center}
 and
 \begin{center}
 	$\llbracket \bdersStrong{a}{s} \rrbracket = O(\llbracket a \rrbracket^3)$
 \end{center}
-hold, but formalising
+hold, but a formalisation
-them is still work in progress.
+is still future work.
-We give reasons why the correctness and cubic size bound proofs
+We give an informal justification for
-can be achieved,
+why the correctness and cubic size bound proofs
+can be achieved
 by exploring the connection between the internal
 data structure of our $\blexerStrong$ and
 Antimirov's partial derivatives.\\
 %We also present the idempotency property proof
 %of $\bsimp$, which leverages the idempotency proof of $\rsimp$.
 For example, the regular expression
 \[
 	aa \cdot a^*+ a \cdot a^* + aa\cdot a^*
 \]
 contains three terms,
-expressing three possibilities it will match future input.
+expressing three possibilities for how it can match some input.
 The first and the third terms are identical, which means we can eliminate
-the latter as we know it will not be picked up by $\bmkeps$.
+the latter as it will not contribute to a POSIX value.
-In $\bsimps$, the $\distinctBy$ function takes care of this.
+In $\bsimps$, the $\distinctBy$ function takes care of
+such instances.
 The criterion $\distinctBy$ uses for removing a duplicate
 $a_2$ in the list
 \begin{center}
 	$rs_a@[a_1]@rs_b@[a_2]@rs_c$
 \end{center}
 is that
 \begin{center}
 	$\rerase{a_1} = \rerase{a_2}$.
 \end{center}
-It can be characterised as the $LD$
+It is characterised as the $LD$
-rewrite rule in \ref{rrewriteRules}.\\
+rewrite rule in figure \ref{rrewriteRules}.\\
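 To make this criterion concrete, the following worked instance (a sketch that
 identifies each annotated term with its erased form and ignores bitcodes)
 applies it to the three-term example above:
 \[
 	rs_a = [], \quad a_1 = aa\cdot a^*, \quad rs_b = [a\cdot a^*], \quad
 	a_2 = aa\cdot a^*, \quad rs_c = []
 \]
 Since $\rerase{a_1} = \rerase{a_2}$, the second occurrence is dropped and the list becomes
 \[
 	[aa\cdot a^*,\; a\cdot a^*].
 \]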
 The problem, however, is that identical components
-in two slightly different regular expressions cannot be removed:
+in two slightly different regular expressions cannot be removed.
-\begin{figure}[H]
+Consider the simplification
-\[
+\begin{equation}
+	\label{eqn:partialDedup}
 	(a+b+d) \cdot r_1 + (a+c+e) \cdot r_1 \stackrel{?}{\rightsquigarrow} (a+b+d) \cdot r_1 + (c+e) \cdot r_1
-\]
+\end{equation}
-\caption{Desired simplification, but not done in $\blexersimp$}\label{partialDedup}
+where the $(a+\ldots)\cdot r_1$ is deleted in the right alternative.
-\end{figure}
+This is permissible because we have $(a+\ldots)\cdot r_1$ in the left
-\noindent
+alternative.
-A simplification like this actually
+The difficulty is that such ``buried''
+alternative-sequences are not easily recognised.
+But a simplification like this actually
 cannot be omitted,
-as without it the size could blow up even with our $\textit{bsimp}$
+as without it the size of derivatives can still
-function: for the chapter \ref{Finite} example
+blow up even with our $\textit{bsimp}$
+function:
+consider again the example
 $\protect((a^* + (aa)^* + \ldots + (\underbrace{a\ldots a}_{n a's})^* )^*)^*$,
-by just setting n to a small number,
+and set $n$ to a relatively small number,
-we get exponential growth that does not stop before it becomes huge:
+we get exponential growth:
 \begin{figure}[H]
 \centering
 \begin{tikzpicture}
 \begin{axis}[
 %xlabel={$n$},
 ylabel={size},
 ]
 \addplot[blue,mark=*, mark options={fill=white}] table {bsimpExponential.data};
 \end{axis}
 \end{tikzpicture}
-\caption{Runtime of $\blexersimp$ for matching
+\caption{Size of derivatives of $\blexersimp$ for matching
 	$\protect((a^* + (aa)^* + \ldots + (aaaaa)^* )^*)^*$
 	with strings
 	of the form $\protect\underbrace{aa..a}_{n}$.}\label{blexerExp}
 \end{figure}
 \noindent
-We would like to apply the rewriting at some stage
+One possible approach would be to apply the rewriting
-\begin{figure}[H]
+rule
 \[
 	(a+b+d) \cdot r_1  \longrightarrow a \cdot r_1 + b \cdot r_1 + d \cdot r_1
 \]
-\caption{Desired simplification, but not done in $\blexersimp$}\label{desiredSimp}
-\end{figure}
 \noindent
 in our $\simp$ function,
-so that it makes the simplification in \ref{partialDedup} possible.
+so that it makes the simplification in \eqref{eqn:partialDedup} possible.
 Translating the rule into our $\textit{bsimp}$ function simply
 involves adding a new clause to the $\textit{bsimp}_{ASEQ}$ function:
 \begin{center}
 	\begin{tabular}{@{}lcl@{}}
 		$\textit{bsimp}_{ASEQ} \; bs\; a \; b$ & $\dn$ & $ (a,\; b) \textit{match}$\\
 &&$\quad\textit{case} \; (_{bs1}\sum as, a_2') \Rightarrow _{bs1}\sum (
 \map \; (_{[]}\textit{ASEQ} \; \_ \; a_2') \; as)$\\
 &&$\quad\textit{case} \; (a_1', a_2') \Rightarrow   _{bs}a_1' \cdot a_2'$ \\
 	\end{tabular}
 \end{center}
+\noindent
 Unfortunately,
 if we introduce this rule in our
 setting we would lose the POSIX property of our calculated values.
 For example, given the regular expression
 \begin{center}
 	$(a + ab)(bc + c)$
 \end{center}
-and the string
+and the string $ab$,
-\begin{center}
-	$ab$,
-\end{center}
 then our algorithm generates the following
 correct POSIX value
 \begin{center}
 	$\Seq \; (\Right \; ab) \; (\Right \; c)$.
 \end{center}
 is that the new rule splits this regular expression up into
 \begin{center}
 	$a\cdot(b c + c) + ab \cdot (bc + c)$,
 \end{center}
 which becomes a regular expression with a
-totally different structure--the original
+quite different structure--the original
 was a sequence, and now it becomes an alternative.
-With an alternative the maximum munch rule no longer works.\\
+With an alternative the maximal munch rule no longer works.\\
 A method to reconcile this is to do the
-transformation in \ref{desiredSimp} ``non-invasively'',
+transformation in \eqref{eqn:partialDedup} ``non-invasively'',
 meaning that we traverse the list of regular expressions
 \begin{center}
 	$rs_a@[a]@rs_c$
 \end{center}
 in the alternative
 \begin{center}
 	$\sum ( rs_a@[a]@rs_c)$
 \end{center}
 using a function similar to $\distinctBy$,
 but this time
 we allow a more general list rewrite:
-\begin{mathpar}\label{cubicRule}
+\begin{figure}[H]
+\begin{mathpar}
 	\inferrule * [Right = cubicRule]{\vspace{0mm} }{rs_a@[a]@rs_c
 			\stackrel{s}{\rightsquigarrow }
 		rs_a@[\textit{prune} \; a \; rs_a]@rs_c }
 \end{mathpar}
+\caption{The rule capturing the pruning simplification needed to achieve
+a cubic bound}
+\label{fig:cubicRule}
+\end{figure}
 %L \; a_1' = L \; a_1 \setminus (\cup_{a \in rs_a} L \; a)
 where $\textit{prune} \;a \; acc$ traverses $a$
 without altering the structure of $a$, removing components in $a$
 that have appeared in the accumulator $acc$.
 For example
 that have not appeared in the accumulator list
 \begin{center}
 $[(r_a+r_b+r_c)r_d, (r_e+r_f)r_d]$.
 \end{center}
 We implemented
-function $\textit{prune}$ in Scala,
+function $\textit{prune}$ in Scala:
-and incorporated into our lexer,
-by replacing the $\simp$ function
-with a stronger version called $\bsimpStrong$
-that prunes regular expressions.
 \begin{figure}[H]
 \begin{lstlisting}
-def atMostEmpty(r: Rexp) : Boolean = r match {
-case ZERO => true
-case ONE => true
-case STAR(r) => atMostEmpty(r)
-case SEQ(r1, r2) => atMostEmpty(r1) && atMostEmpty(r2)
-case ALTS(r1, r2) => atMostEmpty(r1) && atMostEmpty(r2)
-case CHAR(_) => false
-}
-def isOne(r: Rexp) : Boolean = r match {
-case ONE => true
-case SEQ(r1, r2) => isOne(r1) && isOne(r2)
-case ALTS(r1, r2) => (isOne(r1) || isOne(r2)) && (atMostEmpty(r1) && atMostEmpty(r2))//rs.forall(atMostEmpty) && rs.exists(isOne)
-case STAR(r0) => atMostEmpty(r0)
-case CHAR(c) => false
-case ZERO => false
-}
-//r = r' ~ tail' : If tail' matches tail => returns r'
-def removeSeqTail(r: Rexp, tail: Rexp) : Rexp = r match {
-case SEQ(r1, r2) =>
-if(r2 == tail)
-r1
-else
-ZERO
-case r => ZERO
-}
 def prune(r: ARexp, acc: Set[Rexp]) : ARexp = r match{
-case AALTS(bs, rs) => rs.map(r => prune(r, acc)).filter(_ != ZERO) match
+case AALTS(bs, rs) => rs.map(r => prune(r, acc)).filter(_ != AZERO) match
 {
 //all components have been removed, meaning this is effectively a duplicate
 //flats will take care of removing this AZERO
 case Nil => AZERO
 case r::Nil => fuse(bs, r)
 }
 //this does the duplicate component removal task
 case r => if(acc(erase(r))) AZERO else r
 }
 \end{lstlisting}
-\caption{pruning function together with its helper functions}
+\caption{The function $\textit{prune}$ }
 \end{figure}
 \noindent
-The benefits of using
+The function $\textit{prune}$
-$\textit{prune}$ such as refining the finiteness bound
+is a stronger version of $\textit{distinctBy}$.
-to a cubic bound has not been formalised yet.
+It does not just walk through a list looking for exact duplicates,
-Therefore we choose to use Scala code rather than an Isabelle-style formal
+but prunes sub-expressions recursively.
-definition like we did for $\simp$, as the definitions might change
+It manages proper contexts by the helper functions
-to suit proof needs.
+$\textit{removeSeqTail}$, $\textit{isOne}$ and $\textit{atMostEmpty}$.
-In the rest of the chapter we will use this convention consistently.
 \begin{figure}[H]
 \begin{lstlisting}
-def distinctWith(rs: List[ARexp],
+def atMostEmpty(r: Rexp) : Boolean = r match {
-pruneFunction: (ARexp, Set[Rexp]) => ARexp,
+case ZERO => true
-acc: Set[Rexp] = Set()) : List[ARexp] =
+case ONE => true
-rs match{
+case STAR(r) => atMostEmpty(r)
-case Nil => Nil
+case SEQ(r1, r2) => atMostEmpty(r1) && atMostEmpty(r2)
-case r :: rs =>
+case ALTS(r1, r2) => atMostEmpty(r1) && atMostEmpty(r2)
-if(acc(erase(r)))
+case CHAR(_) => false
-distinctWith(rs, pruneFunction, acc)
+}
-else {
-val pruned_r = pruneFunction(r, acc)
+def isOne(r: Rexp) : Boolean = r match {
-pruned_r ::
+case ONE => true
-distinctWith(rs,
+case SEQ(r1, r2) => isOne(r1) && isOne(r2)
-pruneFunction,
+case ALTS(r1, r2) => (isOne(r1) || isOne(r2)) && (atMostEmpty(r1) && atMostEmpty(r2))
-turnIntoTerms(erase(pruned_r)) ++: acc
+case STAR(r0) => atMostEmpty(r0)
-)
+case CHAR(c) => false
-}
+case ZERO => false
+}
+def removeSeqTail(r: Rexp, tail: Rexp) : Rexp =
+if (r == tail)
+ONE
+else {
+r match {
+case SEQ(r1, r2) =>
+if(r2 == tail)
+r1
+else
+ZERO
+case r => ZERO
+}
 }
+\end{lstlisting}
+\caption{The helper functions of $\textit{prune}$}
+\end{figure}
+\noindent
+Suppose we feed
+\begin{center}
+	$r= (\underline{\ONE}+(\underline{f}+b)\cdot g)\cdot (a\cdot(d\cdot e))$
+\end{center}
+and
+\begin{center}
+	$acc = \{a\cdot(d\cdot e),f\cdot (g \cdot (a \cdot (d \cdot e))) \}$
+\end{center}
+as the input for $\textit{prune}$.
+The end result will be
+\[
+	b\cdot(g\cdot(a\cdot(d\cdot e)))
+\]
+where the underlined components in $r$ are eliminated.
+Looking more closely, at the topmost call
+\[
+	\textit{prune} \quad (\ONE+
+	(f+b)\cdot g)\cdot (a\cdot(d\cdot e)) \quad
+	\{a\cdot(d\cdot e),f\cdot (g \cdot (a \cdot (d \cdot e))) \}
+\]
+The sequence clause will be called,
+where a sub-call
+\[
+	\textit{prune} \;\; (\ONE+(f+b)\cdot g)\;\; \{\ONE, f\cdot g \}
+\]
+is made. The terms in the new accumulator $\{\ONE,\; f\cdot g \}$ come from
+the two calls to $\textit{removeSeqTail}$:
+\[
+	\textit{removeSeqTail} \quad\;\; a \cdot(d\cdot e) \quad\;\; a \cdot(d\cdot e)
+\]
+and
+\[
+	\textit{removeSeqTail} \quad \;\;
+	f\cdot(g\cdot (a \cdot(d\cdot e)))\quad  \;\; a \cdot(d\cdot e).
+\]
+The idea behind $\textit{removeSeqTail}$ is that
+when pruning recursively, we need to ``zoom in''
+to sub-expressions, and this ``zoom in'' needs to be performed
+on the
+accumulators as well, otherwise we will be comparing
+apples with oranges.
+The sub-call
+$\textit{prune} \;\; (\ONE+(f+b)\cdot g)\;\; \{\ONE, f\cdot g \}$
+is simpler, which will trigger the alternative clause, causing
+a pruning on each element in $(\ONE+(f+b)\cdot g)$,
+leaving us $b\cdot g$ only.
+Our new lexer with stronger simplification
+uses $\textit{prune}$ by making it
+the core component of the deduplicating function
+called $\textit{distinctWith}$.
+The function $\textit{distinctWith}$ ensures that all verbose
+parts of a regular expression are pruned away.
+\begin{figure}[H]
+\begin{lstlisting}
+def turnIntoTerms(r: Rexp): List[Rexp] = r match {
+case SEQ(r1, r2)  =>
+turnIntoTerms(r1).flatMap(r11 => furtherSEQ(r11, r2))
+case ALTS(r1, r2) => turnIntoTerms(r1) ::: turnIntoTerms(r2)
+case ZERO => Nil
+case _ => r :: Nil
+}
+def distinctWith(rs: List[ARexp],
+pruneFunction: (ARexp, Set[Rexp]) => ARexp,
+acc: Set[Rexp] = Set()) : List[ARexp] =
+rs match{
+case Nil => Nil
+case r :: rs =>
+if(acc(erase(r)))
+distinctWith(rs, pruneFunction, acc)
+else {
+val pruned_r = pruneFunction(r, acc)
+pruned_r ::
+distinctWith(rs,
+pruneFunction,
+turnIntoTerms(erase(pruned_r)) ++: acc
+)
+}
+}
 \end{lstlisting}
 \caption{A Stronger Version of $\textit{distinctBy}$}
 \end{figure}
 \noindent
-The function $\textit{prune}$ is used in $\distinctWith$.
+Once a regular expression has been pruned,
-$\distinctWith$ is a stronger version of $\distinctBy$
+all its components will be added to the accumulator
-which not only removes duplicates as $\distinctBy$ would
+to remove any future regular expressions' duplicate components.
-do, but also uses the $\textit{pruneFunction}$
-argument to prune away verbose components in a regular expression.\\
+The function $\textit{bsimpStrong}$
+is very much the same as $\textit{bsimp}$, just with
+$\textit{distinctBy}$ replaced
+by $\textit{distinctWith}$.
 \begin{figure}[H]
 \begin{lstlisting}
-//a stronger version of simp
 def bsimpStrong(r: ARexp): ARexp =
 {
 r match {
 case ASEQ(bs1, r1, r2) => (bsimpStrong(r1), bsimpStrong(r2)) match {
-//normal clauses same as simp
 case (AZERO, _) => AZERO
 case (_, AZERO) => AZERO
 case (AONE(bs2), r2s) => fuse(bs1 ++ bs2, r2s)
-//bs2 can be discarded
 case (r1s, AONE(bs2)) => fuse(bs1, r1s) //assert bs2 == Nil
 case (r1s, r2s) => ASEQ(bs1, r1s, r2s)
 }
 case AALTS(bs1, rs) => {
-//distinctBy(flat_res, erase)
 distinctWith(flats(rs.map(bsimpStrong(_))), prune) match {
 case Nil => AZERO
 case s :: Nil => fuse(bs1, s)
 case rs => AALTS(bs1, rs)
 }
 }
-//stars that can be treated as 1
 case ASTAR(bs, r0) if(atMostEmpty(erase(r0))) => AONE(bs)
 case r => r
 }
 }
-\end{lstlisting}
+def bdersStrong(s: List[Char], r: ARexp) : ARexp = s match {
-\caption{The function $\bsimpStrong$ and $\bdersStrongs$}
-\end{figure}
-\noindent
-$\distinctWith$, is in turn used in $\bsimpStrong$:
-\begin{figure}[H]
-\begin{lstlisting}
-//Conjecture: [| bdersStrong(s, r) |] = O([| r |]^3)
-def bdersStrong(s: List[Char], r: ARexp) : ARexp = s match {
 case Nil => r
 case c::s => bdersStrong(s, bsimpStrong(bder(c, r)))
 }
 \end{lstlisting}
-\caption{The function $\bsimpStrong$ and $\bdersStrongs$}
+\caption{The function
+$\textit{bsimpStrong}$: a stronger version of $\textit{bsimp}$}
 \end{figure}
 \noindent
+The benefits of using
+$\textit{prune}$, such as refining the finiteness bound
+to a cubic bound, have not been formalised yet.
+Therefore we choose to use Scala code rather than an Isabelle-style formal
+definition like we did for $\simp$, as the definitions might change
+to suit our proof needs.
+In the rest of the chapter we will use this convention consistently.
+%The function $\textit{prune}$ is used in $\distinctWith$.
+%$\distinctWith$ is a stronger version of $\distinctBy$
+%which not only removes duplicates as $\distinctBy$ would
+%do, but also uses the $\textit{pruneFunction}$
+%argument to prune away verbose components in a regular expression.\\
+%\begin{figure}[H]
+%\begin{lstlisting}
+%   //a stronger version of simp
+%    def bsimpStrong(r: ARexp): ARexp =
+%    {
+%      r match {
+%        case ASEQ(bs1, r1, r2) => (bsimpStrong(r1), bsimpStrong(r2)) match {
+%          //normal clauses same as simp
+%        case (AZERO, _) => AZERO
+%        case (_, AZERO) => AZERO
+%        case (AONE(bs2), r2s) => fuse(bs1 ++ bs2, r2s)
+%        //bs2 can be discarded
+%        case (r1s, AONE(bs2)) => fuse(bs1, r1s) //assert bs2 == Nil
+%        case (r1s, r2s) => ASEQ(bs1, r1s, r2s)
+%        }
+%        case AALTS(bs1, rs) => {
+%          //distinctBy(flat_res, erase)
+%          distinctWith(flats(rs.map(bsimpStrong(_))), prune) match {
+%            case Nil => AZERO
+%            case s :: Nil => fuse(bs1, s)
+%            case rs => AALTS(bs1, rs)
+%          }
+%        }
+%        //stars that can be treated as 1
+%        case ASTAR(bs, r0) if(atMostEmpty(erase(r0))) => AONE(bs)
+%        case r => r
+%      }
+%    }
+%    def bdersStrong(s: List[Char], r: ARexp) : ARexp = s match {
+%        case Nil => r
+%        case c::s => bdersStrong(s, bsimpStrong(bder(c, r)))
+%      }
+%\end{lstlisting}
+%\caption{The function $\bsimpStrong$ and $\bdersStrongs$}
+%\end{figure}
+%\noindent
+%$\distinctWith$, is in turn used in $\bsimpStrong$:
+%\begin{figure}[H]
+%\begin{lstlisting}
+%      //Conjecture: [| bdersStrong(s, r) |] = O([| r |]^3)
+%      def bdersStrong(s: List[Char], r: ARexp) : ARexp = s match {
+%        case Nil => r
+%        case c::s => bdersStrong(s, bsimpStrong(bder(c, r)))
+%      }
+%\end{lstlisting}
+%\caption{The function $\bsimpStrong$ and $\bdersStrongs$}
+%\end{figure}
+%\noindent
 We conjecture that the above Scala function $\bdersStrongs$,
 written $\bdersStrong{\_}{\_}$ as an infix notation,
 satisfies the following property:
 \begin{conjecture}
 	$\llbracket \bdersStrong{a}{s} \rrbracket = O(\llbracket a \rrbracket^3)$
 \end{conjecture}
+\noindent
 The stronger version of $\blexersimp$'s
 code in Scala looks like:
 \begin{figure}[H]
 \begin{lstlisting}
 def strongBlexer(r: Rexp, s: String) : Option[Val] = {
 }
 \end{lstlisting}
 \end{figure}
 \noindent
 We call this lexer $\blexerStrong$.
-$\blexerStrong$ is able to drastically reduce the
+This version is able to drastically reduce the
-internal data structure size which could
+internal data structure size which
+otherwise could
 trigger exponential behaviours in
 $\blexersimp$.
 \begin{figure}[H]
 \centering
 \begin{tabular}{@{}c@{\hspace{0mm}}c@{\hspace{0mm}}c@{}}
 \end{conjecture}
 \noindent
 The idea is to maintain key lemmas in
 chapter \ref{Bitcoded2} like
 $r \stackrel{*}{\rightsquigarrow} \textit{bsimp} \; r$
-with the new rewriting rule \ref{cubicRule} .
+with the new rewriting rule
+shown in figure \ref{fig:cubicRule}.
 In the next sub-section,
 we will describe why we
-believe a cubic bound can be achieved.
+believe a cubic size bound can be achieved with
-We give an introduction to the
+the stronger simplification.
+For this we give a short introduction to the
 partial derivatives,
-which was invented by Antimirov \cite{Antimirov95},
+which were invented by Antimirov \cite{Antimirov95},
-and then link it with the result of the function
+and then link them with the result of the function
 $\bdersStrongs$.
 \subsection{Antimirov's partial derivatives}
 Partial derivatives were first introduced by
 Antimirov \cite{Antimirov95}.
-It does derivatives in a similar way as suggested by Brzozowski,
+Partial derivatives are very similar
+to Brzozowski derivatives,
 but split children of alternative regular expressions into
-multiple independent terms, causing the output to become a
+multiple independent terms. This means the output of
+partial derivatives becomes a
 set of regular expressions:
 \begin{center}
 	\begin{tabular}{lcl}
-		$\partial_x \; (a \cdot b)$ &
+		$\partial_x \; (r_1 \cdot r_2)$ &
-		$\dn$ & $\partial_x \; a\cdot b \cup
+		$\dn$ & $(\partial_x \; r_1) \cdot r_2 \cup
-		\partial_x \; b \; \textit{if} \; \; \nullable\; a$\\
+		\partial_x \; r_2 \; \textit{if} \; \; \nullable\; r_1$\\
-		      & & $\partial_x \; a\cdot b \quad\quad
+		      & & $(\partial_x \; r_1)\cdot r_2 \quad\quad
 		      \textit{otherwise}$\\
-		$\partial_x \; r^*$ & $\dn$ & $\partial_x \; r \cdot r^*$\\
+		$\partial_x \; r^*$ & $\dn$ & $(\partial_x \; r) \cdot r^*$\\
 		$\partial_x \; c $ & $\dn$ & $\textit{if} \; x = c \;
 		\textit{then} \;
 		\{ \ONE\} \;\;\textit{else} \; \varnothing$\\
-		$\partial_x(a+b)$ & $=$ & $\partial_x(a) \cup \partial_x(b)$\\
+		$\partial_x(r_1+r_2)$ & $=$ & $\partial_x(r_1) \cup \partial_x(r_2)$\\
 		$\partial_x(\ONE)$ & $=$ & $\varnothing$\\
 		$\partial_x(\ZERO)$ & $\dn$ & $\varnothing$\\
 	\end{tabular}
 \end{center}
 \noindent
 The $\cdot$ between for example
-$\partial_x \; a\cdot b $
+$(\partial_x \; r_1) \cdot r_2 $
 is a shorthand notation for the cartesian product
-$\partial_x \; a \times \{ b\}$.
+$(\partial_x \; r_1) \times \{ r_2\}$.
 %Each element in the set generated by a partial derivative
 %corresponds to a (potentially partial) match
 %TODO: define derivatives w.r.t string s
-Rather than joining the calculated derivatives $\partial_x a$ and $\partial_x b$ together
+Rather than joining the calculated derivatives $\partial_x r_1$ and $\partial_x r_2$ together
 using the $\sum$ constructor, Antimirov put them into
-a set.  This causes maximum de-duplication to happen,
+a set.  This means many subterms will be de-duplicated
-allowing us to understand what are the "atomic" components of it.
+because they are stored in a set.
-For example, To compute what regular expression $x^*(xx + y)^*$'s
+For example, to compute what regular expression $x^*(xx + y)^*$'s
-derivative against $x$ is made of, one can do a partial derivative
+derivative w.r.t. $x$ is, one can compute a partial derivative
-of it and get two singleton sets $\{x^* \cdot (xx + y)^*\}$ and $\{x \cdot (xx + y) ^* \}$
+and get two singleton sets $\{x^* \cdot (xx + y)^*\}$ and $\{x \cdot (xx + y) ^* \}$
 from $\partial_x(x^*) \cdot (xx + y) ^*$ and $\partial_x((xx + y)^*)$.
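 As a sketch of how these two sets arise (assuming, as the text above does
 implicitly, that $\ONE \cdot r$ is identified with $r$), the clauses can be
 unfolded step by step:
 \begin{center}
 	\begin{tabular}{lcl}
 		$\partial_x(x^*)$ & $=$ & $(\partial_x \; x)\cdot x^* = \{\ONE \cdot x^*\} = \{x^*\}$\\
 		$\partial_x(xx + y)$ & $=$ & $\partial_x(xx) \cup \partial_x(y) = \{\ONE\cdot x\} \cup \varnothing = \{x\}$\\
 		$\partial_x((xx+y)^*)$ & $=$ & $(\partial_x(xx+y))\cdot (xx+y)^* = \{x \cdot (xx+y)^*\}$\\
 		$\partial_x(x^*\cdot(xx+y)^*)$ & $=$ & $(\partial_x(x^*))\cdot(xx+y)^* \cup \partial_x((xx+y)^*)$\\
 		& $=$ & $\{x^*\cdot(xx+y)^*,\; x\cdot(xx+y)^*\}$
 	\end{tabular}
 \end{center}
 The union in the fourth line is taken because $x^*$ is nullable.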
+The partial derivative w.r.t. a string is defined recursively:
+\[
+	\partial_{c::cs} r \dn \bigcup_{r'\in (\partial_c r)}
+	\partial_{cs} r'
+\]
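 This clause presupposes a base case for the empty string, which is not shown
 here; the sketch below assumes the usual choice $\partial_{[]} r \dn \{r\}$,
 so that $\partial_{[c]} r = \partial_c r$.
 Under that assumption, and without any further simplification, we get for example
 \begin{center}
 	\begin{tabular}{lcl}
 		$\partial_a(a^*)$ & $=$ & $(\partial_a \; a)\cdot a^* = \{\ONE\cdot a^*\}$\\
 		$\partial_a(\ONE\cdot a^*)$ & $=$ & $(\partial_a \; \ONE)\cdot a^* \cup \partial_a(a^*) = \varnothing \cup \{\ONE\cdot a^*\}$\\
 		$\partial_{aa}(a^*)$ & $=$ & $\bigcup_{r' \in \{\ONE\cdot a^*\}} \partial_{a} r' = \{\ONE\cdot a^*\}$
 	\end{tabular}
 \end{center}
 so the set of partial derivatives of $a^*$ stops growing after the first character.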
+Given an alphabet $\Sigma$, we denote the set of all possible strings
+from this alphabet as $\Sigma^*$.
 The set of all possible partial derivatives is defined
 as the union of derivatives w.r.t all the strings in the universe:
 \begin{center}
 	\begin{tabular}{lcl}
-		$\textit{PDER}_{UNIV} \; r $ & $\dn $ & $\bigcup_{w \in A^*}\partial_w \; r$
+		$\textit{PDER}_{\Sigma^*} \; r $ & $\dn $ & $\bigcup_{w \in \Sigma^*}\partial_w \; r$
 	\end{tabular}
 \end{center}
 \noindent
+Consider now again our pathological case where the derivatives
-Back to our
+grow even with rather aggressive simplification:
 \begin{center}
 	$((a^* + (aa)^* + \ldots + (\underbrace{a\ldots a}_{n a's})^* )^*)^*$
 \end{center}
-example, if we denote this regular expression as $A$,
+If we denote this regular expression as $r$,
 we have that
 \begin{center}
-$\textit{PDER}_{UNIV} \; A =
+$\textit{PDER}_{\Sigma^*} \; r =
 \bigcup_{i=1}^{n}\bigcup_{j=0}^{i-1} \{
 	(\underbrace{a \ldots a}_{\text{j a's}}\cdot
-(\underbrace{a \ldots a}_{\text{i a's}})^*)\cdot A \}$,
+(\underbrace{a \ldots a}_{\text{i a's}})^*)\cdot r \}$,
 \end{center}
 with exactly $n * (n + 1) / 2$ terms.
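 For instance, instantiating this formula with $n = 2$, so that
 $r = ((a^* + (aa)^*)^*)^*$, and reading the empty prefix ($j = 0$) as $\ONE$
 absorbed into the star (an identification assumed in this sketch), gives
 \[
 	\textit{PDER}_{\Sigma^*} \; r = \{\, a^* \cdot r,\;\; (aa)^*\cdot r,\;\; (a\cdot(aa)^*)\cdot r \,\},
 \]
 that is, $2 * (2+1)/2 = 3$ terms.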
 This is in line with our speculation that only $n*(n+1)/2$ terms are
 needed. We conjecture that $\bsimpStrong$ is also able to achieve this
 upper limit in general
 \begin{conjecture}\label{bsimpStrongInclusionPder}
 	Using a suitable transformation $f$, we have
 	\begin{center}
 		$\forall s.\; f \; (r \bdersStrong  \; s) \subseteq
-		 \textit{PDER}_{UNIV} \; r$
+		 \textit{PDER}_{\Sigma^*} \; r$
 	\end{center}
 \end{conjecture}
 \noindent
-because our \ref{cubicRule} will keep only one copy of each term,
+because the rule in figure \ref{fig:cubicRule} will keep only one copy of each term,
 where the function $\textit{prune}$ takes care of maintaining
 a set-like structure similar to partial derivatives.
-It is anticipated we might need to adjust $\textit{prune}$
+We might need to adjust $\textit{prune}$
 slightly to make sure all duplicate terms are eliminated,
 which should be doable.
 Antimirov proved that the sum of all the partial derivative
 terms' sizes is bounded by the cube of the size of that regular
 expression:
 \begin{property}\label{pderBound}
-	$\llbracket \textit{PDER}_{UNIV} \; r \rrbracket \leq O((\llbracket r \rrbracket)^3)$
+	$\llbracket \textit{PDER}_{\Sigma^*} \; r \rrbracket \leq O(\llbracket r \rrbracket^3)$
 \end{property}
-This property was formalised by Urban, and the details are in the PDERIVS.thy file
+This property was formalised by Wu et al. \cite{Wu2014}, and the
-in our repository.
+details can be found in the archive of formal proofs. \footnote{https://www.isa-afp.org/entries/Myhill-Nerode.html}
 Once conjecture \ref{bsimpStrongInclusionPder} is proven, then property \ref{pderBound}
-would yield us a cubic bound for our $\blexerStrong$ algorithm:
+would also yield a cubic bound for our $\blexerStrong$ algorithm:
 \begin{conjecture}\label{strongCubic}
 	$\llbracket r \bdersStrong\; s \rrbracket \leq \llbracket r \rrbracket^3$
 \end{conjecture}
+\noindent
+We leave this as future work.
 %To get all the "atomic" components of a regular expression's possible derivatives,
 %there is a procedure Antimirov called $\textit{lf}$, short for "linear forms", that takes
 %whatever character is available at the head of the string inside the language of a
 %		cs \; r \; [\Some \; ([c], n)]$\\
 %		$\ntset \; r\; 0 \; \_$ &  $\dn$ &  $\None$\\
 %		$\ntset \; r \; \_ \; [] $ & $ \dn$ & $[]$\\
 %	\end{tabular}
 %\end{center}

changeset 628	7af4e2420a8c
parent 625	b797c9a709d9
child 630	d50a309a0645