diff -r 94db2636a296 -r 7af4e2420a8c ChengsongTanPhdThesis/Chapters/Cubic.tex
--- a/ChengsongTanPhdThesis/Chapters/Cubic.tex	Tue Nov 22 12:50:04 2022 +0000
+++ b/ChengsongTanPhdThesis/Chapters/Cubic.tex	Sat Nov 26 16:18:10 2022 +0000
@@ -16,7 +16,7 @@
Nevertheless, we outline the ideas we intend to use for the proof.

We present further improvements
-made to our lexer algorithm $\blexersimp$.
+for our lexer algorithm $\blexersimp$.
We devise a stronger simplification algorithm,
called $\bsimpStrong$, which can prune away
similar components in two regular expressions at the same
@@ -24,6 +24,9 @@
even if these regular expressions are not exactly the same.
We call the lexer that uses this stronger simplification function
$\blexerStrong$.
+Unfortunately we did not have time to
+work out the proofs, as we did in
+the previous chapters.
We conjecture that both
\begin{center}
	$\blexerStrong \;r \; s = \blexer\; r\;s$
@@ -32,10 +35,11 @@
\begin{center}
	$\llbracket \bdersStrong{a}{s} \rrbracket = O(\llbracket a \rrbracket^3)$
\end{center}
-hold, but formalising
-them is still work in progress.
-We give reasons why the correctness and cubic size bound proofs
-can be achieved,
+hold, but a formalisation
+is still future work.
+We give an informal justification
+why the correctness and cubic size bound proofs
+can be achieved
by exploring the connection between the internal
data structure of our $\blexerStrong$ and
Antimirov's partial derivatives.\\
@@ -61,10 +65,11 @@
	aa \cdot a^*+ a \cdot a^* + aa\cdot a^*
\]
contains three terms,
-expressing three possibilities it will match future input.
+expressing three possibilities for how it can match some input.
The first and the third terms are identical, which means we can eliminate
-the latter as we know it will not be picked up by $\bmkeps$.
-In $\bsimps$, the $\distinctBy$ function takes care of this.
+the latter as it will not contribute to a POSIX value.
+In $\bsimps$, the $\distinctBy$ function takes care of
+such instances.
The criterion $\distinctBy$ uses for removing a duplicate
$a_2$ in the list
\begin{center}
@@ -74,24 +79,29 @@
\begin{center}
	$\rerase{a_1} = \rerase{a_2}$.
\end{center}
-It can be characterised as the $LD$
-rewrite rule in \ref{rrewriteRules}.\\
+It is characterised as the $LD$
+rewrite rule in figure \ref{rrewriteRules}.\\
The problem, however, is that identical components
-in two slightly different regular expressions cannot be removed:
-\begin{figure}[H]
-\[
+in two slightly different regular expressions cannot be removed.
+Consider the simplification
+\begin{equation}
+	\label{eqn:partialDedup}
	(a+b+d) \cdot r_1 + (a+c+e) \cdot r_1 \stackrel{?}{\rightsquigarrow} (a+b+d) \cdot r_1 + (c+e) \cdot r_1
-\]
-\caption{Desired simplification, but not done in $\blexersimp$}\label{partialDedup}
-\end{figure}
-\noindent
-A simplification like this actually
+\end{equation}
+where the term $(a+\ldots)\cdot r_1$ is deleted in the right alternative.
+This is permissible because the left alternative
+already contains $(a+\ldots)\cdot r_1$.
+The difficulty is that such alternatives ``buried''
+inside a sequence are not easily recognised.
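+\noindent
+To see why $\distinctBy$ cannot achieve the simplification in
+\eqref{eqn:partialDedup}, note that its criterion compares whole
+(erased) list elements: the two terms erase to
+\[
+	(a+b+d) \cdot r_1
+	\qquad\text{and}\qquad
+	(a+c+e) \cdot r_1,
+\]
+which are different regular expressions, so neither is removed, even
+though they share the component $a \cdot r_1$.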
+But a simplification like this
cannot be omitted,
-as without it the size could blow up even with our $\textit{bsimp}$
-function: for the chapter \ref{Finite} example
+as without it the size of derivatives can still
+blow up even with our $\textit{bsimp}$
+function:
+consider again the example
$\protect((a^* + (aa)^* + \ldots + (\underbrace{a\ldots a}_{n a's})^* )^*)^*$,
-by just setting n to a small number,
-we get exponential growth that does not stop before it becomes huge:
+where even a relatively small $n$
+leads to exponential growth:
\begin{figure}[H]
\centering
\begin{tikzpicture}
@@ -104,22 +114,20 @@
		\addplot[blue,mark=*, mark options={fill=white}] table {bsimpExponential.data};
	\end{axis}
\end{tikzpicture}
-\caption{Runtime of $\blexersimp$ for matching
+\caption{Size of derivatives of $\blexersimp$ for matching
	$\protect((a^* + (aa)^* + \ldots + (aaaaa)^* )^*)^*$ with strings
	of the form $\protect\underbrace{aa..a}_{n}$.}\label{blexerExp}
\end{figure}
\noindent
-We would like to apply the rewriting at some stage
-\begin{figure}[H]
+One possible approach would be to apply the rewriting
+rule
\[
	(a+b+d) \cdot r_1
	\longrightarrow
	a \cdot r_1 + b \cdot r_1 + d \cdot r_1
\]
-\caption{Desired simplification, but not done in $\blexersimp$}\label{desiredSimp}
-\end{figure}
\noindent
in our $\simp$ function,
so that it makes the simplification in \eqref{eqn:partialDedup} possible.
Translating the rule into our $\textit{bsimp}$ function simply
involves adding a new clause to the $\textit{bsimp}_{ASEQ}$ function:
\begin{center}
@@ -131,9 +139,7 @@
			&&$\quad\textit{case} \; (a_1', a_2') \Rightarrow   _{bs}a_1' \cdot a_2'$ \\
	\end{tabular}
\end{center}
-
-
-
+\noindent
Unfortunately, if we introduce this rule in our setting
we would lose the POSIX property of our calculated values.
@@ -141,10 +147,7 @@
\begin{center}
	$(a + ab)(bc + c)$
\end{center}
-and the string
-\begin{center}
-	$ab$,
-\end{center}
+and the string $ab$,
then our algorithm generates the following
correct POSIX value
\begin{center}
@@ -165,11 +168,11 @@
	$a\cdot(b c + c) + ab \cdot (bc + c)$,
\end{center}
which becomes a regular expression with a
-totally different structure--the original
+quite different structure--the original
was a sequence, and now it becomes an alternative.
-With an alternative the maximum munch rule no longer works.\\
+With an alternative the maximal munch rule no longer works.\\
A method to reconcile this is to do the
-transformation in \ref{desiredSimp} ``non-invasively'',
+transformation in \eqref{eqn:partialDedup} ``non-invasively'',
meaning that we traverse the list of regular expressions
\begin{center}
	$rs_a@[a]@rs_c$
@@ -181,11 +184,16 @@
using a function similar to $\distinctBy$,
but this time
we allow a more general list rewrite:
-\begin{mathpar}\label{cubicRule}
+\begin{figure}[H]
+\begin{mathpar}
	\inferrule * [Right = cubicRule]{\vspace{0mm} }{rs_a@[a]@rs_c
			\stackrel{s}{\rightsquigarrow }
		rs_a@[\textit{prune} \; a \; rs_a]@rs_c
	}
\end{mathpar}
+\caption{The rule capturing the pruning simplification needed to achieve
+a cubic bound}
+\label{fig:cubicRule}
+\end{figure}
%L \; a_1' = L \; a_1 \setminus (\cup_{a \in rs_a} L \; a)
where $\textit{prune} \;a \; acc$ traverses $a$
without altering the structure of $a$, removing components in $a$
@@ -205,45 +213,11 @@
	$[(r_a+r_b+r_c)r_d, (r_e+r_f)r_d]$.
\end{center}
We implemented
-function $\textit{prune}$ in Scala,
-and incorporated into our lexer,
-by replacing the $\simp$ function
-with a stronger version called $\bsimpStrong$
-that prunes regular expressions.
+function $\textit{prune}$ in Scala:
\begin{figure}[H]
-	\begin{lstlisting}
-		def atMostEmpty(r: Rexp) : Boolean = r match {
-			case ZERO => true
-			case ONE => true
-			case STAR(r) => atMostEmpty(r)
-			case SEQ(r1, r2) => atMostEmpty(r1) && atMostEmpty(r2)
-			case ALTS(r1, r2) => atMostEmpty(r1) && atMostEmpty(r2)
-			case CHAR(_) => false
-		}
-
-		def isOne(r: Rexp) : Boolean = r match {
-			case ONE => true
-			case SEQ(r1, r2) => isOne(r1) && isOne(r2)
-			case ALTS(r1, r2) => (isOne(r1) || isOne(r2)) && (atMostEmpty(r1) && atMostEmpty(r2))//rs.forall(atMostEmpty) && rs.exists(isOne)
-			case STAR(r0) => atMostEmpty(r0)
-			case CHAR(c) => false
-			case ZERO => false
-		}
-
-		//r = r' ~ tail' : If tail' matches tail => returns r'
-		def removeSeqTail(r: Rexp, tail: Rexp) : Rexp = r match {
-			case SEQ(r1, r2) =>
-				if(r2 == tail)
-					r1
-				else
-					ZERO
-			case r => ZERO
-		}
-
-	def prune(r: ARexp, acc: Set[Rexp]) : ARexp = r match{
-	  case AALTS(bs, rs) => rs.map(r => prune(r, acc)).filter(_ != ZERO) match
+\begin{lstlisting}
+	def prune(r: ARexp, acc: Set[Rexp]) : ARexp = r match{
+	  case AALTS(bs, rs) => rs.map(r => prune(r, acc)).filter(_ != AZERO) match
	  {
	    //all components have been removed, meaning this is effectively a duplicate
	    //flats will take care of removing this AZERO
@@ -265,94 +239,250 @@
	    case r => if(acc(erase(r))) AZERO else r
	  }
\end{lstlisting}
-\caption{pruning function together with its helper functions}
+\caption{The function $\textit{prune}$}
\end{figure}
\noindent
+The function $\textit{prune}$
+is a stronger version of $\textit{distinctBy}$.
+It does not just walk through a list looking for exact duplicates,
+but prunes sub-expressions recursively.
+It maintains the proper context for such comparisons
+with the helper functions
+$\textit{removeSeqTail}$, $\textit{isOne}$ and $\textit{atMostEmpty}$,
+shown below.
+Informally, $\textit{atMostEmpty}\;r$ holds exactly when $L(r) \subseteq \{[]\}$,
+$\textit{isOne}\;r$ holds when $L(r) = \{[]\}$, and
+$\textit{removeSeqTail}\;r\;t$ strips the tail $t$ from the sequence $r$,
+returning $\ZERO$ if $r$ is not a sequence ending in $t$.
\begin{figure}[H]
\begin{lstlisting}
-   def distinctWith(rs: List[ARexp],
-     pruneFunction: (ARexp, Set[Rexp]) => ARexp,
-     acc: Set[Rexp] = Set()) : List[ARexp] =
-       rs match{
-         case Nil => Nil
-         case r :: rs =>
-           if(acc(erase(r)))
-             distinctWith(rs, pruneFunction, acc)
-           else {
-             val pruned_r = pruneFunction(r, acc)
-             pruned_r ::
-               distinctWith(rs,
-                 pruneFunction,
-                 turnIntoTerms(erase(pruned_r)) ++: acc
-               )
-           }
-       }
+    //atMostEmpty r holds iff r matches at most the empty string
+    def atMostEmpty(r: Rexp) : Boolean = r match {
+      case ZERO => true
+      case ONE => true
+      case STAR(r) => atMostEmpty(r)
+      case SEQ(r1, r2) => atMostEmpty(r1) && atMostEmpty(r2)
+      case ALTS(r1, r2) => atMostEmpty(r1) && atMostEmpty(r2)
+      case CHAR(_) => false
+    }
+
+    //isOne r holds iff r matches exactly the empty string
+    def isOne(r: Rexp) : Boolean = r match {
+      case ONE => true
+      case SEQ(r1, r2) => isOne(r1) && isOne(r2)
+      case ALTS(r1, r2) => (isOne(r1) || isOne(r2)) && (atMostEmpty(r1) && atMostEmpty(r2))
+      case STAR(r0) => atMostEmpty(r0)
+      case CHAR(c) => false
+      case ZERO => false
+    }
+
+    //removeSeqTail r tail returns r' if r = r' ~ tail (or ONE if
+    //r == tail), and ZERO otherwise
+    def removeSeqTail(r: Rexp, tail: Rexp) : Rexp =
+      if (r == tail)
+        ONE
+      else {
+        r match {
+          case SEQ(r1, r2) =>
+            if(r2 == tail)
+              r1
+            else
+              ZERO
+          case r => ZERO
+        }
+      }
+
\end{lstlisting}
+\caption{The helper functions of $\textit{prune}$}
\end{figure}
\noindent
+Suppose we feed
+\begin{center}
+	$r= (\underline{\ONE}+(\underline{f}+b)\cdot g)\cdot (a\cdot(d\cdot e))$
+\end{center}
+and
+\begin{center}
+	$acc = \{a\cdot(d\cdot e),f\cdot (g \cdot (a \cdot (d \cdot e))) \}$
+\end{center}
+as input for $\textit{prune}$.
+The end result will be
+\[
+	b\cdot(g\cdot(a\cdot(d\cdot e)))
+\]
+where the underlined components in $r$ are eliminated.
+Looking more closely at the topmost call
+\[
+	\textit{prune} \quad (\ONE+
+	(f+b)\cdot g)\cdot (a\cdot(d\cdot e)) \quad
+	\{a\cdot(d\cdot e),f\cdot (g \cdot (a \cdot (d \cdot e))) \},
+\]
+the sequence clause applies,
+in which a sub-call
+\[
+	\textit{prune} \;\; (\ONE+(f+b)\cdot g)\;\; \{\ONE, f\cdot g \}
+\]
+is made. The terms in the new accumulator $\{\ONE,\; f\cdot g \}$ come from
+the two calls to $\textit{removeSeqTail}$:
+\[
+	\textit{removeSeqTail} \;\; (a \cdot(d\cdot e)) \;\; (a \cdot(d\cdot e))
+\]
+and
+\[
+	\textit{removeSeqTail} \;\;
+	(f\cdot(g\cdot (a \cdot(d\cdot e)))) \;\; (a \cdot(d\cdot e)).
+\]
+The idea behind $\textit{removeSeqTail}$ is that
+when pruning recursively, we need to ``zoom in''
+to sub-expressions, and this ``zoom in'' needs to be performed
+on the
+accumulators as well, otherwise we would be comparing
+apples with oranges.
+The sub-call
+$\textit{prune} \;\; (\ONE+(f+b)\cdot g)\;\; \{\ONE, f\cdot g \}$
+is simpler and triggers the alternative clause, which prunes
+each element of $(\ONE+(f+b)\cdot g)$ against the accumulator,
+leaving us with only $b\cdot g$.
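+\noindent
+The sequence clause of $\textit{prune}$ is not shown in the listing
+above. The following sketch (our reconstruction for illustration,
+not necessarily the exact clause in our implementation) shows how the
+``zoom in'' just described can be realised with $\textit{removeSeqTail}$:
+\begin{figure}[H]
+\begin{lstlisting}
+    //A sketch of a possible sequence clause for prune.  Before the
+    //left component r1 is pruned, every term in the accumulator is
+    //stripped of the common tail r2, so that like is compared with
+    //like in the recursive call.
+    case ASEQ(bs, r1, r2) =>
+      prune(r1, acc.map(r => removeSeqTail(r, erase(r2)))) match {
+        case AZERO => AZERO
+        case AONE(bs1) => fuse(bs ++ bs1, r2)
+        case r1p => ASEQ(bs, r1p, r2)
+      }
+\end{lstlisting}
+\caption{A sketch of the sequence clause of $\textit{prune}$}
+\end{figure}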
+Our new lexer with stronger simplification
+uses $\textit{prune}$ by making it
+the core component of the deduplicating function
+called $\textit{distinctWith}$.
+$\textit{DistinctWith}$ ensures that all verbose
+parts of a regular expression are pruned away.
+
+\begin{figure}[H]
+\begin{lstlisting}
+    //decomposes a regular expression into a list of its terms;
+    //furtherSEQ (omitted here) builds the terms of the sequence r11 ~ r2
+    def turnIntoTerms(r: Rexp): List[Rexp] = r match {
+      case SEQ(r1, r2)  =>
+        turnIntoTerms(r1).flatMap(r11 => furtherSEQ(r11, r2))
+      case ALTS(r1, r2) => turnIntoTerms(r1) ::: turnIntoTerms(r2)
+      case ZERO => Nil
+      case _ => r :: Nil
+    }
+
+    def distinctWith(rs: List[ARexp],
+      pruneFunction: (ARexp, Set[Rexp]) => ARexp,
+      acc: Set[Rexp] = Set()) : List[ARexp] =
+        rs match{
+          case Nil => Nil
+          case r :: rs =>
+            if(acc(erase(r)))
+              distinctWith(rs, pruneFunction, acc)
+            else {
+              val pruned_r = pruneFunction(r, acc)
+              pruned_r ::
+                distinctWith(rs,
+                  pruneFunction,
+                  turnIntoTerms(erase(pruned_r)) ++: acc
+                )
+            }
+        }
+
\end{lstlisting}
\caption{A Stronger Version of $\textit{distinctBy}$}
\end{figure}
\noindent
-The function $\textit{prune}$ is used in $\distinctWith$.
-$\distinctWith$ is a stronger version of $\distinctBy$
-which not only removes duplicates as $\distinctBy$ would
-do, but also uses the $\textit{pruneFunction}$
-argument to prune away verbose components in a regular expression.\\
+Once a regular expression has been pruned,
+all its components are added to the accumulator,
+so that duplicate components of regular expressions
+further down the list can be removed.
+
+The function $\textit{bsimpStrong}$
+is very much the same as $\textit{bsimp}$, just with
+$\textit{distinctBy}$ replaced
+by $\textit{distinctWith}$.
\begin{figure}[H]
\begin{lstlisting}
-   //a stronger version of simp
+
   def bsimpStrong(r: ARexp): ARexp =
   {
     r match {
       case ASEQ(bs1, r1, r2) => (bsimpStrong(r1), bsimpStrong(r2)) match {
-         //normal clauses same as simp
         case (AZERO, _) => AZERO
         case (_, AZERO) => AZERO
         case (AONE(bs2), r2s) => fuse(bs1 ++ bs2, r2s)
-         //bs2 can be discarded
         case (r1s, AONE(bs2)) => fuse(bs1, r1s) //assert bs2 == Nil
         case (r1s, r2s) => ASEQ(bs1, r1s, r2s)
       }
       case AALTS(bs1, rs) => {
-         //distinctBy(flat_res, erase)
         distinctWith(flats(rs.map(bsimpStrong(_))), prune) match {
           case Nil => AZERO
           case s :: Nil => fuse(bs1, s)
           case rs => AALTS(bs1, rs)
         }
       }
-       //stars that can be treated as 1
       case ASTAR(bs, r0) if(atMostEmpty(erase(r0))) => AONE(bs)
       case r => r
     }
   }
-\end{lstlisting}
-\caption{The function $\bsimpStrong$ and $\bdersStrongs$}
-\end{figure}
-\noindent
-$\distinctWith$, is in turn used in $\bsimpStrong$:
-\begin{figure}[H]
-\begin{lstlisting}
-   //Conjecture: [| bdersStrong(s, r) |] = O([| r |]^3)
-   def bdersStrong(s: List[Char], r: ARexp) : ARexp = s match {
+  def bdersStrong(s: List[Char], r: ARexp) : ARexp = s match {
     case Nil => r
     case c::s => bdersStrong(s, bsimpStrong(bder(c, r)))
   }
+
\end{lstlisting}
-\caption{The function $\bsimpStrong$ and $\bdersStrongs$}
+\caption{The function
+$\textit{bsimpStrong}$: a stronger version of $\textit{bsimp}$}
\end{figure}
\noindent
+The benefit of using
+$\textit{prune}$, namely refining the finiteness bound
+to a cubic bound, has not been formalised yet.
+Therefore we choose to use Scala code rather than an Isabelle-style formal
+definition like we did for $\simp$, as the definitions might change
+to suit our proof needs.
+In the rest of the chapter we will use this convention consistently.
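+\noindent
+To see $\textit{distinctWith}$ in action, consider the following
+hypothetical invocation, which replays the rewrite
+\eqref{eqn:partialDedup} (a sketch only; $\textit{internalise}$ is the
+function from earlier chapters and $r_1$ is instantiated arbitrarily):
+\begin{figure}[H]
+\begin{lstlisting}
+    //distinctBy would keep both terms below, because (a+b+d)~r1 and
+    //(a+c+e)~r1 erase to different regular expressions; distinctWith
+    //prunes the second term down to (c+e)~r1.
+    val r1 = STAR(CHAR('f'))
+    val rs = List(
+      internalise(SEQ(ALTS(ALTS(CHAR('a'), CHAR('b')), CHAR('d')), r1)),
+      internalise(SEQ(ALTS(ALTS(CHAR('a'), CHAR('c')), CHAR('e')), r1)))
+    distinctWith(rs, prune)
+    //expected result, after erasing the bitcodes:
+    //  List((a+b+d)~r1, (c+e)~r1)
+\end{lstlisting}
+\caption{A hypothetical invocation of $\textit{distinctWith}$}
+\end{figure}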
We conjecture that the above Scala function $\bdersStrongs$,
written $\bdersStrong{\_}{\_}$ as an infix notation,
satisfies the following property:
\begin{conjecture}
	$\llbracket \bdersStrong{a}{s} \rrbracket = O(\llbracket a \rrbracket^3)$
\end{conjecture}
+\noindent
The stronger version of $\blexersimp$'s
code in Scala looks as follows:
\begin{figure}[H]
@@ -376,8 +506,9 @@
\end{figure}
\noindent
We call this lexer $\blexerStrong$.
-$\blexerStrong$ is able to drastically reduce the
-internal data structure size which could
+This version is able to drastically reduce the
+internal data structure size which
+otherwise could
trigger exponential behaviours in
$\blexersimp$.
\begin{figure}[H]
@@ -424,77 +555,88 @@
The idea is to maintain key lemmas in
chapter \ref{Bitcoded2} like
$r \stackrel{*}{\rightsquigarrow} \textit{bsimp} \; r$
-with the new rewriting rule \ref{cubicRule} .
+with the new rewriting rule
+shown in figure \ref{fig:cubicRule}.
In the next sub-section,
we will describe why we
-believe a cubic bound can be achieved.
-We give an introduction to the
+believe a cubic size bound can be achieved with
+the stronger simplification.
+For this we give a short introduction to
partial derivatives,
-which was invented by Antimirov \cite{Antimirov95},
-and then link it with the result of the function
+which were invented by Antimirov \cite{Antimirov95},
+and then link them with the result of the function
$\bdersStrongs$.

\subsection{Antimirov's partial derivatives}
Partial derivatives were first introduced by
Antimirov \cite{Antimirov95}.
-It does derivatives in a similar way as suggested by Brzozowski,
+Partial derivatives are very similar
+to Brzozowski derivatives,
but split the children of alternative regular expressions into
multiple independent terms.
This means that the output of
+a partial derivative becomes a
+set of regular expressions:
\begin{center}
	\begin{tabular}{lcl}
+		$\partial_x \; (r_1 \cdot r_2)$ &
+		$\dn$	& 	$(\partial_x \; r_1) \cdot r_2 \cup
+		\partial_x \; r_2 \; \textit{if} \; \; \nullable\; r_1$\\
+			      & & $(\partial_x \; r_1)\cdot r_2 \quad\quad
+			      \textit{otherwise}$\\
+		$\partial_x \; r^*$ & $\dn$ & $(\partial_x \; r) \cdot r^*$\\
		$\partial_x \; c $ & $\dn$ & $\textit{if} \; x = c \;
		\textit{then} \;
		\{ \ONE\} \;\;\textit{else} \; \varnothing$\\
+		$\partial_x(r_1+r_2)$ & $\dn$ & $\partial_x(r_1) \cup \partial_x(r_2)$\\
		$\partial_x(\ONE)$ & $\dn$ & $\varnothing$\\
		$\partial_x(\ZERO)$ & $\dn$ & $\varnothing$\\
	\end{tabular}
\end{center}
\noindent
The $\cdot$ in, for example,
+$(\partial_x \; r_1) \cdot r_2 $
is a shorthand notation for the Cartesian product
+$(\partial_x \; r_1) \times \{ r_2\}$.
%Each element in the set generated by a partial derivative
%corresponds to a (potentially partial) match
Rather than joining the calculated derivatives $\partial_x r_1$ and $\partial_x r_2$ together
using the $\sum$ constructor, Antimirov put them into
-a set.  This causes maximum de-duplication to happen,
-allowing us to understand what are the "atomic" components of it.
+a set. This means that many subterms are de-duplicated
+automatically, since sets do not contain duplicates.
+For example, to compute the partial derivative of $x^*(xx + y)^*$
+w.r.t.\ $x$, one takes the union of the two singleton sets
+$\{x^* \cdot (xx + y)^*\}$ and $\{x \cdot (xx + y) ^* \}$
from $\partial_x(x^*) \cdot (xx + y) ^*$ and $\partial_x((xx + y)^*)$.
+The partial derivative w.r.t.\ a string is defined recursively:
+\[
+	\partial_{[]} \; r \dn \{r\}
+	\qquad
+	\partial_{c::cs} \; r \dn \bigcup_{r'\in (\partial_c \; r)}
+	\partial_{cs} \; r'
+\]
+Given an alphabet $\Sigma$, we denote the set of all possible strings
+from this alphabet as $\Sigma^*$.
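+\noindent
+For instance, for the regular expression $x \cdot y$ this definition
+unfolds as
+\[
+	\partial_{xy} \; (x \cdot y)
+	\;=\; \bigcup_{r' \in \partial_x (x \cdot y)} \partial_y \; r'
+	\;=\; \partial_y \; (\ONE \cdot y)
+	\;=\; \{\ONE\},
+\]
+using $\partial_x \; (x \cdot y) = (\partial_x \; x) \cdot y = \{\ONE \cdot y\}$
+in the first step, and the fact that $\ONE$ is nullable in the last.
+The definitions above can also be transcribed almost literally into
+Scala.  The following sketch is for comparison only and is not part of
+our lexer; $\textit{nullable}$ and the $\textit{Rexp}$ constructors are
+the ones from earlier chapters:
+\begin{figure}[H]
+\begin{lstlisting}
+    //partial derivative w.r.t. a character: a direct transcription
+    //of Antimirov's definition
+    def pder(x: Char, r: Rexp) : Set[Rexp] = r match {
+      case ZERO => Set()
+      case ONE => Set()
+      case CHAR(c) => if (x == c) Set(ONE) else Set()
+      case ALTS(r1, r2) => pder(x, r1) ++ pder(x, r2)
+      case SEQ(r1, r2) =>
+        if (nullable(r1)) pder(x, r1).map(SEQ(_, r2)) ++ pder(x, r2)
+        else pder(x, r1).map(SEQ(_, r2))
+      case STAR(r0) => pder(x, r0).map(SEQ(_, STAR(r0)))
+    }
+
+    //partial derivative w.r.t. a string
+    def pders(s: List[Char], r: Rexp) : Set[Rexp] = s match {
+      case Nil => Set(r)
+      case c::cs => pder(c, r).flatMap(pders(cs, _))
+    }
+\end{lstlisting}
+\caption{A sketch of Antimirov's partial derivatives in Scala}
+\end{figure}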
The set of all possible partial derivatives is defined
as the union of derivatives w.r.t.\ all the strings in the universe:
\begin{center}
	\begin{tabular}{lcl}
-		$\textit{PDER}_{UNIV} \; r $ & $\dn $ & $\bigcup_{w \in A^*}\partial_w \; r$
+		$\textit{PDER}_{\Sigma^*} \; r $ & $\dn $ & $\bigcup_{w \in \Sigma^*}\partial_w \; r$
	\end{tabular}
\end{center}
\noindent
-
-Back to our
+Consider now again our pathological example
+where the derivatives can grow exponentially
+even with aggressive simplification:
\begin{center}
	$((a^* + (aa)^* + \ldots + (\underbrace{a\ldots a}_{n a's})^* )^*)^*$
\end{center}
-example, if we denote this regular expression as $A$,
+If we denote this regular expression as $r$,
we have that
\begin{center}
-$\textit{PDER}_{UNIV} \; A =
+$\textit{PDER}_{\Sigma^*} \; r =
\bigcup_{i=1}^{n}\bigcup_{j=0}^{i-1} \{
	(\underbrace{a \ldots a}_{\text{j a's}}\cdot
-(\underbrace{a \ldots a}_{\text{i a's}})^*)\cdot A \}$,
+(\underbrace{a \ldots a}_{\text{i a's}})^*)\cdot r \}$,
\end{center}
with exactly $n * (n + 1) / 2$ terms.
This is in line with our speculation that only $n*(n+1)/2$ terms are
@@ -504,14 +646,14 @@
	Using a suitable transformation $f$, we have
	\begin{center}
		$\forall s.\; f \; (r \bdersStrong \; s) \subseteq
-		 \textit{PDER}_{UNIV} \; r$
+		 \textit{PDER}_{\Sigma^*} \; r$
	\end{center}
\end{conjecture}
\noindent
This is because our rule in figure \ref{fig:cubicRule} keeps only one copy of each term,
where the function $\textit{prune}$ takes care of maintaining
a set-like structure similar to partial derivatives.
-It is anticipated we might need to adjust $\textit{prune}$
+We might need to adjust $\textit{prune}$
slightly to make sure all duplicate terms are eliminated,
which should be doable.

@@ -519,15 +661,17 @@
terms' sizes is bounded by the cube of the size of that regular
expression:
\begin{property}\label{pderBound}
-	$\llbracket \textit{PDER}_{UNIV} \; r \rrbracket \leq O((\llbracket r \rrbracket)^3)$
+	$\llbracket \textit{PDER}_{\Sigma^*} \; r \rrbracket \leq O(\llbracket r \rrbracket^3)$
\end{property}
-This property was formalised by Urban, and the details are in the PDERIVS.thy file
-in our repository.
+This property was formalised by Wu et al. \cite{Wu2014}, and the
+details can be found in the Archive of Formal Proofs.%
\footnote{https://www.isa-afp.org/entries/Myhill-Nerode.html}
Once conjecture \ref{bsimpStrongInclusionPder} is proven, property \ref{pderBound}
-would yield us a cubic bound for our $\blexerStrong$ algorithm:
+would also yield a cubic bound for our $\blexerStrong$ algorithm:
\begin{conjecture}\label{strongCubic}
	$\llbracket r \bdersStrong\; s \rrbracket \leq O(\llbracket r \rrbracket^3)$
\end{conjecture}
+\noindent
+We leave this as future work.

%To get all the "atomic" components of a regular expression's possible derivatives,
@@ -748,3 +892,4 @@
%		$\ntset \; r \; \_ \; [] $ & $ \dn$ & $[]$\\
%	\end{tabular}
%\end{center}
+