--- a/ChengsongTanPhdThesis/Chapters/Finite.tex Wed Aug 31 23:57:42 2022 +0100
+++ b/ChengsongTanPhdThesis/Chapters/Finite.tex Thu Sep 01 23:47:37 2022 +0100
@@ -2007,108 +2007,10 @@
\multicolumn{3}{c}{Graphs: size change of 3 randomly generated regular expressions $w.r.t.$ input string length.}
\end{tabular}
\end{center}
-
-
-
-
-
\noindent
Most of the regex's sizes seem to stay within a polynomial bound $w.r.t$ the
original size.
-This suggests a link towrads "partial derivatives"
- introduced by Antimirov \cite{Antimirov95}.
-
- \section{Antimirov's partial derivatives}
- The idea behind Antimirov's partial derivatives
-is to do derivatives in a similar way as suggested by Brzozowski,
-but maintain a set of regular expressions instead of a single one:
-
-%TODO: antimirov proposition 3.1, needs completion
- \begin{center}
- \begin{tabular}{ccc}
- $\partial_x(a+b)$ & $=$ & $\partial_x(a) \cup \partial_x(b)$\\
-$\partial_x(\ONE)$ & $=$ & $\phi$
-\end{tabular}
-\end{center}
-
-Rather than joining the calculated derivatives $\partial_x a$ and $\partial_x b$ together
-using the alternatives constructor, Antimirov cleverly chose to put them into
-a set instead. This breaks the terms in a derivative regular expression up,
-allowing us to understand what are the "atomic" components of it.
-For example, To compute what regular expression $x^*(xx + y)^*$'s
-derivative against $x$ is made of, one can do a partial derivative
-of it and get two singleton sets $\{x^* \cdot (xx + y)^*\}$ and $\{x \cdot (xx + y) ^* \}$
-from $\partial_x(x^*) \cdot (xx + y) ^*$ and $\partial_x((xx + y)^*)$.
-To get all the "atomic" components of a regular expression's possible derivatives,
-there is a procedure Antimirov called $\textit{lf}$, short for "linear forms", that takes
-whatever character is available at the head of the string inside the language of a
-regular expression, and gives back the character and the derivative regular expression
-as a pair (which he called "monomial"):
- \begin{center}
- \begin{tabular}{ccc}
- $\lf(\ONE)$ & $=$ & $\phi$\\
-$\lf(c)$ & $=$ & $\{(c, \ONE) \}$\\
- $\lf(a+b)$ & $=$ & $\lf(a) \cup \lf(b)$\\
- $\lf(r^*)$ & $=$ & $\lf(r) \bigodot \lf(r^*)$\\
-\end{tabular}
-\end{center}
-%TODO: completion
-
-There is a slight difference in the last three clauses compared
-with $\partial$: instead of a dot operator $ \textit{rset} \cdot r$ that attaches the regular
-expression $r$ with every element inside $\textit{rset}$ to create a set of
-sequence derivatives, it uses the "circle dot" operator $\bigodot$ which operates
-on a set of monomials (which Antimirov called "linear form") and a regular
-expression, and returns a linear form:
- \begin{center}
- \begin{tabular}{ccc}
- $l \bigodot (\ZERO)$ & $=$ & $\phi$\\
- $l \bigodot (\ONE)$ & $=$ & $l$\\
- $\phi \bigodot t$ & $=$ & $\phi$\\
- $\{ (x, \ZERO) \} \bigodot t$ & $=$ & $\{(x,\ZERO) \}$\\
- $\{ (x, \ONE) \} \bigodot t$ & $=$ & $\{(x,t) \}$\\
- $\{ (x, p) \} \bigodot t$ & $=$ & $\{(x,p\cdot t) \}$\\
- $\lf(a+b)$ & $=$ & $\lf(a) \cup \lf(b)$\\
- $\lf(r^*)$ & $=$ & $\lf(r) \cdot \lf(r^*)$\\
-\end{tabular}
-\end{center}
-%TODO: completion
-
- Some degree of simplification is applied when doing $\bigodot$, for example,
- $l \bigodot (\ZERO) = \phi$ corresponds to $r \cdot \ZERO \rightsquigarrow \ZERO$,
- and $l \bigodot (\ONE) = l$ to $l \cdot \ONE \rightsquigarrow l$, and
- $\{ (x, \ZERO) \} \bigodot t = \{(x,\ZERO) \}$ to $\ZERO \cdot x \rightsquigarrow \ZERO$,
- and so on.
-
- With the function $\lf$ one can compute all possible partial derivatives $\partial_{UNIV}(r)$ of a regex $r$ with
- an iterative procedure:
- \begin{center}
- \begin{tabular}{llll}
-$\textit{while}$ & $(\Delta_i \neq \phi)$ & & \\
- & $\Delta_{i+1}$ & $ =$ & $\lf(\Delta_i) - \PD_i$ \\
- & $\PD_{i+1}$ & $ =$ & $\Delta_{i+1} \cup \PD_i$ \\
-$\partial_{UNIV}(r)$ & $=$ & $\PD$ &
-\end{tabular}
-\end{center}
-
-
- $(r_1 + r_2) \cdot r_3 \longrightarrow (r_1 \cdot r_3) + (r_2 \cdot r_3)$,
-
-
-However, if we introduce them in our
-setting we would lose the POSIX property of our calculated values.
-A simple example for this would be the regex $(a + a\cdot b)\cdot(b\cdot c + c)$.
-If we split this regex up into $a\cdot(b\cdot c + c) + a\cdot b \cdot (b\cdot c + c)$, the lexer
-would give back $\Left(\Seq(\Char(a), \Left(\Char(b \cdot c))))$ instead of
-what we want: $\Seq(\Right(ab), \Right(c))$. Unless we can store the structural information
-in all the places where a transformation of the form $(r_1 + r_2)\cdot r \rightarrow r_1 \cdot r + r_2 \cdot r$
-occurs, and apply them in the right order once we get
-a result of the "aggressively simplified" regex, it would be impossible to still get a $\POSIX$ value.
-This is unlike the simplification we had before, where the rewriting rules
-such as $\ONE \cdot r \rightsquigarrow r$, under which our lexer will give the same value.
-We will discuss better
-bounds in the last section of this chapter.\\[-6.5mm]
-
+We will discuss improvements to this bound in the next chapter.