\documentclass[12pt]{article}
\usepackage{../style}
\usepackage{../langs}
\usepackage{graphicx}
\newtheorem{thm}{Theorem}
\newtheorem{lem}[thm]{Lemma}
\newtheorem{cor}[thm]{Corollary}
\newenvironment{proof}{\paragraph{Proof:}\it}{\hfill$\square$}
\begin{document}
\section*{Antimirov's Proof about Pders}
These are some rough notes about Antimirov's result establishing a
bound on the number of regular expressions in a partial
derivative. From this bound one can easily construct an NFA for a
regular expression, but one can also derive a bound on the size of the
partial derivatives. This is what we do first. We start with the
following definitions:
\begin{itemize}
\item $pder\,c\,r$ --- the partial derivative according to a character; this can be defined
inductively as follows (a small executable sketch of these definitions is given after this list):
\begin{center}
\begin{tabular}{l@ {\hspace{2mm}}c@ {\hspace{2mm}}l}
$\textit{pder}\, c\, (\ZERO)$ & $\dn$ & $\emptyset$\\
$\textit{pder}\, c\, (\ONE)$ & $\dn$ & $\emptyset$ \\
$\textit{pder}\, c\, (d)$ & $\dn$ & if $c = d$ then $\{\ONE\}$ else $\emptyset$\\
$\textit{pder}\, c\, (r_1 + r_2)$ & $\dn$ & $\textit{pder}\, c\, r_1 \;\cup\; \textit{pder}\, c\, r_2$\\
$\textit{pder}\, c\, (r_1 \cdot r_2)$ & $\dn$ & if $\textit{nullable} (r_1)$\\
& & then $\prod\,(\textit{pder}\,c\,r_1)\,r_2 \;\cup\; \textit{pder}\, c\, r_2$\\
& & else $\prod\,(\textit{pder}\, c\, r_1)\,r_2$\\
$\textit{pder}\, c\, (r^*)$ & $\dn$ & $\prod\,(\textit{pder}\,c\,r)\, (r^*)$
\end{tabular}
\end{center}
\item $pder^+\,c\,rs$ --- partial derivatives according to a character for a set of regular expressions $rs$
\item $pders\,s\,r$ --- partial derivatives of a regular expression
according to a string
\item $Pders\,A\,r \dn \bigcup_{s\in A}. pders\,s\,r$ --- partial
derivatives according to a language (a set of strings)
\item $|rs|$ is the number of elements in a set of regular expressions
$rs$ (also known as the cardinality of this set)
\item $\prod\,rs\;r \dn \{r_1 \cdot r \;|\; r_1 \in rs\}$ --- this is
a convenient notation for writing a set of sequence regular
expressions. It essentially ``appends'' the regular expression $r$ to all
regular expressions in the set $rs$. As a result one can write
the sequence case for partial derivatives (see above) more conveniently
as
\[
pder\,c\,(r_1\cdot r_2) \dn
\begin{cases}
\prod\,(pder\,c\,r_1)\,r_2\;\cup\;pder\,c\,r_2 &
\!\!\textit{provided}\,r_1\, \textit{is nullable}\\
\prod\,(pder\,c\,r_1)\,r_2 & \!\!\textit{otherwise}\\
\end{cases}
\]
\item $\textit{PSuf}\,s$ is the set of all non-empty suffixes of $s$, defined as
\[
\textit{PSuf}\, s \dn \{v.\;v \not= [] \wedge \exists u. u \,@\, v = s \}
\]
So for the string $abc$ the non-empty suffixes are $c$, $bc$ and
$abc$. Also we have that
$\textit{PSuf}\,(s\,@\,[c]) = ((\textit{PSuf}\,s)\,@@\,[c]) \cup
\{[c]\}$. Here $@@$ means concatenating $[c]$ to the end of
all strings in $\textit{PSuf}\,s$; in this equation we also
need to add $\{[c]\}$ in order to make the equation hold.
\end{itemize}
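\noindent
To make these definitions concrete, here is a small executable sketch
in Scala (the datatype and the names \texttt{Rexp}, \texttt{Prod},
\texttt{pder} and \texttt{pders} are my own choices for these notes and
are not part of Antimirov's paper):
\begin{verbatim}
// regular expressions as a datatype
abstract class Rexp
case object ZERO extends Rexp
case object ONE extends Rexp
case class CHAR(c: Char) extends Rexp
case class ALT(r1: Rexp, r2: Rexp) extends Rexp
case class SEQ(r1: Rexp, r2: Rexp) extends Rexp
case class STAR(r: Rexp) extends Rexp

// nullable(r): can r match the empty string?
def nullable(r: Rexp): Boolean = r match {
  case ZERO        => false
  case ONE         => true
  case CHAR(_)     => false
  case ALT(r1, r2) => nullable(r1) || nullable(r2)
  case SEQ(r1, r2) => nullable(r1) && nullable(r2)
  case STAR(_)     => true
}

// Prod(rs, r): "appends" r to every regular expression in rs
def Prod(rs: Set[Rexp], r: Rexp): Set[Rexp] = rs.map(SEQ(_, r))

// pder: partial derivative w.r.t. a single character
def pder(c: Char, r: Rexp): Set[Rexp] = r match {
  case ZERO        => Set()
  case ONE         => Set()
  case CHAR(d)     => if (c == d) Set(ONE) else Set()
  case ALT(r1, r2) => pder(c, r1) ++ pder(c, r2)
  case SEQ(r1, r2) =>
    if (nullable(r1)) Prod(pder(c, r1), r2) ++ pder(c, r2)
    else Prod(pder(c, r1), r2)
  case STAR(r1)    => Prod(pder(c, r1), STAR(r1))
}

// pders: partial derivatives w.r.t. a string (list of characters)
def pders(s: List[Char], r: Rexp): Set[Rexp] = s match {
  case Nil     => Set(r)
  case c :: cs => pder(c, r).flatMap(pders(cs, _))
}
\end{verbatim}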
\noindent
To state Antimirov's result we need the notion of the
\emph{alphabetic width} of a regular expression, defined as follows:
\begin{center}
\begin{tabular}{lcl}
$awidth(\ZERO)$ & $\dn$ & $0$\\
$awidth(\ONE)$ & $\dn$ & $0$\\
$awidth(c)$ & $\dn$ & $1$\\
$awidth(r_1 + r_2)$ & $\dn$ & $awidth(r_1) + awidth(r_2)$\\
$awidth(r_1 \cdot r_2)$ & $\dn$ & $awidth(r_1) + awidth(r_2)$\\
$awidth(r^*)$ & $\dn$ & $awidth(r)$\\
\end{tabular}
\end{center}
\noindent
This function counts how many character occurrences there are in a regular expression.
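\noindent
For reference, here is the same function as a small Scala sketch, reusing the
\texttt{Rexp} datatype from the earlier sketch:
\begin{verbatim}
// awidth(r): the number of character occurrences in r
def awidth(r: Rexp): Int = r match {
  case ZERO        => 0
  case ONE         => 0
  case CHAR(_)     => 1
  case ALT(r1, r2) => awidth(r1) + awidth(r2)
  case SEQ(r1, r2) => awidth(r1) + awidth(r2)
  case STAR(r1)    => awidth(r1)
}
\end{verbatim}
\noindent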
Antimirov's result states
\begin{thm}\label{one}
$\forall\,A\,r\,.\;\;|Pders\;A\;r| \leq awidth(r) + 1$
\end{thm}
\noindent
Note this theorem holds for any set of strings $A$, for example
for the set of all strings, which I will write as $\textit{UNIV}$, and also
for the set $\{s\}$ containing only a single string $s$. Therefore a
simple corollary is
\begin{cor}
$\forall\,s\,r\,.\;\;|pders\;s\;r| \leq awidth(r) + 1$
\end{cor}
\noindent
This property says that for every string $s$, the number of regular
expressions in the partial derivative $pders\,s\,r$ can never be bigger than
$awidth(r) + 1$. Interestingly, we do not prove Thm~\ref{one} for all
sets of strings $A$ directly, but rather only for one particular set of
strings, which I call $UNIV_1$. It contains all strings except the empty string
(remember $UNIV$ contains all strings).\bigskip
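\noindent
To get a feel for the bound, consider as a quick example the regular
expression $a\cdot b$ with $awidth(a\cdot b) = 2$. One can check that
$pders\,[]\,(a\cdot b) = \{a\cdot b\}$,
$pders\,[a]\,(a\cdot b) = \{\ONE\cdot b\}$,
$pders\,[a,b]\,(a\cdot b) = \{\ONE\}$, and that every other string yields
the empty set. Hence $Pders\,UNIV\,(a\cdot b) = \{a\cdot b,\,\ONE\cdot b,\,\ONE\}$,
which contains exactly $awidth(a\cdot b) + 1 = 3$ regular expressions, so the
bound of Thm~\ref{one} is attained in this case.\bigskip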
\noindent
Let's try to give below a comprehensible account of Antimirov's proof
of Thm~\ref{one}---I distinctly remember that Antimirov's paper is
great, but pretty incomprehensible for the first 20+ times one reads
it. The proof starts with the following much weaker property
about the size being finite:
\begin{lem}
$\forall\,A\,r\,.\;\;(Pders\;A\;r)$ is a finite set.
\end{lem}
\noindent This lemma is needed because some reasoning steps in
Thm~\ref{one} only work for finite sets, not infinite sets. But let us
skip over the proof of this property for now and assume we
already know that the partial derivatives are always finite sets (this,
for example, does not hold for unsimplified Brzozowski derivatives, which
can be infinite for some sets of strings).
There are some central lemmas about partial derivatives for $\cdot$ and $\_^*$.
One is the following
\begin{lem}\label{central}\mbox{}\\
\[Pders\,UNIV_1\,(r_1\cdot r_2) \subseteq (\prod (Pders\,UNIV_1\, r_1)\,r_2) \;
\cup \; Pders\,UNIV_1\,r_2\]
\end{lem}
\begin{proof}
\noindent The proof establishes, by induction, the following auxiliary property:
\[
pders\,s\,(r_1\cdot r_2) \subseteq (\prod (pders\,s\, r_1)\,r_2) \;
\cup \; Pders\,(\textit{PSuf}\,s)\,r_2
\]
\noindent
Note that this property uses $pders$ and $Pders$ together. The induction is
a ``reverse'' induction on $s$, meaning the cases to analyse are the
empty string $[]$ and the case where a character is put at the end of the
string $s$, namely $s \,@\, [c]$. The case $[]$ is trivial. In the other
case we know by IH that
\[
pders\,s\,(r_1\cdot r_2) \subseteq (\prod (pders\,s\, r_1)\,r_2) \;
\cup \; Pders\,(\textit{PSuf}\,s)\,r_2
\]
\noindent
holds for $s$. Then we have to show it holds for $s\,@\,[c]$
\begin{center}
\begin{tabular}{r@{\hspace{2mm}}ll}
& $pders\,(s\,@\,[c])\,(r_1\cdot r_2)$\\
$=$ & $pder^+\,c\,(pders\,s\,(r_1\cdot r_2))$\\
& \hfill{}by def.\ of $pders$\\
$\subseteq$ & $pder^+\,c\,(\prod (pders\,s\, r_1)\,r_2 \;
\cup \; Pders\,(\textit{PSuf}\,s)\,r_2)$\\
& \hfill{}by IH\\
$=$ & $(pder^+\,c\,(\prod (pders\,s\, r_1)\,r_2))\;\cup\;
(pder^+\,c\,(Pders\,(\textit{PSuf}\,s)\,r_2))$\\
$=$ & $(pder^+\,c\,(\prod (pders\,s\, r_1)\,r_2))\;\cup\;
(Pders\,(\textit{PSuf}\,s\,@@\,[c])\,r_2)$\\
& \hfill{}by def.\ of $pders$ and $Pders$\\
$\subseteq$ &
$(pder^+\,c\,(\prod (pders\,s\, r_1)\,r_2))\;\cup\;
(pder\,c\,r_2)\;\cup\;
(Pders\,(\textit{PSuf}\,s\,@@\,[c])\,r_2)$\\
$\subseteq$ & $\prod (pder^+\,c\,(pders\,s\, r_1))\,r_2\;\cup\;
(pder\,c\,r_2)\;\cup\;
(Pders\,(\textit{PSuf}\,s\,@@\,[c])\,r_2)$\\
& \hfill{}since $pder\,c\,(r'\cdot r_2) \subseteq \prod\,(pder\,c\,r')\,r_2 \cup pder\,c\,r_2$\\
$=$ & $(\prod (pders\,(s\,@\,[c])\, r_1)\,r_2)\;\cup\;
(pder\,c\,r_2)\;\cup\;
(Pders\,(\textit{PSuf}\,s\,@@\,[c])\,r_2)$\\
$\subseteq$ & $(\prod (pders\,(s\,@\,[c])\, r_1)\,r_2)\;\cup\;
(Pders\,(\textit{PSuf}\,(s\,@\,[c]))\,r_2)$\\
& \hfill{}by the equation for $\textit{PSuf}\,(s\,@\,[c])$ above
\end{tabular}
\end{center}
\noindent The lemma now follows because for every $s \in UNIV_1$ it holds that
\[
\prod\,(pders\,s\,r_1)\,r_2 \subseteq \prod (Pders\,UNIV_1\, r_1)\,r_2
\]
\noindent and
\[
Pders\,(\textit{PSuf}\,s)\,r_2 \subseteq Pders\,UNIV_1\,r_2
\]
\noindent By the property just proved, $pders\,s\,(r_1\cdot r_2)$ is contained
in the union of the left-hand sides of these inclusions for every
$s\in UNIV_1$, and therefore also in the union of the right-hand sides,
which is exactly the right-hand side of the lemma.
\end{proof}\medskip
\noindent There is a similar lemma for the $^*$-regular expression, namely:
\begin{lem}\label{centraltwo}
$Pders\,UNIV_1\,(r^*) \subseteq \prod\, (Pders\,UNIV_1\,r)\,(r^*)$
\end{lem}
\noindent
We omit the proof for the moment (it is similar to the proof of Lem~\ref{central}). We
also need the following property about the cardinality of $\prod$:
\begin{lem}\label{centralthree}
$|\prod\,(Pders\,A\,r_1)\,r_2| \le |Pders\,A\,r_1|$
\end{lem}
\noindent
We only need the $\le$ version, which essentially says that
$\prod\,(Pders\,A\,r_1)\,r_2$ contains at most as many sequences
$r'\cdot r_2$ as there are elements $r'$ in $Pders\,A\,r_1$. Now
for the proof of Thm~\ref{one}: The main induction in
Antimirov's proof establishes that:\footnote{Remember that the hardest
part of an induction proof is always to find a property that is strong
enough and of the right shape for the induction to go through.}
\begin{lem}\label{two}
$\forall r.\;\;|Pders\;UNIV_1\;r| \leq awidth(r)$
\end{lem}
\begin{proof} This is proved by induction on
$r$. The interesting cases are $r_1 + r_2$, $r_1\cdot r_2$ and
$r^*$. Let us start with the relatively simple case:\medskip
\noindent
\textbf{Case $r_1 + r_2$:} By induction hypothesis we know
\begin{center}
\begin{tabular}{l}
$|Pders\;UNIV_1\;r_1| \leq awidth(r_1)$\\
$|Pders\;UNIV_1\;r_2| \leq awidth(r_2)$
\end{tabular}
\end{center}
\noindent
In this case we can reason as follows
\begin{center}
\begin{tabular}{r@{\hspace{2mm}}ll}
& $|Pders\;UNIV_1\;(r_1+r_2)|$\\
$=$ & $|(Pders\;UNIV_1\;r_1) \;\cup\; (Pders\;UNIV_1\;r_2)|$\\
$\leq$ & $|(Pders\;UNIV_1\;r_1)| \;+\; |(Pders\;UNIV_1\;r_2)|$ & (*)\\
$\leq$ & $awidth(r_1) + awidth(r_2)$ & by IH\\
$\dn$ & $awidth(r_1 + r_2)$
\end{tabular}
\end{center}
\noindent Note that (*) is a step that only works if one knows that
the sets $Pders\;UNIV_1\;r_1$ and $Pders\;UNIV_1\;r_2$ are finite. The next case is
a bit more interesting:\medskip
\noindent
\textbf{Case $r_1 \cdot r_2$:} We have the same induction
hypothesis as in the case before.
\begin{center}
\begin{tabular}{r@{\hspace{2mm}}ll}
& $|Pders\;UNIV_1\;(r_1\cdot r_2)|$\\
$\leq$ & $|\prod\,(Pders\;UNIV_1\;r_1)\,r_2\;\cup\; (Pders\;UNIV_1\;r_2)|$
& by Lem~\ref{central}\\
$\leq$ & $|\prod\,(Pders\;UNIV_1\;r_1)\,r_2| \;+\; |(Pders\;UNIV_1\;r_2)|$\\
$\leq$ & $|Pders\;UNIV_1\;r_1| \;+\; |Pders\;UNIV_1\;r_2|$
& by Lem~\ref{centralthree} \\
$\leq$ & $awidth(r_1) + awidth(r_2)$ & by IH\\
$\dn$ & $awidth(r_1 \cdot r_2)$
\end{tabular}
\end{center} \medskip
\noindent
\textbf{Case $r^*$:} The induction hypothesis is now
$|Pders\;UNIV_1\;r| \leq awidth(r)$.
\begin{center}
\begin{tabular}{r@{\hspace{2mm}}ll}
& $|Pders\;UNIV_1\;(r^*)|$\\
$\leq$ & $|\prod\,(Pders\;UNIV_1\;r)\,(r^*)|$
& by Lem~\ref{centraltwo}\\
$\leq$ & $|Pders\;UNIV_1\;r|$
& by Lem~\ref{centralthree} \\
$\leq$ & $awidth(r)$ & by IH\\
$=$ & $awidth(r^*)$\\
\end{tabular}
\end{center}
\end{proof}\bigskip
\noindent
From this lemma we can derive the next corollary which extends
the property to $UNIV$ ($= UNIV_1 \cup \{[]\}$):
\begin{cor}
$\forall r.\;\;|Pders\;UNIV\;r| \leq awidth(r) + 1$
\end{cor}
\begin{proof} This can be proved as follows
\begin{center}
\begin{tabular}{r@{\hspace{2mm}}ll}
& $|Pders\;UNIV\;r|$\\
$=$ & $|Pders\;(UNIV_1 \cup \{[]\})\;r|$\\
$=$ & $|(Pders\;UNIV_1\,r) \;\cup\;\{r\}|$ & since $pders\,[]\,r = \{r\}$\\
$\leq$ & $|Pders\;UNIV_1\,r| + 1$\\
$\leq$ & $awidth(r) + 1$ & by Lem~\ref{two}\\
\end{tabular}
\end{center}
\end{proof}
\noindent
From the last corollary, it is easy to infer Antimirov's
Thm~\ref{one}, because
\[ Pders\,A\,r \subseteq Pders\,UNIV\,r \]
\noindent
for all sets $A$.\bigskip
\noindent
While I was a bit dismissive above about the intelligibility
of Antimirov's paper, you have to admit this proof is a work of
beauty. It only gives a bound (\textit{awidth}) for the number of
regular expressions in the de\-rivatives---this is important for
constructing NFAs. Maybe it has not been important before, but I have
never seen a result about the \emph{size} of the partial
derivatives.\footnote{Update: I have now seen a paper which proves
this result as well.} However, a very crude bound, namely
$(size(r)^2 + 1) \times (awidth(r) + 1)$, can be easily derived by
using Antimirov's result. The definition we need for this is a
function that collects subexpressions from which partial derivatives
are built:
\begin{center}
\begin{tabular}{lcl}
$subs(\ZERO)$ & $\dn$ & $\{\ZERO\}$\\
$subs(\ONE)$ & $\dn$ & $\{\ONE\}$\\
$subs(c)$ & $\dn$ & $\{c, \ONE\}$\\
$subs(r_1 + r_2)$ & $\dn$ & $\{r_1 + r_2\}\,\cup\,subs(r_1) \,\cup\, subs(r_2)$\\
$subs(r_1 \cdot r_2)$ & $\dn$ & $\{r_1 \cdot r_2\}\,\cup (\prod\,subs(r_1)\;r_2)\,\cup \,
subs(r_1) \,\cup\, subs(r_2)$\\
$subs(r^*)$ & $\dn$ & $\{r^*\}\,\cup\,(\prod\,subs(r)\;r^*) \,\cup\, subs(r)$\\
\end{tabular}
\end{center}
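\noindent
As a small Scala sketch (again reusing \texttt{Rexp} and \texttt{Prod}
from the earlier sketch, with names of my own choosing), this function
can be written as follows:
\begin{verbatim}
// subs(r): the subexpressions from which partial derivatives are built
def subs(r: Rexp): Set[Rexp] = r match {
  case ZERO        => Set(ZERO)
  case ONE         => Set(ONE)
  case CHAR(_)     => Set(r, ONE)
  case ALT(r1, r2) => Set(r) ++ subs(r1) ++ subs(r2)
  case SEQ(r1, r2) =>
    Set(r) ++ Prod(subs(r1), r2) ++ subs(r1) ++ subs(r2)
  case STAR(r1)    => Set(r) ++ Prod(subs(r1), r) ++ subs(r1)
}
\end{verbatim}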
\noindent
We can show that
\begin{lem}
$pders\,s\,r \subseteq subs(r)$
\end{lem}
\noindent
This is a relatively simple induction on $r$. The point is that the size of every
element in $subs(r)$ can be bounded as follows:
\begin{lem}
If $r' \in subs(r)$ then $size(r') \le 1 + size(r)^2$.
\end{lem}
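\noindent
To see where the $1 + size(r)^2$ bound comes from, one can check the crucial
sequence case by a short calculation (my own reconstruction, assuming the usual
definition $size(r_1\cdot r_2) = 1 + size(r_1) + size(r_2)$ and writing
$a = size(r_1)$, $b = size(r_2)$): if $r' \in \prod\,(subs(r_1))\;r_2$, then
$r' = r_1'\cdot r_2$ for some $r_1' \in subs(r_1)$ and
\[
size(r') = 1 + size(r_1') + b \;\le\; 2 + a^2 + b \;\le\; 1 + (1 + a + b)^2 = 1 + size(r_1\cdot r_2)^2
\]
\noindent
where the first inequality uses the induction hypothesis $size(r_1') \le 1 + a^2$.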
\noindent
Again the proof is a relatively simple induction on $r$. Stringing Antimirov's result
and the lemma above together (at most $awidth(r) + 1$ elements in $pders\,s\,r$, each
of size at most $size(r)^2 + 1$) gives
\begin{thm}
$\sum_{r' \in pders\,s\,r}.\;size(r') \le (size(r)^2 + 1) \times (awidth(r) + 1)$
\end{thm}
\noindent
Since $awidth(r)$ is always smaller than or equal to $size(r)$, and
$size(r)^2 + 1 \le (size(r) + 1)^2$,
one can also state the bound as follows:
\[
\sum_{r' \in pders\,s\,r}.\;size(r') \le (size(r) + 1)^3
\]
\noindent
This, by the way, also holds for $Pders$, namely
\[
\sum_{r' \in Pders\,A\,r}.\;size(r') \le (size(r) + 1)^3
\]
\noindent
for all $r$ and $A$. If one is interested in the height of the partial derivatives, one can derive:
\[
\forall\,r' \in pders\,s\,r.\;height(r') \le height(r) + 1
\]
\noindent
meaning the height of the partial derivatives is never bigger than
the height of the original regular expression plus one.
\section*{NFA Construction via Antimirov's Partial Derivatives}
Let's finish these notes with the construction of an NFA for a regular
expression using partial derivatives. As usual an automaton is a
quintuple $(Q, A, \delta, q_0, F)$ where $Q$ is the set of states of
the automaton, $A$ is the alphabet, $\delta$ is the transition function
(or relation), $q_0$ is the starting state and
$F$ is the set of accepting states. For DFAs, $\delta$ is a (partial)
function from state $\times$ character to state. For NFAs it is a
relation on state $\times$ character $\times$ state. The
non-determinism can be seen as follows: consider three
(distinct) states $q_1$, $q_2$ and $q_3$; then the relation can
include both $(q_1, a, q_2)$ and $(q_1, a, q_3)$, which means there is a
transition from $q_1$ to both $q_2$ and $q_3$ for the character
$a$.
Antimirov's NFA for a regular expression $r$ is then
given by the quintuple
\[(PD(r), A, \delta_{PD}, r, F)\]
\noindent
where $PD(r)$ is the set of all partial
derivatives according to all strings, that is
\[
PD(r) \;\dn\; \textit{Pders}\;\textit{UNIV}\;r
\]
\noindent
Because of the previous proof, we know that this set is finite. We
also see that the states in Antimirov's NFA are ``labelled'' by single
regular expressions. The starting state is labelled with the original
regular expression $r$. The set of accepting states $F$ consists of all states
$r'\in PD(r)$ where $r'$ is nullable. The relation $\delta_{PD}$ is given by
\[
(r_1, c, r_2)
\]
\noindent
for every $r_1 \in PD(r)$ and $r_2 \in \textit{pder}\,c\,r_1$. This is
in general a ``non-deterministic'' relation because the set of partial
derivatives often contains more than one element. A nice example of
an NFA constructed via Antimirov's partial derivatives is given in
\cite{IlieYu2003} on page 378.
The difficulty in this construction is of course to find the set of
partial derivatives according to \emph{all} strings. However, it seems
that a procedure that enumerates strings by increasing length, stopping
as soon as no new partial derivative is found, suffices. There are
various improvements that apply clever tricks on how to discover this
set more efficiently.
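To make this concrete, here is a small sketch in Scala of such a fixpoint
computation of $PD(r)$ together with the transition relation and the
accepting states of the NFA. It reuses \texttt{Rexp}, \texttt{pder} and
\texttt{nullable} from the earlier sketches; the parameter \texttt{alphabet}
stands for the (finite) alphabet $A$. This is only my own naive formulation
of the ``until no new derivative is found'' idea, not an optimised algorithm:
\begin{verbatim}
// PD(r): all partial derivatives of r, computed as a fixpoint over
// the given (finite) alphabet; terminates because this set is finite
def PD(r: Rexp, alphabet: Set[Char]): Set[Rexp] = {
  def fix(rs: Set[Rexp]): Set[Rexp] = {
    val next = rs ++ (for (r1 <- rs; c <- alphabet;
                           r2 <- pder(c, r1)) yield r2)
    if (next == rs) rs else fix(next)
  }
  fix(Set(r))
}

// transition relation and accepting states of Antimirov's NFA
def delta(r: Rexp, alphabet: Set[Char]): Set[(Rexp, Char, Rexp)] =
  for (r1 <- PD(r, alphabet); c <- alphabet;
       r2 <- pder(c, r1)) yield (r1, c, r2)

def finals(r: Rexp, alphabet: Set[Char]): Set[Rexp] =
  PD(r, alphabet).filter(nullable)
\end{verbatim}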
\begin{thebibliography}{999}
\bibitem{IlieYu2003}
L.~Ilie and S.~Yu,
\emph{Reducing NFAs by Invariant Equivalences}.
In Theoretical Computer Science, Volume 306(1--3), Pages 373--390, 2003.\\
\url{https://core.ac.uk/download/pdf/82545723.pdf}
\end{thebibliography}
\end{document}
%%% Local Variables:
%%% mode: latex
%%% TeX-master: t
%%% End: