981
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1 |
\documentclass[12pt]{article}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
2 |
\usepackage{../style}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
3 |
\usepackage{../langs}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
4 |
\usepackage{graphicx}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
5 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
6 |
\newtheorem{thm}{Theorem}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
7 |
\newtheorem{lem}[thm]{Lemma}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
8 |
\newtheorem{cor}[thm]{Corollary}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
9 |
\newenvironment{proof}{\paragraph{Proof:}\it}{\hfill$\square$}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
10 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
11 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
12 |
\begin{document}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
13 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
14 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
15 |
\section*{Antimirov's Proof about Pders}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
16 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
17 |
These are some rough notes about the result by Antimirov establishing
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
18 |
a bound on the number of regular expressions in a partial
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
19 |
derivative. From this bound on the number of partial derivatives one
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
20 |
can easily construct an NFA for a regular expression, but one can also
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
21 |
derive a bound on the size of the partial derivatives. This is what we
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
22 |
do first. We start with the following definitions:
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
23 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
24 |
\begin{itemize}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
25 |
\item $pder\,c\,r$ --- partial derivative according to a character; this can be defined
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
26 |
inductively as follows:
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
27 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
28 |
\begin{center}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
29 |
\begin{tabular}{l@ {\hspace{2mm}}c@ {\hspace{2mm}}l}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
30 |
$\textit{pder}\, c\, (\ZERO)$ & $\dn$ & $\emptyset$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
31 |
$\textit{pder}\, c\, (\ONE)$ & $\dn$ & $\emptyset$ \\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
32 |
$\textit{pder}\, c\, (d)$ & $\dn$ & if $c = d$ then $\{\ONE\}$ else $\emptyset$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
33 |
$\textit{pder}\, c\, (r_1 + r_2)$ & $\dn$ & $\textit{pder}\, c\, r_1 \;\cup\; \textit{pder}\, c\, r_2$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
34 |
$\textit{pder}\, c\, (r_1 \cdot r_2)$ & $\dn$ & if $\textit{nullable} (r_1)$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
35 |
& & then $\Pi\,(\textit{pder}\,c\,r_1)\,r_2 \;\cup\; \textit{pder}\, c\, r_2$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
36 |
& & else $\Pi\,(\textit{pder}\, c\, r_1)\,r_2$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
37 |
$\textit{pder}\, c\, (r^*)$ & $\dn$ & $\Pi\,(\textit{pder}\,c\,r)\, (r^*)$
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
38 |
\end{tabular}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
39 |
\end{center}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
40 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
41 |
\item $pder^+\,c\,\,rs$ --- partial derivatives for a set regular exprssions $rs$
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
42 |
\item $pders\,s\,r$ --- partial derivative of a regular expression
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
43 |
according to a string
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
44 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
45 |
\item $Pders\,A\,r \dn \bigcup_{s\in A}. pders\,s\,r$ --- partial
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
46 |
derivatives according to a language (a set of strings)
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
47 |
\item $|rs|$ is the size of a set of regular expressions $rs$, or
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
48 |
the number of elements in the
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
49 |
set (also known as the cardinality of this set)
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
50 |
\item $\prod\,rs\;r \dn \{r_1 \cdot r \;|\; r_1 \in rs\}$ --- this is
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
51 |
some convenience when writing a set of sequence regular
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
52 |
expressions. It essentially ``appends'' the regular expression $r$ to all
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
53 |
regular expressions in the set $rs$. As a result one can write
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
54 |
the sequence case for partial derivatives (see above) more conveniently
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
55 |
as
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
56 |
\[
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
57 |
pder\,c\,(r_1\cdot r_2) \dn
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
58 |
\begin{cases}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
59 |
\prod\,(pder\,c\,r_1)\,r_2\;\cup\;pder\,c\,r_2 &
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
60 |
\!\!\textit{provided}\,r_1\, \textit{is nullable}\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
61 |
\prod\,(pder\,c\,r_1)\,r_2 & \!\!\textit{otherwise}\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
62 |
\end{cases}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
63 |
\]
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
64 |
\item $\textit{Psuf}\,s$ is the set of all non-empty suffixes of $s$ defined as
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
65 |
\[
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
66 |
\textit{PSuf}\, s \dn \{v.\;v \not= [] \wedge \exists u. u \,@\, v = s \}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
67 |
\]
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
68 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
69 |
So for the string $abc$ the non-empty suffixes are $c$, $bc$ and
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
70 |
$abc$. Also we have that
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
71 |
$\textit{Psuf}\,(s\,@\,[c]) = ((\textit{Psuf}\,s)\,@@\,[c]) \cup
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
72 |
\{[c]\}$. Here $@@$ means to concatenate $[c]$ to the end of
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
73 |
all strings in $\textit{Psuf}\,s$; in this equation we also
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
74 |
need to add $\{[c]\}$ in order to make the equation to hold.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
75 |
\end{itemize}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
76 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
77 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
78 |
To state Antimirov's result we need the following definition of an
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
79 |
\emph{alphabetic width} of a regular expression defined as follows:
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
80 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
81 |
\begin{center}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
82 |
\begin{tabular}{lcl}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
83 |
$awidth(\ZERO)$ & $\dn$ & $0$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
84 |
$awidth(\ONE)$ & $\dn$ & $0$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
85 |
$awidth(c)$ & $\dn$ & $1$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
86 |
$awidth(r_1 + r_2)$ & $\dn$ & $awidth(r_1) + awidth(r_2)$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
87 |
$awidth(r_1 \cdot r_2)$ & $\dn$ & $awidth(r_1) + awidth(r_2)$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
88 |
$awidth(r^*)$ & $\dn$ & $awidth(r)$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
89 |
\end{tabular}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
90 |
\end{center}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
91 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
92 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
93 |
This function counts how many characters are in a regular expression.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
94 |
Antimirov's result states
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
95 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
96 |
\begin{thm}\label{one}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
97 |
$\forall\,A\,r\,.\;\;|Pders\;A\;r| \leq awidth(r) + 1$
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
98 |
\end{thm}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
99 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
100 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
101 |
Note this theorem holds for any set of strings $A$, for example
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
102 |
for the set of all strings, which I will write as $\textit{UNIV}$, and also
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
103 |
for the set $\{s\}$ containing only a single string $s$. Therefore a
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
104 |
simple corollary is
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
105 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
106 |
\begin{cor}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
107 |
$\forall\,s\,r\,.\;\;|pders\;s\;r| \leq awidth(r) + 1$
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
108 |
\end{cor}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
109 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
110 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
111 |
This property says that for every string $s$, the number of regular
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
112 |
expressions in the derivative can never be bigger than
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
113 |
$awidth(r) + 1$. Interestingly we do not show Thm~\ref{one} for all
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
114 |
sets of strings $A$ directly, but rather only for one particular set of
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
115 |
strings which I call $UNIV_1$. It includes all strings except the empty string
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
116 |
(remember $UNIV$ contains all strings).\bigskip
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
117 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
118 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
119 |
Let's try to give below a comprehensible account of Antimirov's proof
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
120 |
of Thm.~\ref{one}---I distictly remember that Antimirov's paper is
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
121 |
great, but pretty incomprehensible for the first 20+ times one reads
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
122 |
that paper. The proof starts with the following much weaker property
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
123 |
about the size being finite:
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
124 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
125 |
\begin{lem}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
126 |
$\forall\,A\,r\,.\;\;(Pders\;A\;r)$ is a finite set.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
127 |
\end{lem}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
128 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
129 |
\noindent This lemma is needed because some reasoning steps in
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
130 |
Thm~\ref{one} only work for finite sets, not infinite sets. But let us
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
131 |
skip over the proof of this property at first and let us assume we
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
132 |
know already that the partial derivatives are always finite sets (this for
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
133 |
example does not hold for unsimplified Brzozowski derivatives which
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
134 |
can be infinite for some sets of strings).
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
135 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
136 |
There are some central lemmas about partial derivatives for $\cdot$ and $\_^*$.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
137 |
One is the following
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
138 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
139 |
\begin{lem}\label{central}\mbox{}\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
140 |
\[Pders\,UNIV_1\,(r_1\cdot r_2) \subseteq (\prod (Pders\,UNIV_1\, r_1)\,r_2) \;
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
141 |
\cup \; Pders\,UNIV_1\,r_2\]
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
142 |
\end{lem}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
143 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
144 |
\begin{proof}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
145 |
\noindent The proof is done via an induction for the following property
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
146 |
\[
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
147 |
pders\,s\,(r_1\cdot r_2) \subseteq (\prod (pders\,s\, r_1)\,r_2) \;
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
148 |
\cup \; Pders\,(\textit{PSuf}\,s)\,r_2
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
149 |
\]
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
150 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
151 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
152 |
Note that this property uses $pders$ and $Pders$ together. The proof is done
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
153 |
by ``reverse'' induction on $s$, meaning the cases to analyse are the
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
154 |
empty string $[]$ and the case where a character is put at the end of the
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
155 |
string $s$, namely $s \,@\, [c]$. The case $[]$ is trivial. In the other
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
156 |
case we know by IH that
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
157 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
158 |
\[
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
159 |
pders\,s\,(r_1\cdot r_2) \subseteq (\prod (pders\,s\, r_1)\,r_2) \;
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
160 |
\cup \; Pders\,(\textit{PSuf}\,s)\,r_2
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
161 |
\]
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
162 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
163 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
164 |
holds for $s$. Then we have to show it holds for $s\,@\,[c]$
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
165 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
166 |
\begin{center}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
167 |
\begin{tabular}{r@{\hspace{2mm}}ll}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
168 |
& $pders\,(s\,@\,[c])\,(r_1\cdot r_2)$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
169 |
$=$ & $pder^+\,c\,(pders\,s\,(r_1\cdot r_2))$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
170 |
$\subseteq$ & $pder^+\,c\,(\prod (pders\,s\, r_1)\,r_2 \;
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
171 |
\cup \; Pders\,(\textit{PSuf}\,s)\,r_2)$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
172 |
& \hfill{}by IH\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
173 |
$=$ & $(pder^+\,c\,(\prod (pders\,s\, r_1)\,r_2))\;\cup\;
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
174 |
(pder^+\,c\,(Pders\,(\textit{PSuf}\,s)\,r_2))$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
175 |
$=$ & $(pder^+\,c\,(\prod (pders\,s\, r_1)\,r_2))\;\cup\;
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
176 |
(Pders\,(\textit{PSuf}\,(s\,@\,[c]))\,r_2)$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
177 |
$\subseteq$ &
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
178 |
$(pder^+\,c\,(\prod (pders\,s\, r_1)\,r_2))\;\cup\;
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
179 |
(pder\,c\,r_2)\;\cup\;
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
180 |
(Pders\,(\textit{PSuf}\,s\,@@\,[c])\,r_2)$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
181 |
$\subseteq$ & $\prod (pder^+\,c\,(pders\,s\, r_1))\,r_2\;\cup\;
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
182 |
(pder\,c\,r_2)\;\cup\;
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
183 |
(Pders\,(\textit{PSuf}\,s\,@@\,[c])\,r_2)$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
184 |
$=$ & $(\prod (pders\,(s\,@\,[c])\, r_1)\,r_2)\;\cup\;
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
185 |
(pder\,c\,r_2)\;\cup\;
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
186 |
(Pders\,(\textit{PSuf}\,s\,@@\,[c])\,r_2)$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
187 |
$\subseteq$ & $(\prod (pders\,(s\,@\,[c])\, r_1)\,r_2)\;\cup\;
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
188 |
(Pders\,(\textit{PSuf}\,(s\,@\,[c]))\,r_2)$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
189 |
\end{tabular}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
190 |
\end{center}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
191 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
192 |
\noindent The lemma now follows because for an $s \in UNIV_1$ it holds that
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
193 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
194 |
\[
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
195 |
\prod\,(pders\,s\,r_1)\,r_2 \subseteq \prod (Pders\,UNIV_1\, r_1)\,r_2
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
196 |
\]
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
197 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
198 |
\noindent and
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
199 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
200 |
\[
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
201 |
Pders\,(\textit{PSuf}\,s)\,r_2 \subseteq Pders\,UNIV_1\,r_2
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
202 |
\]
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
203 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
204 |
\noindent The left-hand sides of the inclusions above are
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
205 |
euqal to $pders\,s\,(r_1\cdot r_2)$ for a string $s\in UNIV_1$.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
206 |
\end{proof}\medskip
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
207 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
208 |
\noindent There is a similar lemma for the $^*$-regular expression, namely:
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
209 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
210 |
\begin{lem}\label{centraltwo}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
211 |
$Pders\,UNIV_1\,(r^*) \subseteq \prod\, (Pders\,UNIV_1\,r)\,(r^*)$
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
212 |
\end{lem}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
213 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
214 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
215 |
We omit the proof for the moment (similar to Lem~\ref{central}). We
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
216 |
also need the following property about the cardinality of $\prod$:
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
217 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
218 |
\begin{lem}\label{centralthree}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
219 |
$|\prod\,(Pders\,A\,r_1)\,r_2| \le |Pders\,A\,r_1|$
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
220 |
\end{lem}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
221 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
222 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
223 |
We only need the $\le$ version, which essentially says there
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
224 |
are as many sequences $r\cdot r_2$ as are elements in $r$. Now
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
225 |
for the proof of Thm~\ref{one}: The main induction in
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
226 |
Antimirov's proof establishes that:\footnote{Remember that it is
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
227 |
always the hardest part in an induction proof to find the right
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
228 |
property that is strong enough and of the right shape for the
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
229 |
induction to go through.}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
230 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
231 |
\begin{lem}\label{two}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
232 |
$\forall r.\;\;|Pders\;UNIV_1\;r| \leq awidth(r)$
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
233 |
\end{lem}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
234 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
235 |
\begin{proof} This is proved by induction on
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
236 |
$r$. The interesting cases are $r_1 + r_2$, $r_1\cdot r_2$ and
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
237 |
$r^*$. Let us start with the relatively simple case:\medskip
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
238 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
239 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
240 |
\textbf{Case $r_1 + r_2$:} By induction hypothesis we know
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
241 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
242 |
\begin{center}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
243 |
\begin{tabular}{l}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
244 |
$|Pders\;UNIV_1\;r_1| \leq awidth(r_1)$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
245 |
$|Pders\;UNIV_1\;r_2| \leq awidth(r_2)$
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
246 |
\end{tabular}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
247 |
\end{center}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
248 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
249 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
250 |
In this case we can reason as follows
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
251 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
252 |
\begin{center}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
253 |
\begin{tabular}{r@{\hspace{2mm}}ll}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
254 |
& $|Pders\;UNIV_1\;(r_1+r_2)|$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
255 |
$=$ & $|(Pders\;UNIV_1\;r_1) \;\cup\; (Pders\;UNIV_1\;r_2)|$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
256 |
$\leq$ & $|(Pders\;UNIV_1\;r_1)| \;+\; |(Pders\;UNIV_1\;r_2)|$ & (*)\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
257 |
$\leq$ & $awidth(r_1) + awidth(r_2)$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
258 |
$\dn$ & $awidth(r_1 + r_2)$
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
259 |
\end{tabular}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
260 |
\end{center}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
261 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
262 |
\noindent Note that (*) is a step that only works if one knows that
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
263 |
$|(Pders\;UNIV_1\;r_1)|$ and so on are finite. The next case is
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
264 |
a bit more interesting:\medskip
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
265 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
266 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
267 |
\textbf{Case $r_1 \cdot r_2$:} We have the same induction
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
268 |
hypothesis as in the case before.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
269 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
270 |
\begin{center}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
271 |
\begin{tabular}{r@{\hspace{2mm}}ll}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
272 |
& $|Pders\;UNIV_1\;(r_1\cdot r_2)|$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
273 |
$\leq$ & $|\prod\,(Pders\;UNIV_1\;r_1)\,r_2\;\cup\; (Pders\;UNIV_1\;r_2)|$
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
274 |
& by Lem~\ref{central}\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
275 |
$\leq$ & $|\prod\,(Pders\;UNIV_1\;r_1)\,r_2| \;+\; |(Pders\;UNIV_1\;r_2)|$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
276 |
$\leq$ & $|Pders\;UNIV_1\;r_1| \;+\; |Pders\;UNIV_1\;r_2|$
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
277 |
& by Lem~\ref{centralthree} \\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
278 |
$\leq$ & $awidth(r_1) + awidth(r_2)$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
279 |
$\dn$ & $awidth(r_1 \cdot r_2)$
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
280 |
\end{tabular}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
281 |
\end{center} \medskip
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
282 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
283 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
284 |
\textbf{Case $r^*$:} Again we have the same induction
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
285 |
hypothesis as in the cases before.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
286 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
287 |
\begin{center}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
288 |
\begin{tabular}{r@{\hspace{2mm}}ll}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
289 |
& $|Pders\;UNIV_1\;(r^*)|$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
290 |
$\leq$ & $|\prod\,(Pders\;UNIV_1\;r)\,(r^*)|$
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
291 |
& by Lem~\ref{centraltwo}\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
292 |
$\leq$ & $|Pders\;UNIV_1\;r|$
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
293 |
& by Lem~\ref{centralthree} \\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
294 |
$\leq$ & $awidth(r)$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
295 |
\end{tabular}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
296 |
\end{center}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
297 |
\end{proof}\bigskip
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
298 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
299 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
300 |
From this lemma we can derive the next corrollary which extends
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
301 |
the property to $UNIV$ ($= UNIV_1 \cup \{[]\}$):
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
302 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
303 |
\begin{cor}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
304 |
$\forall r.\;\;|Pders\;UNIV\;r| \leq awidth(r) + 1$
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
305 |
\end{cor}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
306 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
307 |
\begin{proof} This can be proved as follows
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
308 |
\begin{center}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
309 |
\begin{tabular}{r@{\hspace{2mm}}ll}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
310 |
& $|Pders\;UNIV\;r|$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
311 |
$=$ & $|Pders\;(UNIV_1 \cup \{[]\})\;r|$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
312 |
$=$ & $|(Pders\;UNIV_1\,r) \;\cup\;\{r\}|$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
313 |
$\leq$ & $|Pders\;UNIV_1\,r| + 1$ & by Lem~\ref{two}\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
314 |
$\leq$ & $awidth(r) + 1$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
315 |
\end{tabular}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
316 |
\end{center}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
317 |
\end{proof}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
318 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
319 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
320 |
From the last corollary, it is easy to infer Antimirov's
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
321 |
Thm~\ref{one}, because
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
322 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
323 |
\[ Pders\,A\,r \subseteq Pders\,UNIV\,r \]
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
324 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
325 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
326 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
327 |
for all sets $A$.\bigskip
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
328 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
329 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
330 |
While I was earlier a bit dismissive above about the intelligibility
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
331 |
of Antimirov's paper, you have to admit this proof is a work of
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
332 |
beauty. It only gives a bound (\textit{awidth}) for the number of
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
333 |
regular expressions in the de\-rivatives---this is important for
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
334 |
constructing NFAs. Maybe it has not been important before, but I have
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
335 |
never seen a result about the \emph{size} of the partial
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
336 |
derivatives.\footnote{Update: I have now seen a paper which proves
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
337 |
this result as well.} However, a very crude bound, namely
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
338 |
$(size(r)^2 + 1) \times (awidth(r) + 1)$, can be easily derived by
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
339 |
using Antimirov's result. The definition we need for this is a
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
340 |
function that collects subexpressions from which partial derivatives
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
341 |
are built:
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
342 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
343 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
344 |
\begin{center}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
345 |
\begin{tabular}{lcl}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
346 |
$subs(\ZERO)$ & $\dn$ & $\{\ZERO\}$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
347 |
$subs(\ONE)$ & $\dn$ & $\{\ONE\}$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
348 |
$subs(c)$ & $\dn$ & $\{c, \ONE\}$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
349 |
$subs(r_1 + r_2)$ & $\dn$ & $\{r_1 + r_2\}\,\cup\,subs(r_1) \,\cup\, subs(r_2)$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
350 |
$subs(r_1 \cdot r_2)$ & $\dn$ & $\{r_1 \cdot r_2\}\,\cup (\prod\,subs(r_1)\;r_2)\,\cup \,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
351 |
subs(r_1) \,\cup\, subs(r_2)$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
352 |
$subs(r^*)$ & $\dn$ & $\{r^*\}\,\cup\,(\prod\,subs(r)\;r^*) \,\cup\, subs(r)$\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
353 |
\end{tabular}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
354 |
\end{center}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
355 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
356 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
357 |
We can show that
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
358 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
359 |
\begin{lem}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
360 |
$pders\,s\,r \subseteq subs(r)$
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
361 |
\end{lem}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
362 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
363 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
364 |
This is a relatively simple induction on $r$. The point is that for every element
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
365 |
in $subs$, the maximum size is given by
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
366 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
367 |
\begin{lem}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
368 |
If $r' \in subs(r)$ then $size(r') \le 1 + size(r)^2$.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
369 |
\end{lem}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
370 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
371 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
372 |
Again the proof is a relatively simple induction on $r$. Stringing Antimirov's result
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
373 |
and the lemma above together gives
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
374 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
375 |
\begin{thm}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
376 |
$\sum_{r' \in pders\,s\,r}.\;size(r') \le (size(r)^2 + 1) \times (awidth(r) + 1)$
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
377 |
\end{thm}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
378 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
379 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
380 |
Since $awidth$ is always smaller than the $size$ of a regular expression,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
381 |
one can also state the bound as follows:
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
382 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
383 |
\[
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
384 |
\sum_{r' \in pders\,s\,r}.\;size(r') \le (size(r) + 1)^3
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
385 |
\]
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
386 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
387 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
388 |
This, by the way, also holds for $Pders$, namely
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
389 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
390 |
\[
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
391 |
\sum_{r' \in Pders\,A\,r}.\;size(r') \le (size(r) + 1)^3
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
392 |
\]
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
393 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
394 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
395 |
for all $r$ and $A$. If one is interested in the height of the partial derivatives, one can derive:
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
396 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
397 |
\[
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
398 |
\forall\,r' \in pders\,s\,r.\;height(r') \le height(r) + 1
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
399 |
\]
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
400 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
401 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
402 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
403 |
meaning the height of the partial derivatives is never bigger than
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
404 |
the height of the original regular expression (+1).
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
405 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
406 |
\section*{NFA Construction via Antimirov's Partial Derivatives}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
407 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
408 |
Let's finish these notes with the construction of an NFA for a regular
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
409 |
expression using partial derivatives. As usual an automaton is a
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
410 |
quintuple $(Q, A, \delta, q_0, F)$ where $Q$ is the set of states of
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
411 |
the automaton, $A$ is the alphabet, $q_0$ is the starting state and
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
412 |
$F$ are the accepting states. For DFAs the $\delta$ is a (partial)
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
413 |
function from state $\times$ character to state. For NFAs it is a
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
414 |
relation between state $\times$ character $\times$ state. The
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
415 |
non-determinism can be seen by the following: consider three
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
416 |
(distinct) states $q_1$, $q_2$ and $q_3$, then the relation can
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
417 |
include $(q_1, a, q_2)$ and $(q_1, a, q_3)$, which means there is a
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
418 |
transition between $q_1$ and both $q_2$ and $q_3$ for the character
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
419 |
$a$.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
420 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
421 |
The Antimirov's NFA for a regular expression $r$ is then
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
422 |
given by the quintuple
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
423 |
\[(PD(r), A, \delta_{PD}, r, F)\]
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
424 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
425 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
426 |
where $PD(r)$ are all the partial
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
427 |
derivatives according to all strings, that is
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
428 |
\[
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
429 |
PD(r) \;\dn\; \textit{Pders}\;\textit{UNIV}\;r
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
430 |
\]
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
431 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
432 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
433 |
Because of the previous proof, we know that this set is finite. We
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
434 |
also see that the states in Antimirov's NFA are ``labelled'' by single
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
435 |
regular expressions. The starting state is labelled with the original
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
436 |
regular expression $r$. The set of accepting states $F$ is all states
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
437 |
$r'\in F$ where $r'$ is nullable. The relation $\delta_{PD}$ is given by
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
438 |
\[
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
439 |
(r_1, c, r_2)
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
440 |
\]
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
441 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
442 |
\noindent
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
443 |
for every $r_1 \in PD(r)$ and $r_2 \in \textit{pder}\,c\,r$. This is
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
444 |
in general a ``non-deterministic'' relation because the set of partial
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
445 |
derivatives often contains more than one element. A nice example of
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
446 |
an NFA constructed via Antimirov's partial derivatives is given in
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
447 |
\cite{IlieYu2003} on Page 378.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
448 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
449 |
The difficulty of course in this construction is to find the set of
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
450 |
partial derivatives according to \emph{all} strings. However, it seem
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
451 |
a procedure that enumerates strings according to size suffices until
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
452 |
no new derivative is found. There are various improvements that apply
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
453 |
clever tricks on how to more efficiently discover this set.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
454 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
455 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
456 |
\begin{thebibliography}{999}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
457 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
458 |
\bibitem{IlieYu2003}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
459 |
L.~Ilie and S.~Yu,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
460 |
\emph{Reducing NFAs by Invariant Equivalences}.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
461 |
In Theoretical Computer Science, Volume 306(1--3), Pages 373–-390, 2003.\\
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
462 |
\url{https://core.ac.uk/download/pdf/82545723.pdf}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
463 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
464 |
\end{thebibliography}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
465 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
466 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
467 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
468 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
469 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
470 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
471 |
\end{document}
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
472 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
473 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
474 |
%%% Local Variables:
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
475 |
%%% mode: latex
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
476 |
%%% TeX-master: t
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
477 |
%%% End:
|