| author | Christian Urban <christian.urban@kcl.ac.uk> | 
| Fri, 24 Oct 2025 08:55:14 +0100 | |
| changeset 1016 | c02d409ed7f4 | 
| parent 953 | 5e070fb0332a | 
| permissions | -rw-r--r-- | 
| 20 | 1 | \documentclass{article}
 | 
| 402 
55f097ab96c9
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
388diff
changeset | 2 | \usepackage{../style}
 | 
| 20 | 3 | |
| 4 | \begin{document}
 | |
| 5 | ||
| 6 | \section*{Proof}
 | |
| 7 | ||
| 8 | Recall the definitions for regular expressions and the language associated with a regular expression: | |
| 9 | ||
| 10 | \begin{center}
 | |
| 11 | \begin{tabular}{c}
 | |
| 12 | \begin{tabular}[t]{rcl}
 | |
| 402 
55f097ab96c9
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
388diff
changeset | 13 | $r$ & $::=$ & $\ZERO$ \\ | 
| 
55f097ab96c9
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
388diff
changeset | 14 | & $\mid$ & $\ONE$ \\ | 
| 20 | 15 | & $\mid$ & $c$ \\ | 
| 16 | & $\mid$ & $r_1 \cdot r_2$ \\ | |
| 17 | & $\mid$ & $r_1 + r_2$ \\ | |
| 18 | & $\mid$ & $r^*$ \\ | |
| 19 |   \end{tabular}\hspace{10mm}
 | |
| 20 | \begin{tabular}[t]{r@{\hspace{1mm}}c@{\hspace{1mm}}l}
 | |
| 402 
55f097ab96c9
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
388diff
changeset | 21 | $L(\ZERO)$ & $\dn$ & $\varnothing$ \\ | 
| 
55f097ab96c9
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
388diff
changeset | 22 | $L(\ONE)$       & $\dn$ & $\{\texttt{""}\}$       \\
 | 
| 20 | 23 | $L(c)$                   & $\dn$ & $\{\texttt{"}c\texttt{"}\}$                         \\
 | 
| 24 | $L(r_1 \cdot r_2)$ & $\dn$ & $L(r_1) \,@\, L(r_2)$ \\ | |
| 25 | $L(r_1 + r_2)$ & $\dn$ & $L(r_1) \cup L(r_2)$ \\ | |
| 26 |  $L(r^*)$        & $\dn$ & $\bigcup_{n\ge 0} L(r)^n$                 \\
 | |
| 27 |   \end{tabular}
 | |
| 28 | \end{tabular}
 | |
| 29 | \end{center}
 | |
| 30 | ||
| 31 | \noindent | |
| 32 | We also defined the notion of a derivative of a regular expression (the derivative with respect to a character): | |
| 33 | ||
| 34 | \begin{center}
 | |
| 35 | \begin{tabular}{lcl}
 | |
| 402 
55f097ab96c9
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
388diff
changeset | 36 | $der\, c\, (\ZERO)$ & $\dn$ & $\ZERO$ \\ | 
| 
55f097ab96c9
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
388diff
changeset | 37 | $der\, c\, (\ONE)$ & $\dn$ & $\ZERO$ \\ | 
| 
55f097ab96c9
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
388diff
changeset | 38 | $der\, c\, (d)$ & $\dn$ & if $c = d$ then $\ONE$ else $\ZERO$\\ | 
| 20 | 39 | $der\, c\, (r_1 + r_2)$ & $\dn$ & $(der\, c\, r_1) + (der\, c\, r_2)$ \\ | 
| 40 | $der\, c\, (r_1 \cdot r_2)$ & $\dn$ & if $nullable(r_1)$\\ | |
| 41 | & & then $((der\, c\, r_1) \cdot r_2) + (der\, c\, r_2)$\\ | |
| 42 | & & else $(der\, c\, r_1) \cdot r_2$\\ | |
| 43 | $der\, c\, (r^*)$ & $\dn$ & $(der\, c\, r) \cdot (r^*)$\\ | |
| 44 |   \end{tabular}
 | |
| 45 | \end{center}
 | |
| 46 | ||
| 47 | \noindent | |
| 48 | With our definition of regular expressions comes an induction principle. Given a property $P$ over | |
| 49 | regular expressions. We can establish that $\forall r.\; P(r)$ holds, provided we can show the following: | |
| 50 | ||
| 51 | \begin{enumerate}
 | |
| 402 
55f097ab96c9
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
388diff
changeset | 52 | \item $P(\ZERO)$, $P(\ONE)$ and $P(c)$ all hold, | 
| 20 | 53 | \item $P(r_1 + r_2)$ holds under the induction hypotheses that | 
| 54 | $P(r_1)$ and $P(r_2)$ hold, | |
| 55 | \item $P(r_1 \cdot r_2)$ holds under the induction hypotheses that | |
| 56 | $P(r_1)$ and $P(r_2)$ hold, and | |
| 57 | \item $P(r^*)$ holds under the induction hypothesis that $P(r)$ holds. | |
| 58 | \end{enumerate}
 | |
| 59 | ||
| 60 | \noindent | |
| 61 | Let us try out an induction proof. Recall the definition | |
| 62 | ||
| 63 | \begin{center}
 | |
| 64 | $Der\, c\, A \dn \{ s\;\mid\; c\!::\!s \in A\}$
 | |
| 65 | \end{center}
 | |
| 66 | ||
| 67 | \noindent | |
| 68 | whereby $A$ is a set of strings. We like to prove | |
| 69 | ||
| 70 | \begin{center}
 | |
| 71 | \begin{tabular}{l}
 | |
| 72 | $P(r) \dn $ \hspace{4mm} $L(der\,c\,r) = Der\,c\,(L(r))$
 | |
| 73 | \end{tabular}
 | |
| 74 | \end{center}
 | |
| 75 | ||
| 76 | \noindent | |
| 77 | by induction over the regular expression $r$. | |
| 78 | ||
| 79 | ||
| 80 | \newpage | |
| 81 | \noindent | |
| 82 | {\bf Proof}
 | |
| 83 | ||
| 84 | \noindent | |
| 402 
55f097ab96c9
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
388diff
changeset | 85 | According to 1.~above we need to prove $P(\ZERO)$, $P(\ONE)$ and $P(d)$. Lets do this in turn. | 
| 20 | 86 | |
| 87 | \begin{itemize}
 | |
| 402 
55f097ab96c9
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
388diff
changeset | 88 | \item First Case: $P(\ZERO)$ is $L(der\,c\,\ZERO) = Der\,c\,(L(\ZERO))$ (a). We have $der\,c\,\ZERO = \ZERO$ | 
| 953 | 89 | and $L(\ZERO) = \emptyset$. We also have $Der\,c\,L(\ZERO) = L(\ZERO$). Hence we have $\emptyset = \emptyset$ in (a). | 
| 20 | 90 | |
| 402 
55f097ab96c9
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
388diff
changeset | 91 | \item Second Case: $P(\ONE)$ is $L(der\,c\,\ONE) = Der\,c\,(L(\ONE))$ (b). We have $der\,c\,\ONE = \ZERO$, | 
| 953 | 92 | $L(\ZERO) = \ZERO$ and $L(\ONE) = \{\texttt{""}\}$. We also have $Der\,c\,\{\texttt{""}\} = \emptyset$. Hence we have 
 | 
| 93 | $\emptyset = \emptyset$ in (b). | |
| 20 | 94 | |
| 95 | \item Third Case: $P(d)$ is $L(der\,c\,d) = Der\,c\,(L(d))$ (c). We need to treat the cases $d = c$ and $d \not= c$. | |
| 96 | ||
| 402 
55f097ab96c9
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
388diff
changeset | 97 | $d = c$: We have $der\,c\,c = \ONE$ and $L(\ONE) = \{\texttt{""}\}$. 
 | 
| 20 | 98 | We also have $L(c) = \{\texttt{"}c\texttt{"}\}$ and $Der\,c\,\{\texttt{"}c\texttt{"}\} = \{\texttt{""}\}$. Hence we have 
 | 
| 99 | $\{\texttt{""}\} = \{\texttt{""}\}$ in (c). 
 | |
| 100 | ||
| 402 
55f097ab96c9
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
388diff
changeset | 101 | $d \not=c$: We have $der\,c\,d = \ZERO$. | 
| 953 | 102 | We also have $Der\,c\,\{\texttt{"}d\texttt{"}\} = \emptyset$. Hence we have 
 | 
| 103 | $\emptyset = \emptyset$ in (c). | |
| 34 | 104 | \end{itemize}
 | 
| 20 | 105 | |
| 34 | 106 | \noindent | 
| 107 | These were the easy base cases. Now come the inductive cases. | |
| 108 | ||
| 109 | \begin{itemize}
 | |
| 20 | 110 | \item Fourth Case: $P(r_1 + r_2)$ is $L(der\,c\,(r_1 + r_2)) = Der\,c\,(L(r_1 + r_2))$ (d). This is what we have to show. | 
| 111 | We can assume already: | |
| 112 | ||
| 113 | \begin{center}
 | |
| 114 | \begin{tabular}{ll}
 | |
| 115 | $P(r_1)$: & $L(der\,c\,r_1) = Der\,c\,(L(r_1))$ (I)\\ | |
| 116 | $P(r_2)$: & $L(der\,c\,r_2) = Der\,c\,(L(r_2))$ (II) | |
| 117 | \end{tabular}
 | |
| 118 | \end{center}
 | |
| 119 | ||
| 120 | We have that $der\,c\,(r_1 + r_2) = (der\,c\,r_1) + (der\,c\,r_2)$ and also $L((der\,c\,r_1) + (der\,c\,r_2)) = L(der\,c\,r_1) \cup L(der\,c\,r_2)$. | |
| 121 | By (I) and (II) we know that the left-hand side is $Der\,c\,(L(r_1)) \cup Der\,c\,(L(r_2))$. You need to ponder a bit, but you should see | |
| 122 | that | |
| 123 | ||
| 124 | \begin{center}
 | |
| 125 | $Der\,c(A \cup B) = (Der\,c\,A) \cup (Der\,c\,B)$ | |
| 126 | \end{center}
 | |
| 127 | ||
| 128 | holds for every set of strings $A$ and $B$. That means the right-hand side of (d) is also $Der\,c\,(L(r_1)) \cup Der\,c\,(L(r_2))$, | |
| 129 | because $L(r_1 + r_2) = L(r_1) \cup L(r_2)$. And we are done with the fourth case. | |
| 130 | ||
| 21 | 131 | \item Fifth Case: $P(r_1 \cdot r_2)$ is $L(der\,c\,(r_1 \cdot r_2)) = Der\,c\,(L(r_1 \cdot r_2))$ (e). We can assume already: | 
| 132 | ||
| 133 | \begin{center}
 | |
| 134 | \begin{tabular}{ll}
 | |
| 135 | $P(r_1)$: & $L(der\,c\,r_1) = Der\,c\,(L(r_1))$ (I)\\ | |
| 136 | $P(r_2)$: & $L(der\,c\,r_2) = Der\,c\,(L(r_2))$ (II) | |
| 137 | \end{tabular}
 | |
| 138 | \end{center}
 | |
| 139 | ||
| 140 | Let us first consider the case where $nullable(r_1)$ holds. Then | |
| 141 | ||
| 142 | \[ | |
| 143 | der\,c\,(r_1 \cdot r_2) = ((der\,c\,r_1) \cdot r_2) + (der\,c\,r_2). | |
| 144 | \] | |
| 145 | ||
| 146 | The corresponding language of the right-hand side is | |
| 147 | ||
| 148 | \[ | |
| 149 | (L(der\,c\,r_1) \,@\, L(r_2)) \cup L(der\,c\,r_2). | |
| 150 | \] | |
| 151 | ||
| 152 | By the induction hypotheses (I) and (II), this is equal to | |
| 153 | ||
| 154 | \[ | |
| 155 | (Der\,c\,(L(r_1)) \,@\, L(r_2)) \cup (Der\,c\,(L(r_2)).\;\;(**) | |
| 156 | \] | |
| 157 | ||
| 158 | We also know that $L(r_1 \cdot r_2) = L(r_1) \,@\,L(r_2)$. We have to know what | |
| 159 | $Der\,c\,(L(r_1) \,@\,L(r_2))$ is. | |
| 160 | ||
| 161 | Let us analyse what | |
| 162 | $Der\,c\,(A \,@\, B)$ is for arbitrary sets of strings $A$ and $B$. If $A$ does \emph{not}
 | |
| 163 | contain the empty string, then every string in $A\,@\,B$ is of the form $s_1 \,@\, s_2$ where | |
| 164 | $s_1 \in A$ and $s_2 \in B$. So if $s_1$ starts with $c$ then we just have to remove it. Consequently, | |
| 165 | $Der\,c\,(A \,@\, B) = (Der\,c\,(A)) \,@\, B$. This case does not apply here though, because we already | |
| 166 | proved that if $r_1$ is nullable, then $L(r_1)$ contains the empty string. In this case, every string | |
| 167 | in $A\,@\,B$ is either of the form $s_1 \,@\, s_2$, with $s_1 \in A$ and $s_2 \in B$, or | |
| 168 | $s_3$ with $s_3 \in B$. This means $Der\,c\,(A \,@\, B) = ((Der\,c\,(A)) \,@\, B) \cup Der\,c\,B$. | |
| 169 | But this proves that (**) is $Der\,c\,(L(r_1) \,@\, L(r_2))$. | |
| 170 | ||
| 171 | Similarly in the case where $r_1$ is \emph{not} nullable.
 | |
| 20 | 172 | |
| 34 | 173 | \item Sixth Case: $P(r^*)$ is $L(der\,c\,(r^*)) = Der\,c\,L(r^*)$. We can assume already: | 
| 174 | ||
| 175 | \begin{center}
 | |
| 176 | \begin{tabular}{ll}
 | |
| 177 | $P(r)$: & $L(der\,c\,r) = Der\,c\,(L(r))$ (I) | |
| 178 | \end{tabular}
 | |
| 179 | \end{center}
 | |
| 180 | ||
| 181 | We have $der\,c\,(r^*) = der\,c\,r\cdot r^*$. Which means $L(der\,c\,(r^*)) = L(der\,c\,r\cdot r^*)$ and | |
| 182 | further $L(der\,c\,r) \,@\, L(r^*)$. By induction hypothesis (I) we know that is equal to | |
| 183 | $(Der\,c\,L(r)) \,@\, L(r^*)$. (*) | |
| 184 | ||
| 20 | 185 | \end{itemize}
 | 
| 186 | ||
| 34 | 187 | Let us now analyse $Der\,c\,L(r^*)$, which is equal to $Der\,c\,((L(r))^*)$. Now $(L(r))^*$ is defined | 
| 388 
66f66f1710ed
added
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
102diff
changeset | 188 | as $\bigcup_{n \ge 0} L(r)^n$. We can write this as $L(r)^0 \cup \bigcup_{n \ge 1} L(r)^n$, where we just 
 | 
| 34 | 189 | separated the first union and then let the ``big-union'' start from $1$. Form this we can already infer | 
| 190 | ||
| 191 | \begin{center}
 | |
| 388 
66f66f1710ed
added
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
102diff
changeset | 192 | $Der\,c\,(L(r^*)) = Der\,c\,(L(r)^0 \cup \bigcup_{n \ge 1} L(r)^n) = (Der\,c\,L(r)^0) \cup Der\,c\,(\bigcup_{n \ge 1} L(r)^n)$
 | 
| 34 | 193 | \end{center}
 | 
| 194 | ||
| 953 | 195 | The first union ``disappears'' since $Der\,c\,(L(r)^0) = \emptyset$. | 
| 34 | 196 | |
| 197 | ||
| 20 | 198 | \end{document}
 | 
| 199 | ||
| 200 | %%% Local Variables: | |
| 201 | %%% mode: latex | |
| 202 | %%% TeX-master: t | |
| 203 | %%% End: |