55 \end{center} |
55 \end{center} |
56 The union of all the natural number powers of a language |
56 The union of all the natural number powers of a language |
57 is denoted by the Kleene star operator: |
57 is denoted by the Kleene star operator: |
58 \begin{center} |
58 \begin{center} |
59 \begin{tabular}{lcl} |
59 \begin{tabular}{lcl} |
60 $\bigcup_{i \geq 0} A^i$ & $\denote$ & $A^*$\\ |
60 $A*$ & $\dn$ & $\bigcup_{i \geq 0} A^i$ \\ |
61 \end{tabular} |
61 \end{tabular} |
62 \end{center} |
62 \end{center} |
63 |
63 |
64 In Isabelle of course we cannot easily get a counterpart of |
64 To get an induction principle in Isabelle/HOL, |
65 the $\bigcup_{i \geq 0}$ operator, so we instead define the Kleene star |
65 we instead define the Kleene star |
66 as an inductive set: |
66 as an inductive set: |
67 \begin{center} |
67 \begin{center} |
68 \begin{tabular}{lcl} |
68 \begin{tabular}{lcl} |
69 $[] \in A^*$ & &\\ |
69 $[] \in A*$ & &\\ |
70 $s_1 \in A \land \; s_2 \in A^* $ & $\implies$ & $s_1 @ s_2 \in A^*$\\ |
70 $s_1 \in A \land \; s_2 \in A* $ & $\implies$ & $s_1 @ s_2 \in A*$\\ |
71 \end{tabular} |
71 \end{tabular} |
72 \end{center} |
72 \end{center} |
73 \subsection{Language Derivatives} |
73 \subsection{Language Derivatives} |
74 We also define an operation of chopping off a character from |
74 We also define an operation of chopping off a character from |
75 a language: |
75 a language: |
87 \end{center} |
87 \end{center} |
88 |
88 |
89 which is essentially the left quotient $A \backslash L'$ of $A$ against |
89 which is essentially the left quotient $A \backslash L'$ of $A$ against |
90 the singleton language $L' = \{w\}$ |
90 the singleton language $L' = \{w\}$ |
91 in formal language theory. |
91 in formal language theory. |
92 For this dissertation the $\textit{Ders}$ notation would suffice, there is |
92 For this dissertation the $\textit{Ders}$ definition with |
93 no need for a more general derivative definition. |
93 a single string suffices. |
94 |
94 |
95 With the sequencing, Kleene star, and $\textit{Der}$ operator on languages, |
95 With the sequencing, Kleene star, and $\textit{Der}$ operator on languages, |
96 we have a few properties of how the language derivative can be defined using |
96 we have a few properties of how the language derivative can be defined using |
97 sub-languages. |
97 sub-languages. |
98 \begin{lemma} |
98 \begin{lemma} |
99 $\Der \; c \; (A @ B) = \textit{if} \; [] \in A \; \textit{then} ((\Der \; c \; A) @ B ) \cup \Der \; c\; B \quad \textit{else}\; (\Der \; c \; A) @ B$ |
99 \[ |
|
100 \Der \; c \; (A @ B) = |
|
101 \begin{cases} |
|
102 ((\Der \; c \; A) @ B ) \cup \Der \; c\; B, & \text{if} \; [] \in A \\ |
|
103 (\Der \; c \; A) @ B, & \text{otherwise} |
|
104 \end{cases} |
|
105 \] |
100 \end{lemma} |
106 \end{lemma} |
101 \noindent |
107 \noindent |
102 This lemma states that if $A$ contains the empty string, $\Der$ can "pierce" through it |
108 This lemma states that if $A$ contains the empty string, $\Der$ can "pierce" through it |
103 and get to $B$. |
109 and get to $B$. |
104 |
110 |
105 The language $A^*$'s derivative can be described using the language derivative |
111 The language $A*$'s derivative can be described using the language derivative |
106 of $A$: |
112 of $A$: |
107 \begin{lemma} |
113 \begin{lemma} |
108 $\textit{Der} \;c \;A^* = (\textit{Der}\; c A) @ (A^*)$\\ |
114 $\textit{Der} \;c \;A* = (\textit{Der}\; c A) @ (A*)$\\ |
109 \end{lemma} |
115 \end{lemma} |
110 \begin{proof} |
116 \begin{proof} |
111 \begin{itemize} |
117 \begin{itemize} |
112 \item{$\subseteq$} |
118 \item{$\subseteq$} |
113 The set |
119 The set |
114 \[ \{s \mid c :: s \in A^*\} \] |
120 \[ \{s \mid c :: s \in A*\} \] |
115 is enclosed in the set |
121 is enclosed in the set |
116 \[ \{s_1 @ s_2 \mid s_1 \, s_2. s_1 \in \{s \mid c :: s \in A\} \land s_2 \in A^* \} \] |
122 \[ \{s_1 @ s_2 \mid s_1 \, s_2. s_1 \in \{s \mid c :: s \in A\} \land s_2 \in A* \} \] |
117 because whenever you have a string starting with a character |
123 because whenever you have a string starting with a character |
118 in the language of a Kleene star $A^*$, then that character together with some sub-string |
124 in the language of a Kleene star $A*$, then that character together with some sub-string |
119 immediately after it will form the first iteration, and the rest of the string will |
125 immediately after it will form the first iteration, and the rest of the string will |
120 be still in $A^*$. |
126 be still in $A*$. |
121 \item{$\supseteq$} |
127 \item{$\supseteq$} |
122 Note that |
128 Note that |
123 \[ \Der \; c \; A^* = \Der \; c \; (\{ [] \} \cup (A @ A^*) ) \] |
129 \[ \Der \; c \; A* = \Der \; c \; (\{ [] \} \cup (A @ A*) ) \] |
124 and |
130 and |
125 \[ \Der \; c \; (\{ [] \} \cup (A @ A^*) ) = \Der\; c \; (A @ A^*) \] |
131 \[ \Der \; c \; (\{ [] \} \cup (A @ A*) ) = \Der\; c \; (A @ A*) \] |
126 where the $\textit{RHS}$ of the above equatioin can be rewritten |
132 where the $\textit{RHS}$ of the above equatioin can be rewritten |
127 as \[ (\Der \; c\; A) @ A^* \cup A' \], $A'$ being a possibly empty set. |
133 as \[ (\Der \; c\; A) @ A* \cup A' \], $A'$ being a possibly empty set. |
128 \end{itemize} |
134 \end{itemize} |
129 \end{proof} |
135 \end{proof} |
130 Before we define the $\textit{Der}$ and $\textit{Ders}$ counterpart |
136 Before we define the $\textit{Der}$ and $\textit{Ders}$ counterpart |
131 for regular languages, we need to first give definitions for regular expressions. |
137 for regular languages, we need to first give definitions for regular expressions. |
132 |
138 |
133 \subsection{Regular Expressions and Their Language Interpretation} |
139 \subsection{Regular Expressions and Their Meaning} |
134 Suppose we have an alphabet $\Sigma$, the strings whose characters |
140 Suppose we have an alphabet $\Sigma$, the strings whose characters |
135 are from $\Sigma$ |
141 are from $\Sigma$ |
136 can be expressed as $\Sigma^*$. |
142 can be expressed as $\Sigma^*$. |
137 |
143 |
138 We use patterns to define a set of strings concisely. Regular expressions |
144 We use patterns to define a set of strings concisely. Regular expressions |
149 The language or set of strings denoted |
155 The language or set of strings denoted |
150 by regular expressions are defined as |
156 by regular expressions are defined as |
151 %TODO: FILL in the other defs |
157 %TODO: FILL in the other defs |
152 \begin{center} |
158 \begin{center} |
153 \begin{tabular}{lcl} |
159 \begin{tabular}{lcl} |
|
160 $L \; (\ZERO)$ & $\dn$ & $\phi$\\ |
|
161 $L \; (\ONE)$ & $\dn$ & $\{[]\}$\\ |
|
162 $L \; (c)$ & $\dn$ & $\{[c]\}$\\ |
154 $L \; (r_1 + r_2)$ & $\dn$ & $ L \; (r_1) \cup L \; ( r_2)$\\ |
163 $L \; (r_1 + r_2)$ & $\dn$ & $ L \; (r_1) \cup L \; ( r_2)$\\ |
155 $L \; (r_1 \cdot r_2)$ & $\dn$ & $ L \; (r_1) \cap L \; (r_2)$\\ |
164 $L \; (r_1 \cdot r_2)$ & $\dn$ & $ L \; (r_1) \cap L \; (r_2)$\\ |
156 \end{tabular} |
165 $L \; (r^*)$ & $\dn$ & $ (L(r))^*$ |
157 \end{center} |
166 \end{tabular} |
|
167 \end{center} |
|
168 \noindent |
158 Which is also called the "language interpretation" of |
169 Which is also called the "language interpretation" of |
159 a regular expression. |
170 a regular expression. |
160 |
171 |
161 Now with semantic derivatives of a language and regular expressions and |
172 Now with semantic derivatives of a language and regular expressions and |
162 their language interpretations in place, we are ready to define derivatives on regexes. |
173 their language interpretations in place, we are ready to define derivatives on regexes. |
163 \subsection{Brzozowski Derivatives and a Regular Expression Matcher} |
174 \subsection{Brzozowski Derivatives and a Regular Expression Matcher} |
164 The language derivatives acts on a string set and chops off a character from |
175 The language derivative acts on a string set and chops off a character from |
165 all strings in that set, we want to define a derivative operation on regular expressions |
176 all strings in that set, we want to define a derivative operation on regular expressions |
166 so that after derivative $L(r\backslash c)$ |
177 so that after derivative $L(r\backslash c)$ |
167 will look as if it was obtained by doing a language derivative on $L(r)$: |
178 will look as if it was obtained by doing a language derivative on $L(r)$: |
168 \[ |
179 \[ |
169 L(r \backslash c) = \Der \; c \; L(r) |
180 L(r \backslash c) = \Der \; c \; L(r) |