141 Regular languages are an important and well-understood subject in Computer |
141 Regular languages are an important and well-understood subject in Computer |
142 Science, with many beautiful theorems and many useful algorithms. There is a |
142 Science, with many beautiful theorems and many useful algorithms. There is a |
143 wide range of textbooks on this subject, many of which are aimed at students |
143 wide range of textbooks on this subject, many of which are aimed at students |
144 and contain very detailed `pencil-and-paper' proofs (e.g.~\cite{Kozen97, |
144 and contain very detailed `pencil-and-paper' proofs (e.g.~\cite{Kozen97, |
145 HopcroftUllman69}). It seems natural to exercise theorem provers by |
145 HopcroftUllman69}). It seems natural to exercise theorem provers by |
146 formalising the theorems and by verifying formally the algorithms. A |
146 formalising the theorems and by verifying formally the algorithms. |
147 popular choice for a theorem prover would be one based on Higher-Order Logic |
147 |
148 (HOL), for example HOL4, HOLlight or Isabelle/HOL. For the development |
148 A popular choice for a theorem prover would be one based on Higher-Order |
|
149 Logic (HOL), for example HOL4, HOLlight or Isabelle/HOL. For the development |
149 presented in this paper we will use the latter. HOL is a predicate calculus |
150 presented in this paper we will use the latter. HOL is a predicate calculus |
150 that allows quantification over predicate variables. Its type system is |
151 that allows quantification over predicate variables. Its type system is |
151 based on Church's Simple Theory of Types \cite{Church40}. Although |
152 based on Church's Simple Theory of Types \cite{Church40}. Although many |
152 many mathematical concepts can be conveniently expressed in HOL, there are some |
153 mathematical concepts can be conveniently expressed in HOL, there are some |
153 limitations that hurt badly, if one attempts a simple-minded formalisation |
154 limitations that hurt badly, if one attempts a simple-minded formalisation |
154 of regular languages in it. |
155 of regular languages in it. The typical approach to regular languages is to |
155 |
156 introduce finite automata and then define everything in terms of them |
156 The typical approach to regular languages is to introduce finite automata |
157 \cite{Kozen97}. For example, a regular language is normally defined as: |
157 and then define everything in terms of them \cite{Kozen97}. For example, |
|
158 a regular language is normally defined as: |
|
159 |
158 |
160 \begin{dfntn}\label{baddef} |
159 \begin{dfntn}\label{baddef} |
161 A language @{text A} is \emph{regular}, provided there is a |
160 A language @{text A} is \emph{regular}, provided there is a |
162 finite deterministic automaton that recognises all strings of @{text "A"}. |
161 finite deterministic automaton that recognises all strings of @{text "A"}. |
163 \end{dfntn} |
162 \end{dfntn} |
250 pairs. Using this definition for disjoint union means we do not have a |
249 pairs. Using this definition for disjoint union means we do not have a |
251 single type for automata. As a result we will not be able to define a regular |
250 single type for automata. As a result we will not be able to define a regular |
252 language as one for which there exists an automaton that recognises all its |
251 language as one for which there exists an automaton that recognises all its |
253 strings. This is because we cannot make a definition in HOL that is polymorphic in |
252 strings. This is because we cannot make a definition in HOL that is polymorphic in |
254 the state type and there is no type quantification available in HOL (unlike |
253 the state type and there is no type quantification available in HOL (unlike |
255 in Coq, for example). |
254 in Coq, for example).\footnote{Slind already pointed out this problem in an email |
|
255 to the HOL4 mailing list on 21st April 2005.} |
256 |
256 |
257 An alternative, which provides us with a single type for automata, is to give every |
257 An alternative, which provides us with a single type for automata, is to give every |
258 state node an identity, for example a natural |
258 state node an identity, for example a natural |
259 number, and then be careful to rename these identities apart whenever |
259 number, and then be careful to rename these identities apart whenever |
260 connecting two automata. This results in clunky proofs |
260 connecting two automata. This results in clunky proofs |
264 |
264 |
265 Functions are much better supported in Isabelle/HOL, but they still lead to similar |
265 Functions are much better supported in Isabelle/HOL, but they still lead to similar |
266 problems as with graphs. Composing, for example, two non-deterministic automata in parallel |
266 problems as with graphs. Composing, for example, two non-deterministic automata in parallel |
267 requires also the formalisation of disjoint unions. Nipkow \cite{Nipkow98} |
267 requires also the formalisation of disjoint unions. Nipkow \cite{Nipkow98} |
268 dismisses for this the option of using identities, because it leads according to |
268 dismisses for this the option of using identities, because it leads according to |
269 him to ``messy proofs''. Since he does not need to define what a regular |
269 him to ``messy proofs''. Since he does not need to define what regular |
270 language is, Nipkow opts for a variant of \eqref{disjointunion} using bit lists, but writes |
270 languages are, Nipkow opts for a variant of \eqref{disjointunion} using bit lists, but writes |
271 |
271 |
272 \begin{quote} |
272 \begin{quote} |
273 \it% |
273 \it% |
274 \begin{tabular}{@ {}l@ {}p{0.88\textwidth}@ {}} |
274 \begin{tabular}{@ {}l@ {}p{0.88\textwidth}@ {}} |
275 `` & All lemmas appear obvious given a picture of the composition of automata\ldots |
275 `` & All lemmas appear obvious given a picture of the composition of automata\ldots |
315 `sink' state from which there is no connection to a final state (Brzozowski |
315 `sink' state from which there is no connection to a final state (Brzozowski |
316 mentions this side-condition in the context of state complexity |
316 mentions this side-condition in the context of state complexity |
317 of automata \cite{Brzozowski10}). Such side-conditions mean that if we define a regular |
317 of automata \cite{Brzozowski10}). Such side-conditions mean that if we define a regular |
318 language as one for which there exists \emph{a} finite automaton that |
318 language as one for which there exists \emph{a} finite automaton that |
319 recognises all its strings (see Def.~\ref{baddef}), then we need a lemma which |
319 recognises all its strings (see Def.~\ref{baddef}), then we need a lemma which |
320 ensures that another equivalent can be found satisfying the |
320 ensures that another equivalent one can be found satisfying the |
321 side-condition. Unfortunately, such `little' and `obvious' lemmas make |
321 side-condition. Unfortunately, such `little' and `obvious' lemmas make |
322 a formalisation of automata theory a hair-pulling experience. |
322 a formalisation of automata theory a hair-pulling experience. |
323 |
323 |
324 |
324 |
325 In this paper, we will not attempt to formalise automata theory in |
325 In this paper, we will not attempt to formalise automata theory in |
358 an argument about solving equational systems. This argument appears to be |
358 an argument about solving equational systems. This argument appears to be |
359 folklore. For the other part, we give two proofs: one direct proof using |
359 folklore. For the other part, we give two proofs: one direct proof using |
360 certain tagging-functions, and another indirect proof using Antimirov's |
360 certain tagging-functions, and another indirect proof using Antimirov's |
361 partial derivatives \cite{Antimirov95}. Again to our best knowledge, the |
361 partial derivatives \cite{Antimirov95}. Again to our best knowledge, the |
362 tagging-functions have not been used before to establish the Myhill-Nerode |
362 tagging-functions have not been used before to establish the Myhill-Nerode |
363 theorem. Derivatives of regular expressions have been recently used quite |
363 theorem. Derivatives of regular expressions have been used recently quite |
364 widely in the literature about regular expressions. However, partial |
364 widely in the literature; partial derivatives, in contrast, attracted much |
365 derivatives are more suitable in the context of the Myhill-Nerode theorem, |
365 less attention. However, partial derivatives are more suitable in the |
366 since it is easier to establish formally their finiteness result. |
366 context of the Myhill-Nerode theorem, since it is easier to establish |
367 |
367 formally their finiteness result. We have not found any proof that uses |
|
368 either of them in order to prove the Myhill-Nerode theorem. |
368 *} |
369 *} |
369 |
370 |
370 section {* Preliminaries *} |
371 section {* Preliminaries *} |
371 |
372 |
372 text {* |
373 text {* |
396 where @{text "@"} is the list-append operation. The Kleene-star of a language @{text A} |
397 where @{text "@"} is the list-append operation. The Kleene-star of a language @{text A} |
397 is defined as the union over all powers, namely @{thm star_def}. In the paper |
398 is defined as the union over all powers, namely @{thm star_def}. In the paper |
398 we will make use of the following properties of these constructions. |
399 we will make use of the following properties of these constructions. |
399 |
400 |
400 \begin{prpstn}\label{langprops}\mbox{}\\ |
401 \begin{prpstn}\label{langprops}\mbox{}\\ |
401 \begin{tabular}{@ {}ll} |
402 \begin{tabular}{@ {}lp{10cm}} |
402 (i) & @{thm star_unfold_left} \\ |
403 (i) & @{thm star_unfold_left} \\ |
403 (ii) & @{thm[mode=IfThen] pow_length}\\ |
404 (ii) & @{thm[mode=IfThen] pow_length}\\ |
404 (iii) & @{thm conc_Union_left} \\ |
405 (iii) & @{thm conc_Union_left} \\ |
|
406 (iv) & If @{thm (prem 1) star_decom} and @{thm (prem 2) star_decom} then |
|
407 there exists an @{text "x\<^isub>p"} and @{text "x\<^isub>s"} with @{text "x = x\<^isub>p @ x\<^isub>s"} |
|
408 and @{term "x\<^isub>p \<noteq> []"} such that @{term "x\<^isub>p \<in> A"} and @{term "x\<^isub>s \<in> A\<star>"}. |
405 \end{tabular} |
409 \end{tabular} |
406 \end{prpstn} |
410 \end{prpstn} |
407 |
411 |
408 \noindent |
412 \noindent |
409 In @{text "(ii)"} we use the notation @{term "length s"} for the length of a |
413 In @{text "(ii)"} we use the notation @{term "length s"} for the length of a |
1247 \noindent |
1251 \noindent |
1248 where @{text "A"} and @{text "B"} are some arbitrary languages. The reason for this choice |
1252 where @{text "A"} and @{text "B"} are some arbitrary languages. The reason for this choice |
1249 is that we need to establish that @{term "=(tag_Plus A B)="} refines @{term "\<approx>(A \<union> B)"}. |
1253 is that we need to establish that @{term "=(tag_Plus A B)="} refines @{term "\<approx>(A \<union> B)"}. |
1250 This amounts to showing @{term "x \<approx>A y"} or @{term "x \<approx>B y"} under the assumption |
1254 This amounts to showing @{term "x \<approx>A y"} or @{term "x \<approx>B y"} under the assumption |
1251 @{term "x"}~@{term "=(tag_Plus A B)="}~@{term y}. As we shall see, this definition will |
1255 @{term "x"}~@{term "=(tag_Plus A B)="}~@{term y}. As we shall see, this definition will |
1252 provide us just the right assumptions in order to get the proof through. |
1256 provide us with just the right assumptions in order to get the proof through. |
1253 |
1257 |
1254 \begin{proof}[@{const "PLUS"}-Case] |
1258 \begin{proof}[@{const "PLUS"}-Case] |
1255 We can show in general, if @{term "finite (UNIV // \<approx>A)"} and @{term "finite |
1259 We can show in general, if @{term "finite (UNIV // \<approx>A)"} and @{term "finite |
1256 (UNIV // \<approx>B)"} then @{term "finite ((UNIV // \<approx>A) \<times> (UNIV // \<approx>B))"} |
1260 (UNIV // \<approx>B)"} then @{term "finite ((UNIV // \<approx>A) \<times> (UNIV // \<approx>B))"} |
1257 holds. The range of @{term "tag_Plus A B"} is a subset of this product |
1261 holds. The range of @{term "tag_Plus A B"} is a subset of this product |
1262 "lang r\<^isub>1"} and @{text B} to @{term "lang r\<^isub>2"}. |
1266 "lang r\<^isub>1"} and @{text B} to @{term "lang r\<^isub>2"}. |
1263 \end{proof} |
1267 \end{proof} |
1264 |
1268 |
1265 \noindent |
1269 \noindent |
1266 The @{const TIMES}-case is slightly more complicated. We first prove the |
1270 The @{const TIMES}-case is slightly more complicated. We first prove the |
1267 following lemma, which will aid the refinement-proofs. |
1271 following lemma, which will aid the proof about refinement. |
1268 |
1272 |
1269 \begin{lmm}\label{refinement} |
1273 \begin{lmm}\label{refinement} |
1270 The relation @{text "\<^raw:$\threesim$>\<^bsub>tag\<^esub>"} refines @{term "\<approx>A"}, provided for |
1274 The relation @{text "\<^raw:$\threesim$>\<^bsub>tag\<^esub>"} refines @{term "\<approx>A"}, provided for |
1271 all strings @{text x}, @{text y} and @{text z} we have \mbox{@{text "x \<^raw:$\threesim$>\<^bsub>tag\<^esub> y"}} |
1275 all strings @{text x}, @{text y} and @{text z} we have \mbox{@{text "x \<^raw:$\threesim$>\<^bsub>tag\<^esub> y"}} |
1272 and @{term "x @ z \<in> A"} imply @{text "y @ z \<in> A"}. |
1276 and @{term "x @ z \<in> A"} imply @{text "y @ z \<in> A"}. |
1273 \end{lmm} |
1277 \end{lmm} |
1274 |
1278 |
1275 |
1279 |
1276 \noindent |
1280 \noindent |
1277 We therefore can clean information from how the strings @{text "x @ z"} are in @{text A} |
1281 We therefore can analyse how the strings @{text "x @ z"} are in the language |
1278 and construct appropriate tagging-functions to infer that @{term "y @ z \<in> A"}. |
1282 @{text A} and then construct an appropriate tagging-function to infer that |
1279 For the @{const Times}-case we additionally need the notion of the set of all |
1283 @{term "y @ z"} are also in @{text A}. For this we sill need the notion of |
1280 possible partitions of a string |
1284 the set of all possible \emph{partitions} of a string |
1281 |
1285 |
1282 \begin{equation} |
1286 \begin{equation} |
1283 @{thm Partitions_def} |
1287 @{thm Partitions_def} |
1284 \end{equation} |
1288 \end{equation} |
|
1289 |
|
1290 \noindent |
|
1291 If we know that @{text "(x\<^isub>p, x\<^isub>s) \<in> Partitions x"}, we will |
|
1292 refer to @{text "x\<^isub>p"} as the \emph{prefix} of the string @{text x}, |
|
1293 respectively to @{text "x\<^isub>s"} as the \emph{suffix}. |
|
1294 |
1285 |
1295 |
1286 Now assuming @{term "x @ z \<in> A \<cdot> B"} there are only two possible ways of how to `split' |
1296 Now assuming @{term "x @ z \<in> A \<cdot> B"} there are only two possible ways of how to `split' |
1287 this string to be in @{term "A \<cdot> B"}: |
1297 this string to be in @{term "A \<cdot> B"}: |
1288 % |
1298 % |
1289 \begin{center} |
1299 \begin{center} |
1348 Either @{text x} and a prefix of @{text "z"} is in @{text A} and the rest in @{text B} |
1358 Either @{text x} and a prefix of @{text "z"} is in @{text A} and the rest in @{text B} |
1349 (first picture) or there is a prefix of @{text x} in @{text A} and the rest is in @{text B} |
1359 (first picture) or there is a prefix of @{text x} in @{text A} and the rest is in @{text B} |
1350 (second picture). In both cases we have to show that @{term "y @ z \<in> A \<cdot> B"}. The first case |
1360 (second picture). In both cases we have to show that @{term "y @ z \<in> A \<cdot> B"}. The first case |
1351 we will only go through if we know that @{term "x \<approx>A y"} holds @{text "(*)"}. Because then |
1361 we will only go through if we know that @{term "x \<approx>A y"} holds @{text "(*)"}. Because then |
1352 we can infer from @{term "x @ z\<^isub>p \<in> A"} that @{term "y @ z\<^isub>p \<in> A"} holds for all @{text "z\<^isub>p"}. |
1362 we can infer from @{term "x @ z\<^isub>p \<in> A"} that @{term "y @ z\<^isub>p \<in> A"} holds for all @{text "z\<^isub>p"}. |
1353 In the second case we only know that @{text "x\<^isub>p"} and @{text "x\<^isub>s"} is a possible partition |
1363 In the second case we only know that @{text "x\<^isub>p"} and @{text "x\<^isub>s"} is one possible partition |
1354 of the string @{text x}. So we have to know that both @{text "x\<^isub>p"} and the |
1364 of the string @{text x}. We have to know that both @{text "x\<^isub>p"} and the |
1355 corresponding partition @{text "y\<^isub>p"} are in @{text "A"}, and that @{text "x\<^isub>s"} is `@{text B}-related' |
1365 corresponding partition @{text "y\<^isub>p"} are in @{text "A"}, and that @{text "x\<^isub>s"} is `@{text B}-related' |
1356 to @{text "y\<^isub>s"} @{text "(**)"}. From the latter fact we can infer that @{text "y\<^isub>s @ z \<in> B"}. |
1366 to @{text "y\<^isub>s"} @{text "(**)"}. From the latter fact we can infer that @{text "y\<^isub>s @ z \<in> B"}. |
|
1367 This will solve the second case. |
1357 Taking the two requirements, @{text "(*)"} and @{text "(**)"}, together we define the |
1368 Taking the two requirements, @{text "(*)"} and @{text "(**)"}, together we define the |
1358 tagging-function as: |
1369 tagging-function in the @{const Times}-case as: |
1359 |
1370 |
1360 \begin{center} |
1371 \begin{center} |
1361 @{thm (lhs) tag_Times_def[where ?A="A" and ?B="B"]}~@{text "\<equiv>"}~ |
1372 @{thm (lhs) tag_Times_def[where ?A="A" and ?B="B"]}~@{text "\<equiv>"}~ |
1362 @{text "(\<lbrakk>x\<rbrakk>\<^bsub>\<approx>A\<^esub>, {\<lbrakk>x\<^isub>s\<rbrakk>\<^bsub>\<approx>B\<^esub> | x\<^isub>p \<in> A \<and> (x\<^isub>p, x\<^isub>s) \<in> Partitions x})"} |
1373 @{text "(\<lbrakk>x\<rbrakk>\<^bsub>\<approx>A\<^esub>, {\<lbrakk>x\<^isub>s\<rbrakk>\<^bsub>\<approx>B\<^esub> | x\<^isub>p \<in> A \<and> (x\<^isub>p, x\<^isub>s) \<in> Partitions x})"} |
1363 \end{center} |
1374 \end{center} |
1364 |
1375 |
1365 \noindent |
1376 \noindent |
1366 With this definition in place, we can discharge the @{const "Times"}-case as follows. |
1377 We have to make the assumption for all suffixes @{text "x\<^isub>s"}, since we do |
|
1378 not know anything about how the string @{term x} is partitioned. |
|
1379 With this definition in place, let us prove the @{const "Times"}-case. |
1367 |
1380 |
1368 |
1381 |
1369 \begin{proof}[@{const TIMES}-Case] |
1382 \begin{proof}[@{const TIMES}-Case] |
1370 If @{term "finite (UNIV // \<approx>A)"} and @{term "finite (UNIV // \<approx>B)"} |
1383 If @{term "finite (UNIV // \<approx>A)"} and @{term "finite (UNIV // \<approx>B)"} |
1371 then @{term "finite ((UNIV // \<approx>A) \<times> (Pow (UNIV // \<approx>B)))"} holds. The range of |
1384 then @{term "finite ((UNIV // \<approx>A) \<times> (Pow (UNIV // \<approx>B)))"} holds. The range of |
1372 @{term "tag_Times A B"} is a subset of this product set, and therefore finite. |
1385 @{term "tag_Times A B"} is a subset of this product set, and therefore finite. |
1373 By Lemma \ref{refinement} we know |
1386 For the refinement of @{term "\<approx>(A \<cdot> B)"} and @{text "\<^raw:$\threesim$>\<^bsub>\<times>tag A B\<^esub>"}, |
|
1387 we have by Lemma \ref{refinement} |
1374 |
1388 |
1375 \begin{center} |
1389 \begin{center} |
1376 @{term "tag_Times A B x = tag_Times A B y"} |
1390 @{term "tag_Times A B x = tag_Times A B y"} |
1377 \end{center} |
1391 \end{center} |
1378 |
1392 |
1379 \noindent |
1393 \noindent |
1380 and @{term "x @ z \<in> A \<cdot> B"}, and have to establish @{term "y @ z \<in> A \<cdot> B"}. As shown |
1394 and @{term "x @ z \<in> A \<cdot> B"}, and have to establish @{term "y @ z \<in> A \<cdot> |
1381 above, there are two cases to be considered (see pictures above). First, |
1395 B"}. As shown in the pictures above, there are two cases to be |
1382 there exists a @{text "z\<^isub>p"} and @{text "z\<^isub>s"} such that @{term "x @ z\<^isub>p \<in> A"} and @{text "z\<^isub>s \<in> B"}. |
1396 considered. First, there exists a @{text "z\<^isub>p"} and @{text |
1383 By the assumption about @{term "tag_Times A B"} we have |
1397 "z\<^isub>s"} such that @{term "x @ z\<^isub>p \<in> A"} and @{text "z\<^isub>s |
1384 @{term "\<approx>A `` {x} = \<approx>A `` {y}"} and thus @{term "x \<approx>A y"}. Which means by the Myhill-Nerode |
1398 \<in> B"}. By the assumption about @{term "tag_Times A B"} we have @{term "\<approx>A |
1385 relation that @{term "y @ z\<^isub>p \<in> A"} holds. Using @{text "z\<^isub>s \<in> B"}, we can conclude in this case |
1399 `` {x} = \<approx>A `` {y}"} and thus @{term "x \<approx>A y"}. Hence by the Myhill-Nerode |
1386 with @{term "y @ z \<in> A \<cdot> B"}. |
1400 relation @{term "y @ z\<^isub>p \<in> A"} holds. Using @{text "z\<^isub>s \<in> B"}, |
|
1401 we can conclude in this case with @{term "y @ z \<in> A \<cdot> B"} (recall @{text |
|
1402 "z\<^isub>p @ z\<^isub>s = z"}). |
1387 |
1403 |
1388 Second there exists a partition @{text "x\<^isub>p"} and @{text "x\<^isub>s"} with |
1404 Second there exists a partition @{text "x\<^isub>p"} and @{text "x\<^isub>s"} with |
1389 @{text "x\<^isub>p \<in> A"} and @{text "x\<^isub>s @ z \<in> B"}. We therefore have |
1405 @{text "x\<^isub>p \<in> A"} and @{text "x\<^isub>s @ z \<in> B"}. We therefore have |
1390 |
1406 |
1391 \begin{center} |
1407 \begin{center} |
1401 |
1417 |
1402 \noindent |
1418 \noindent |
1403 This means there must be a partition @{text "y\<^isub>p"} and @{text "y\<^isub>s"} |
1419 This means there must be a partition @{text "y\<^isub>p"} and @{text "y\<^isub>s"} |
1404 such that @{term "y\<^isub>p \<in> A"} and @{term "\<approx>B `` {x\<^isub>s} = \<approx>B `` |
1420 such that @{term "y\<^isub>p \<in> A"} and @{term "\<approx>B `` {x\<^isub>s} = \<approx>B `` |
1405 {y\<^isub>s}"}. Unfolding the Myhill-Nerode relation and together with the |
1421 {y\<^isub>s}"}. Unfolding the Myhill-Nerode relation and together with the |
1406 facts that @{text "x\<^isub>p \<in> A"} and @{text "x\<^isub>s @ z \<in> B"}, we |
1422 facts that @{text "x\<^isub>p \<in> A"} and \mbox{@{text "x\<^isub>s @ z \<in> B"}}, we |
1407 obtain @{term "y\<^isub>p \<in> A"} and @{text "y\<^isub>s @ z \<in> B"}, as needed in |
1423 obtain @{term "y\<^isub>p \<in> A"} and @{text "y\<^isub>s @ z \<in> B"}, as needed in |
1408 this case. We again can complete the @{const TIMES}-case by setting @{text |
1424 this case. We again can complete the @{const TIMES}-case by setting @{text |
1409 A} to @{term "lang r\<^isub>1"} and @{text B} to @{term "lang r\<^isub>2"}. |
1425 A} to @{term "lang r\<^isub>1"} and @{text B} to @{term "lang r\<^isub>2"}. |
1410 \end{proof} |
1426 \end{proof} |
1411 |
1427 |
1412 \noindent |
1428 \noindent |
1413 The case for @{const Star} is similar to @{const TIMES}, but poses a few |
1429 The case for @{const Star} is similar to @{const TIMES}, but poses a few |
1414 extra challenges. To deal with them we define first the notion of a \emph{string |
1430 extra challenges. To deal with them, we define first the notion of a \emph{string |
1415 prefix} and a \emph{strict string prefix}: |
1431 prefix} and a \emph{strict string prefix}: |
1416 |
1432 |
1417 \begin{center} |
1433 \begin{center} |
1418 \begin{tabular}{l} |
1434 \begin{tabular}{l} |
1419 @{text "x \<le> y \<equiv> \<exists>z. y = x @ z"}\\ |
1435 @{text "x \<le> y \<equiv> \<exists>z. y = x @ z"}\\ |
1420 @{text "x < y \<equiv> x \<le> y \<and> x \<noteq> y"} |
1436 @{text "x < y \<equiv> x \<le> y \<and> x \<noteq> y"} |
1421 \end{tabular} |
1437 \end{tabular} |
1422 \end{center} |
1438 \end{center} |
1423 |
1439 |
1424 \noindent |
1440 When analysing the case of @{text "x @ z"} being an element in @{term "A\<star>"} |
1425 When we analyse the case of @{text "x @ z"} being an element in @{term "A\<star>"} |
|
1426 and @{text x} is not the empty string, we have the following picture: |
1441 and @{text x} is not the empty string, we have the following picture: |
1427 |
1442 |
1428 \begin{center} |
1443 \begin{center} |
1429 \scalebox{1}{ |
1444 \scalebox{1}{ |
1430 \begin{tikzpicture} |
1445 \begin{tikzpicture} |
1464 \end{center} |
1479 \end{center} |
1465 % |
1480 % |
1466 \noindent |
1481 \noindent |
1467 We can find a strict prefix @{text "x\<^isub>p"} of @{text x} such that @{term "x\<^isub>p \<in> A\<star>"}, |
1482 We can find a strict prefix @{text "x\<^isub>p"} of @{text x} such that @{term "x\<^isub>p \<in> A\<star>"}, |
1468 @{text "x\<^isub>p < x"} and the rest @{term "x\<^isub>s @ z \<in> A\<star>"}. For example the empty string |
1483 @{text "x\<^isub>p < x"} and the rest @{term "x\<^isub>s @ z \<in> A\<star>"}. For example the empty string |
1469 @{text "[]"} would do. |
1484 @{text "[]"} would do (recall @{term "x \<noteq> []"}). |
1470 There are potentially many such prefixes, but there can only be finitely many of them (the |
1485 There are potentially many such prefixes, but there can only be finitely many of them (the |
1471 string @{text x} is finite). Let us therefore choose the longest one and call it |
1486 string @{text x} is finite). Let us therefore choose the longest one and call it |
1472 @{text "x\<^bsub>pmax\<^esub>"}. Now for the rest of the string @{text "x\<^isub>s @ z"} we |
1487 @{text "x\<^bsub>pmax\<^esub>"}. Now for the rest of the string @{text "x\<^isub>s @ z"} we |
1473 know it is in @{term "A\<star>"} and it is not the empty string. By Prop.~\ref{langprops}@{text "(i)"}, |
1488 know it is in @{term "A\<star>"} and cannot be the empty string. By Prop.~\ref{langprops}@{text "(iv)"}, |
1474 we can separate |
1489 we can separate |
1475 this string into two parts, say @{text "a"} and @{text "b"}, such that @{text "a \<in> A"} |
1490 this string into two parts, say @{text "a"} and @{text "b"}, such that @{text "a \<noteq> []"}, @{text "a \<in> A"} |
1476 and @{term "b \<in> A\<star>"}. Now @{text a} must be strictly longer than @{text "x\<^isub>s"}, |
1491 and @{term "b \<in> A\<star>"}. Now @{text a} must be strictly longer than @{text "x\<^isub>s"}, |
1477 otherwise @{text "x\<^bsub>pmax\<^esub>"} is not the longest prefix. That means @{text a} |
1492 otherwise @{text "x\<^bsub>pmax\<^esub>"} is not the longest prefix. That means @{text a} |
1478 `overlaps' with @{text z}, splitting it into two components @{text "z\<^isub>a"} and |
1493 `overlaps' with @{text z}, splitting it into two components @{text "z\<^isub>a"} and |
1479 @{text "z\<^isub>b"}. For this we know that @{text "x\<^isub>s @ z\<^isub>a \<in> A"} and |
1494 @{text "z\<^isub>b"}. For this we know that @{text "x\<^isub>s @ z\<^isub>a \<in> A"} and |
1480 @{term "z\<^isub>b \<in> A\<star>"}. To cut a story short, we have divided @{term "x @ z \<in> A\<star>"} |
1495 @{term "z\<^isub>b \<in> A\<star>"}. To cut a story short, we have divided @{term "x @ z \<in> A\<star>"} |
1495 @{term "tag_Star A"} is a subset of this set, and therefore finite. |
1510 @{term "tag_Star A"} is a subset of this set, and therefore finite. |
1496 Again we have to show under the assumption @{term "x"}~@{term "=(tag_Star A)="}~@{term y} |
1511 Again we have to show under the assumption @{term "x"}~@{term "=(tag_Star A)="}~@{term y} |
1497 that @{term "x @ z \<in> A\<star>"} implies @{term "y @ z \<in> A\<star>"}. |
1512 that @{term "x @ z \<in> A\<star>"} implies @{term "y @ z \<in> A\<star>"}. |
1498 |
1513 |
1499 We first need to consider the case that @{text x} is the empty string. |
1514 We first need to consider the case that @{text x} is the empty string. |
1500 From the assumption we can infer @{text y} is the empty string and |
1515 From the assumption about strict prefixes in @{text "\<^raw:$\threesim$>\<^bsub>\<star>tag A\<^esub>"}, we |
1501 clearly have @{term "y @ z \<in> A\<star>"}. In case @{text x} is not the empty |
1516 can infer @{text y} is the empty string and |
|
1517 then clearly have @{term "y @ z \<in> A\<star>"}. In case @{text x} is not the empty |
1502 string, we can divide the string @{text "x @ z"} as shown in the picture |
1518 string, we can divide the string @{text "x @ z"} as shown in the picture |
1503 above. By the tagging-function and the facts @{text "x\<^bsub>pmax\<^esub> \<in> A\<^isup>\<star>"} and @{text "x\<^bsub>pmax\<^esub> < x"}, |
1519 above. By the tagging-function and the facts @{text "x\<^bsub>pmax\<^esub> \<in> A\<^isup>\<star>"} and @{text "x\<^bsub>pmax\<^esub> < x"}, |
1504 we have |
1520 we have |
1505 |
1521 |
1506 \begin{center} |
1522 \begin{center} |
1513 \begin{center} |
1529 \begin{center} |
1514 @{text "\<lbrakk>x\<^isub>s\<rbrakk>\<^bsub>\<approx>A\<^esub> \<in> {\<lbrakk>y\<^isub>s\<rbrakk>\<^bsub>\<approx>A\<^esub> | y\<^bsub>p\<^esub> < y \<and> y\<^bsub>p\<^esub> \<in> A\<^isup>\<star> \<and> (y\<^bsub>p\<^esub>, y\<^isub>s) \<in> Partitions y}"} |
1530 @{text "\<lbrakk>x\<^isub>s\<rbrakk>\<^bsub>\<approx>A\<^esub> \<in> {\<lbrakk>y\<^isub>s\<rbrakk>\<^bsub>\<approx>A\<^esub> | y\<^bsub>p\<^esub> < y \<and> y\<^bsub>p\<^esub> \<in> A\<^isup>\<star> \<and> (y\<^bsub>p\<^esub>, y\<^isub>s) \<in> Partitions y}"} |
1515 \end{center} |
1531 \end{center} |
1516 |
1532 |
1517 \noindent |
1533 \noindent |
1518 and we know there exist partitions @{text "y\<^isub>p"} and @{text |
1534 From this we know there exist partitions @{text "y\<^isub>p"} and @{text |
1519 "y\<^isub>s"} with @{term "y\<^isub>p \<in> A\<star>"} and also @{term "x\<^isub>s \<approx>A |
1535 "y\<^isub>s"} with @{term "y\<^isub>p \<in> A\<star>"} and also @{term "x\<^isub>s \<approx>A |
1520 y\<^isub>s"}. Unfolding the Myhill-Nerode relation we know @{term |
1536 y\<^isub>s"}. Unfolding the Myhill-Nerode relation we know @{term |
1521 "y\<^isub>s @ z\<^isub>a \<in> A"}. We also know that @{term "z\<^isub>b \<in> A\<star>"}. |
1537 "y\<^isub>s @ z\<^isub>a \<in> A"}. We also know that @{term "z\<^isub>b \<in> A\<star>"}. |
1522 Therefore @{term "y\<^isub>p @ (y\<^isub>s @ z\<^isub>a) @ z\<^isub>b \<in> |
1538 Therefore @{term "y\<^isub>p @ (y\<^isub>s @ z\<^isub>a) @ z\<^isub>b \<in> |
1523 A\<star>"}, which means @{term "y @ z \<in> A\<star>"}. As the last step we have to set |
1539 A\<star>"}, which means @{term "y @ z \<in> A\<star>"}. As the last step we have to set |
1524 @{text "A"} to @{term "lang r"} and complete the proof. |
1540 @{text "A"} to @{term "lang r"} and thus complete the proof. |
1525 \end{proof} |
1541 \end{proof} |
1526 *} |
1542 *} |
1527 |
1543 |
1528 section {* Second Part based on Partial Derivatives *} |
1544 section {* Second Part proved using Partial Derivatives *} |
1529 |
1545 |
1530 text {* |
1546 text {* |
1531 \noindent |
1547 \noindent |
1532 As we have seen in the previous section, in order to establish |
1548 As we have seen in the previous section, in order to establish |
1533 the second direction of the Myhill-Nerode theorem, we need to find |
1549 the second direction of the Myhill-Nerode theorem, we need to find |
1534 a more refined relation than @{term "\<approx>(lang r)"} for which we can |
1550 a more refined relation than @{term "\<approx>(lang r)"} for which we can |
1535 show that there are only finitely many equivalence classes. So far we |
1551 show that there are only finitely many equivalence classes. So far we |
1536 showed this by induction on @{text "r"}. However, there is also |
1552 showed this by induction on @{text "r"}. However, there is also |
1537 an indirect method to come up with such a refined relation. Assume |
1553 an indirect method to come up with such a refined relation based on |
1538 the following two definitions for a left-quotient of a language, which |
1554 derivatives of regular expressions \cite{Brzozowski64}. |
1539 we write as @{term "Der c A"} and @{term "Ders s A"} where |
1555 |
1540 @{text c} is a character and @{text s} a string: |
1556 Assume the following two definitions for a \emph{left-quotient} of a language, |
|
1557 which we write as @{term "Der c A"} and @{term "Ders s A"} where @{text c} |
|
1558 is a character and @{text s} a string: |
1541 |
1559 |
1542 \begin{center} |
1560 \begin{center} |
1543 \begin{tabular}{r@ {\hspace{1mm}}c@ {\hspace{2mm}}l} |
1561 \begin{tabular}{r@ {\hspace{1mm}}c@ {\hspace{2mm}}l} |
1544 @{thm (lhs) Der_def} & @{text "\<equiv>"} & @{thm (rhs) Der_def}\\ |
1562 @{thm (lhs) Der_def} & @{text "\<equiv>"} & @{thm (rhs) Der_def}\\ |
1545 @{thm (lhs) Ders_def} & @{text "\<equiv>"} & @{thm (rhs) Ders_def}\\ |
1563 @{thm (lhs) Ders_def} & @{text "\<equiv>"} & @{thm (rhs) Ders_def}\\ |
1546 \end{tabular} |
1564 \end{tabular} |
1547 \end{center} |
1565 \end{center} |
1548 |
1566 |
1549 \noindent |
1567 \noindent |
|
1568 In order to aid readability, we shall also make use of the following abbreviation: |
|
1569 |
|
1570 \begin{center} |
|
1571 @{abbrev "Derss s A"} |
|
1572 \end{center} |
|
1573 |
|
1574 |
|
1575 \noindent |
1550 Clearly we have the following relation between the Myhill-Nerode relation |
1576 Clearly we have the following relation between the Myhill-Nerode relation |
1551 (Def.~\ref{myhillneroderel}) and left-quotients |
1577 (Def.~\ref{myhillneroderel}) and left-quotients |
1552 |
1578 |
1553 \begin{equation}\label{mhders} |
1579 \begin{equation}\label{mhders} |
1554 @{term "x \<approx>A y"} \hspace{4mm}\text{if and only if}\hspace{4mm} @{term "Ders x A = Ders y A"} |
1580 @{term "x \<approx>A y"} \hspace{4mm}\text{if and only if}\hspace{4mm} @{term "Ders x A = Ders y A"} |
1555 \end{equation} |
1581 \end{equation} |
1556 |
1582 |
1557 \noindent |
1583 \noindent |
1558 It is realtively easy to establish the following properties for left-quotients: |
1584 It is straightforward to establish the following properties for left-quotients: |
1559 |
1585 |
1560 \begin{equation} |
1586 \begin{equation} |
1561 \mbox{\begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{2mm}}l} |
1587 \mbox{\begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{2mm}}l} |
1562 @{thm (lhs) Der_zero} & $=$ & @{thm (rhs) Der_zero}\\ |
1588 @{thm (lhs) Der_simps(1)} & $=$ & @{thm (rhs) Der_simps(1)}\\ |
1563 @{thm (lhs) Der_one} & $=$ & @{thm (rhs) Der_one}\\ |
1589 @{thm (lhs) Der_simps(2)} & $=$ & @{thm (rhs) Der_simps(2)}\\ |
1564 @{thm (lhs) Der_atom} & $=$ & @{thm (rhs) Der_atom}\\ |
1590 @{thm (lhs) Der_simps(3)} & $=$ & @{thm (rhs) Der_simps(3)}\\ |
1565 @{thm (lhs) Der_union} & $=$ & @{thm (rhs) Der_union}\\ |
1591 @{thm (lhs) Der_simps(4)} & $=$ & @{thm (rhs) Der_simps(4)}\\ |
1566 @{thm (lhs) Der_conc} & $=$ & @{thm (rhs) Der_conc}\\ |
1592 @{thm (lhs) Der_conc} & $=$ & @{thm (rhs) Der_conc}\\ |
1567 @{thm (lhs) Der_star} & $=$ & @{thm (rhs) Der_star}\\ |
1593 @{thm (lhs) Der_star} & $=$ & @{thm (rhs) Der_star}\\ |
1568 @{thm (lhs) Ders_nil} & $=$ & @{thm (rhs) Ders_nil}\\ |
1594 @{thm (lhs) Ders_simps(1)} & $=$ & @{thm (rhs) Ders_simps(1)}\\ |
1569 @{thm (lhs) Ders_cons} & $=$ & @{thm (rhs) Ders_cons}\\ |
1595 @{thm (lhs) Ders_simps(2)} & $=$ & @{thm (rhs) Ders_simps(2)}\\ |
1570 %@{thm (lhs) Ders_append[where ?s1.0="s\<^isub>1" and ?s2.0="s\<^isub>2"]} & $=$ |
1596 %@{thm (lhs) Ders_simps(3)[where ?s1.0="s\<^isub>1" and ?s2.0="s\<^isub>2"]} & $=$ |
1571 % & @{thm (rhs) Ders_append[where ?s1.0="s\<^isub>1" and ?s2.0="s\<^isub>2"]}\\ |
1597 % & @{thm (rhs) Ders_simps(3)[where ?s1.0="s\<^isub>1" and ?s2.0="s\<^isub>2"]}\\ |
1572 \end{tabular}} |
1598 \end{tabular}} |
1573 \end{equation} |
1599 \end{equation} |
1574 |
1600 |
1575 \noindent |
1601 \noindent |
1576 where @{text "\<Delta>"} is a function that tests whether the empty string |
1602 where @{text "\<Delta>"} is a function that tests whether the empty string |
1637 \end{tabular}} |
1663 \end{tabular}} |
1638 \end{equation} |
1664 \end{equation} |
1639 |
1665 |
1640 \noindent |
1666 \noindent |
1641 The importance in the context of the Myhill-Nerode theorem is that |
1667 The importance in the context of the Myhill-Nerode theorem is that |
1642 we can use \eqref{mhders} and the equations above in order to |
1668 we can use \eqref{mhders} and \eqref{Dersders} in order to |
1643 establish that @{term "x \<approx>(lang r) y"} if and only if |
1669 establish that @{term "x \<approx>(lang r) y"} is equivalent to |
1644 @{term "lang (ders x r) = lang (ders y r)"}. From this we obtain |
1670 @{term "lang (ders x r) = lang (ders y r)"}. From this we obtain |
1645 |
1671 |
1646 \begin{equation} |
1672 \begin{equation} |
1647 @{term "x \<approx>(lang r) y"}\hspace{4mm}\mbox{provided}\hspace{4mm}@{term "ders x r = ders y r"} |
1673 @{term "x \<approx>(lang r) y"}\hspace{4mm}\mbox{provided}\hspace{4mm}@{term "ders x r = ders y r"} |
1648 \end{equation} |
1674 \end{equation} |
1649 |
1675 |
1650 |
1676 |
1651 \noindent |
1677 \noindent |
1652 Consequently, we can use as the tagging relation @{text |
1678 which means the right-hand side (seen as relation) refines the |
1653 "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"}, since it refines |
1679 Myhill-Nerode relation. Consequently, we can use |
1654 @{term "\<approx>(lang r)"}. However, in order to be useful in the Myhill-Nerode |
1680 @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"} as a potential tagging-relation |
1655 theorem, we have to show that for a language there are only finitely many |
1681 for the regular expression @{text r}. However, in |
1656 derivatives. Unfortunately, this is not true in general: Sakarovitch gives |
1682 order to be useful in the Myhill-Nerode theorem, we also have to show that |
1657 an example with infinitely many derivatives |
1683 for the corresponding language there are only finitely many derivatives---ensuring |
1658 \cite[Page~141]{Sakarovitch09}. What Brzozowski \cite{Brzozowski64} proved is |
1684 that there are only finitely many equivalence classes. Unfortunately, this |
1659 that for every language there \emph{are} finitely `dissimilar' derivatives for a |
1685 is not true in general. Sakarovitch gives an example where a regular |
1660 regular expression. Two regular expressions are said to be \emph{similar} |
1686 expression has infinitely many derivatives w.r.t.~a language |
1661 provided they can be identified using the using the @{text "ACI"}-identities: |
1687 \cite[Page~141]{Sakarovitch09}. What Brzozowski \cite{Brzozowski64} proved |
1662 |
1688 is that for every language there \emph{are} only finitely `dissimilar' |
1663 \begin{center} |
1689 derivatives for a regular expression. Two regular expressions are said to be |
1664 \begin{tabular}{cl} |
1690 \emph{similar} provided they can be identified using the using the @{text |
|
1691 "ACI"}-identities: |
|
1692 |
|
1693 \begin{equation}\label{ACI} |
|
1694 \mbox{\begin{tabular}{cl} |
1665 (@{text A}) & @{term "Plus (Plus r\<^isub>1 r\<^isub>2) r\<^isub>3"} $\equiv$ @{term "Plus r\<^isub>1 (Plus r\<^isub>2 r\<^isub>3)"}\\ |
1695 (@{text A}) & @{term "Plus (Plus r\<^isub>1 r\<^isub>2) r\<^isub>3"} $\equiv$ @{term "Plus r\<^isub>1 (Plus r\<^isub>2 r\<^isub>3)"}\\ |
1666 (@{text C}) & @{term "Plus r\<^isub>1 r\<^isub>2"} $\equiv$ @{term "Plus r\<^isub>2 r\<^isub>1"}\\ |
1696 (@{text C}) & @{term "Plus r\<^isub>1 r\<^isub>2"} $\equiv$ @{term "Plus r\<^isub>2 r\<^isub>1"}\\ |
1667 (@{text I}) & @{term "Plus r r"} $\equiv$ @{term "r"}\\ |
1697 (@{text I}) & @{term "Plus r r"} $\equiv$ @{term "r"}\\ |
1668 \end{tabular} |
1698 \end{tabular}} |
1669 \end{center} |
1699 \end{equation} |
1670 |
1700 |
1671 \noindent |
1701 \noindent |
1672 Carrying this idea through menas we now have to reasoning modulo @{text "ACI"}. |
1702 Carrying this idea through, we must not consider the set of all derivatives, |
1673 This can be done, but it is very painful in a theorem prover (since there is |
1703 but the ones modulo @{text "ACI"}. In principle, this can be formally |
1674 no direct characterisation of the set of dissimlar derivatives). |
1704 defined, but it is very painful in a theorem prover (since there is no |
|
1705 direct characterisation of the set of dissimlar derivatives). |
|
1706 |
1675 |
1707 |
1676 Fortunately, there is a much simpler approach using \emph{partial |
1708 Fortunately, there is a much simpler approach using \emph{partial |
1677 derivatives}. They were introduced by Antimirov \cite{Antimirov95} and can be defined |
1709 derivatives}. They were introduced by Antimirov \cite{Antimirov95} and can be defined |
1678 in Isabelle/HOL as follows: |
1710 in Isabelle/HOL as follows: |
1679 |
1711 |
1684 @{thm (lhs) pder.simps(3)[where c'="d"]} & @{text "\<equiv>"} & @{thm (rhs) pder.simps(3)[where c'="d"]}\\ |
1716 @{thm (lhs) pder.simps(3)[where c'="d"]} & @{text "\<equiv>"} & @{thm (rhs) pder.simps(3)[where c'="d"]}\\ |
1685 @{thm (lhs) pder.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]} |
1717 @{thm (lhs) pder.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]} |
1686 & @{text "\<equiv>"} & @{thm (rhs) pder.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\ |
1718 & @{text "\<equiv>"} & @{thm (rhs) pder.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\ |
1687 @{thm (lhs) pder.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]} |
1719 @{thm (lhs) pder.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]} |
1688 & @{text "\<equiv>"} & @{text "if"}~@{term "nullable r\<^isub>1"}~@{text "then"}~% |
1720 & @{text "\<equiv>"} & @{text "if"}~@{term "nullable r\<^isub>1"}~@{text "then"}~% |
1689 @{term "(Times_set (pder c r\<^isub>1) r\<^isub>2) \<union> (pder c r\<^isub>2)"}\\ |
1721 @{term "(Timess (pder c r\<^isub>1) r\<^isub>2) \<union> (pder c r\<^isub>2)"}\\ |
1690 & & \phantom{@{text "if"}~@{term "nullable r\<^isub>1"}~}@{text "else"}~% |
1722 & & \phantom{@{text "if"}~@{term "nullable r\<^isub>1"}~}@{text "else"}~% |
1691 @{term "Times_set (pder c r\<^isub>1) r\<^isub>2"}\\ |
1723 @{term "Timess (pder c r\<^isub>1) r\<^isub>2"}\\ |
1692 @{thm (lhs) pder.simps(6)} & @{text "\<equiv>"} & @{thm (rhs) pder.simps(6)}\smallskip\\ |
1724 @{thm (lhs) pder.simps(6)} & @{text "\<equiv>"} & @{thm (rhs) pder.simps(6)}\smallskip\\ |
1693 @{thm (lhs) pders.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) pders.simps(1)}\\ |
1725 @{thm (lhs) pders.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) pders.simps(1)}\\ |
|
1726 @{thm (lhs) pders.simps(2)} & @{text "\<equiv>"} & @{text "\<Union> (pders s) ` (pder c r)"}\\ |
|
1727 \end{tabular} |
|
1728 \end{center} |
|
1729 |
|
1730 \noindent |
|
1731 Again the last two clauses extend partial derivatives from characters to strings. |
|
1732 Unlike `simple' derivatives, the functions for partial derivatives return sets of regular |
|
1733 expressions. In the @{const Times} and @{const Star} cases we therefore use the |
|
1734 auxiliary definition |
|
1735 |
|
1736 \begin{center} |
|
1737 @{text "TIMESS rs r \<equiv> {TIMES r' r | r' \<in> rs}"} |
|
1738 \end{center} |
|
1739 |
|
1740 \noindent |
|
1741 in order to `sequence' a regular expression with a set of regular |
|
1742 expressions. Note that in the last clause we first build the set of partial |
|
1743 derivatives w.r.t~the character @{text c}, then build the image of this set under the |
|
1744 function @{term "pders s"} and finally `union up' all resulting sets. It will be |
|
1745 convenient to introduce the following abbreviation |
|
1746 |
|
1747 \begin{center} |
|
1748 @{abbrev "pderss s A"} |
|
1749 \end{center} |
|
1750 |
|
1751 \noindent |
|
1752 which simplifies the last clause of @{const "pders"} to |
|
1753 |
|
1754 \begin{center} |
|
1755 \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}} |
1694 @{thm (lhs) pders.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) pders.simps(2)}\\ |
1756 @{thm (lhs) pders.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) pders.simps(2)}\\ |
1695 \end{tabular} |
1757 \end{tabular} |
1696 \end{center} |
1758 \end{center} |
1697 |
1759 |
1698 \noindent |
1760 Partial derivatives can be seen as having the @{text "ACI"}-identities already built in: |
1699 Unlike `simple' derivatives, these functions return a set of regular |
1761 taking the partial derivatives of the |
1700 expressions. In the @{const Times} and @{const Star} cases we use |
1762 regular expressions in \eqref{ACI} gives us in each case |
1701 |
1763 equal sets. Antimirov \cite{Antimirov95} showed a similar result to |
1702 \begin{center} |
1764 \eqref{Dersders} for partial derivatives: |
1703 @{text "TIMESS rs r \<equiv> {TIMES r' r | r' \<in> rs}"} |
|
1704 \end{center} |
|
1705 |
|
1706 \noindent |
|
1707 in order to `sequence' a regular expressions with respect to a set of regular |
|
1708 expresions. Note that in the last case we first build the set of partial derivatives |
|
1709 w.r.t~@{text c}, then build the image of this set under the function @{term "pders s"} |
|
1710 and finally `union up' all resulting sets. Note also that in some sense, partial |
|
1711 derivatives have the @{text "ACI"}-identities already build in. Antimirov |
|
1712 \cite{Antimirov95} |
|
1713 showed |
|
1714 |
1765 |
1715 \begin{equation} |
1766 \begin{equation} |
1716 \mbox{\begin{tabular}{c} |
1767 \mbox{\begin{tabular}{lc} |
1717 @{thm Der_pder}\\ |
1768 @{text "(i)"} & @{thm Der_pder}\\ |
1718 @{thm Ders_pders} |
1769 @{text "(ii)"} & @{thm Ders_pders} |
1719 \end{tabular}} |
1770 \end{tabular}} |
1720 \end{equation} |
1771 \end{equation} |
1721 |
1772 |
1722 \noindent |
1773 \begin{proof} |
1723 and proved that for every language and regular expression there are only finitely |
1774 The first fact is by a simple induction on @{text r}. For the second we slightly |
|
1775 modify Antimirov's proof by performing an induction on @{text s} where we |
|
1776 genaralise over all @{text r}. That means in the @{text "cons"}-case the |
|
1777 induction hypothesis is |
|
1778 |
|
1779 \begin{center} |
|
1780 @{text "(IH)"}\hspace{3mm}@{term "\<forall>r. Ders s (lang r) = \<Union> lang ` (pders s r)"} |
|
1781 \end{center} |
|
1782 |
|
1783 \noindent |
|
1784 With this we can establish |
|
1785 |
|
1786 \begin{center} |
|
1787 \begin{tabular}{r@ {\hspace{1.5mm}}c@ {\hspace{1.5mm}}ll} |
|
1788 @{term "Ders (c # s) (lang r)"} |
|
1789 & @{text "="} & @{term "Ders s (Der c (lang r))"} & by def.\\ |
|
1790 & @{text "="} & @{term "Ders s (\<Union> lang ` (pder c r))"} & by @{text "(i)"}\\ |
|
1791 & @{text "="} & @{term "\<Union> (Ders s) ` (lang ` (pder c r))"} & by def.~of @{text "Ders"}\\ |
|
1792 & @{text "="} & @{term "\<Union> lang ` (\<Union> pders s ` (pder c r))"} & by IH\\ |
|
1793 & @{text "="} & @{term "\<Union> lang ` (pders (c # s) r)"} & by def.\\ |
|
1794 \end{tabular} |
|
1795 \end{center} |
|
1796 |
|
1797 \noindent |
|
1798 In order to apply the induction hypothesis in the fourth step, we need the generalisation |
|
1799 over all regular expressions @{text r}. The case for the empty string is routine and omitted. |
|
1800 \end{proof} |
|
1801 |
|
1802 Antimirov also proved that for every language and regular expression there are only finitely |
1724 many partial derivatives. |
1803 many partial derivatives. |
1725 *} |
1804 *} |
1726 |
1805 |
1727 section {* Closure Properties *} |
1806 section {* Closure Properties *} |
1728 |
1807 |
1729 text {* |
1808 text {* |
1730 The beautiful characteristics of regular languages is that they are closed |
1809 \noindent |
1731 under many operations. Closure under union, sequencencing and Kleene-star |
1810 The real beauty of regular languages is that they are closed |
|
1811 under almost all set operations. Closure under union, concatenation and Kleene-star |
1732 are trivial to establish given our definition of regularity (Def.~\ref{regular}). |
1812 are trivial to establish given our definition of regularity (Def.~\ref{regular}). |
1733 More interesting is the closure under complement, because |
1813 More interesting is the closure under complement, because |
1734 it seems difficult to construct a regular expression for the complement |
1814 it seems difficult to construct a regular expression for the complement |
1735 language by direct means. However the existence of such a regular expression |
1815 language by direct means. However the existence of such a regular expression |
1736 can now be easily proved using the Myhill-Nerode theorem since |
1816 can now be easily proved using the Myhill-Nerode theorem since |