32 abbreviation "ZERO \<equiv> Zero" |
32 abbreviation "ZERO \<equiv> Zero" |
33 abbreviation "ONE \<equiv> One" |
33 abbreviation "ONE \<equiv> One" |
34 abbreviation "ATOM \<equiv> Atom" |
34 abbreviation "ATOM \<equiv> Atom" |
35 abbreviation "PLUS \<equiv> Plus" |
35 abbreviation "PLUS \<equiv> Plus" |
36 abbreviation "TIMES \<equiv> Times" |
36 abbreviation "TIMES \<equiv> Times" |
|
37 abbreviation "TIMESS \<equiv> Times_set" |
37 abbreviation "STAR \<equiv> Star" |
38 abbreviation "STAR \<equiv> Star" |
38 |
39 |
39 |
40 |
40 notation (latex output) |
41 notation (latex output) |
41 str_eq_rel ("\<approx>\<^bsub>_\<^esub>") and |
42 str_eq_rel ("\<approx>\<^bsub>_\<^esub>") and |
128 text {* |
129 text {* |
129 \noindent |
130 \noindent |
130 Regular languages are an important and well-understood subject in Computer |
131 Regular languages are an important and well-understood subject in Computer |
131 Science, with many beautiful theorems and many useful algorithms. There is a |
132 Science, with many beautiful theorems and many useful algorithms. There is a |
132 wide range of textbooks on this subject, many of which are aimed at students |
133 wide range of textbooks on this subject, many of which are aimed at students |
133 and contain very detailed `pencil-and-paper' proofs |
134 and contain very detailed `pencil-and-paper' proofs (e.g.~\cite{Kozen97, |
134 (e.g.~\cite{Kozen97, HopcroftUllman69}). It seems natural to exercise theorem provers by |
135 HopcroftUllman69}). It seems natural to exercise theorem provers by |
135 formalising the theorems and by verifying formally the algorithms. A |
136 formalising the theorems and by verifying formally the algorithms. A |
136 popular choice for a theorem prover would be one based on Higher-Order Logic |
137 popular choice for a theorem prover would be one based on Higher-Order Logic |
137 (HOL), for example HOL4, HOLlight and Isabelle/HOL. For our development |
138 (HOL), for example HOL4, HOLlight or Isabelle/HOL. For the development |
138 we will use the latter. One distinguishing feature of HOL is it's |
139 presented in this paper we will use the latter. HOL is a predicate calculus |
139 type system, which is based on Church's Simple Theory of Types \cite{Church40}. The |
140 that allows quantification over predicate variables. Its type system is |
140 limitations of this type system are one of the underlying motivations for the |
141 based on Church's Simple Theory of Types \cite{Church40}. Although |
141 work presented in this paper. |
142 many mathematical concepts can be conveniently expressed in HOL, there are some |
142 |
143 limitations that hurt badly, if one attempts a simple-minded formalisation |
143 The typical approach to regular languages is to |
144 of regular languages in it. |
144 introduce finite automata and then define everything in terms of them. For |
145 |
145 example, a regular language is normally defined as one whose strings are |
146 The typical approach to regular languages is to introduce finite automata |
146 recognised by a finite deterministic automaton. This approach has many |
147 and then define everything in terms of them \cite{Kozen97}. For example, |
147 benefits. Among them is the fact that it is easy to convince oneself that |
148 a regular language is normally defined as: |
148 regular languages are closed under complementation: one just has to exchange |
149 |
149 the accepting and non-accepting states in the corresponding automaton to |
150 \begin{dfntn}\label{baddef} |
150 obtain an automaton for the complement language. The problem, however, lies |
151 A language @{text A} is \emph{regular}, provided there is a |
151 with formalising such reasoning in a HOL-based theorem prover. Automata are |
152 finite deterministic automaton that recognises all strings of @{text "A"}. |
152 built up from states and transitions that need to be represented as graphs, |
153 \end{dfntn} |
153 matrices or functions, none of which can be defined as an inductive |
154 |
154 datatype. |
155 \noindent |
|
156 This approach has many benefits. Among them is the fact that it is easy to |
|
157 convince oneself that regular languages are closed under complementation: |
|
158 one just has to exchange the accepting and non-accepting states in the |
|
159 corresponding automaton to obtain an automaton for the complement language. |
|
160 The problem, however, lies with formalising such reasoning in a HOL-based |
|
161 theorem prover. Automata are built up from states and transitions that need |
|
162 to be represented as graphs, matrices or functions, none of which can be |
|
163 defined as an inductive datatype. |
155 |
164 |
156 In case of graphs and matrices, this means we have to build our own |
165 In case of graphs and matrices, this means we have to build our own |
157 reasoning infrastructure for them, as neither Isabelle/HOL nor HOL4 nor |
166 reasoning infrastructure for them, as neither Isabelle/HOL nor HOL4 nor |
158 HOLlight support them with libraries. Even worse, reasoning about graphs and |
167 HOLlight support them with libraries. Even worse, reasoning about graphs and |
159 matrices can be a real hassle in HOL-based theorem provers, because |
168 matrices can be a real hassle in HOL-based theorem provers, because |
229 \noindent |
238 \noindent |
230 changes the type---the disjoint union is not a set, but a set of |
239 changes the type---the disjoint union is not a set, but a set of |
231 pairs. Using this definition for disjoint union means we do not have a |
240 pairs. Using this definition for disjoint union means we do not have a |
232 single type for automata. As a result we will not be able to define a regular |
241 single type for automata. As a result we will not be able to define a regular |
233 language as one for which there exists an automaton that recognises all its |
242 language as one for which there exists an automaton that recognises all its |
234 strings, since there is no type quantification available in HOL (unlike in Coq, for |
243 strings. This is because we cannot make a definition in HOL that is polymorphic in |
235 example). |
244 the state type and also there is no type quantification available in HOL (unlike |
|
245 in Coq, for example). |
236 |
246 |
237 An alternative, which provides us with a single type for automata, is to give every |
247 An alternative, which provides us with a single type for automata, is to give every |
238 state node an identity, for example a natural |
248 state node an identity, for example a natural |
239 number, and then be careful to rename these identities apart whenever |
249 number, and then be careful to rename these identities apart whenever |
240 connecting two automata. This results in clunky proofs |
250 connecting two automata. This results in clunky proofs |
283 lexing. Berghofer and Reiter \cite{BerghoferReiter09} formalise automata |
293 lexing. Berghofer and Reiter \cite{BerghoferReiter09} formalise automata |
284 working over bit strings in the context of Presburger arithmetic. The only |
294 working over bit strings in the context of Presburger arithmetic. The only |
285 larger formalisations of automata theory are carried out in Nuprl |
295 larger formalisations of automata theory are carried out in Nuprl |
286 \cite{Constable00} and in Coq \cite{Filliatre97}. |
296 \cite{Constable00} and in Coq \cite{Filliatre97}. |
287 |
297 |
288 One might also consider the Myhill-Nerode theorem as well-worn stock |
298 Also one might consider automata theory as well-worn stock material where |
289 material where everything is clear. However, paper proofs of this theorem |
299 everything is crystal clear. However, paper proofs about automata often |
290 often involve subtle side-conditions which are easily overlooked, but which |
300 involve subtle side-conditions which are easily overlooked, but which make |
291 make formal reasoning rather painful. For example Kozen's proof requires |
301 formal reasoning rather painful. For example Kozen's proof of the |
292 that the automata do not have inaccessible states \cite{Kozen97}. Another |
302 Myhill-Nerode theorem requires that the automata do not have inaccessible |
293 subtle side-condition is completeness of automata: |
303 states \cite{Kozen97}. Another subtle side-condition is completeness of |
294 automata need to have total transition functions and at most one `sink' |
304 automata: automata need to have total transition functions and at most one |
295 state from which there is no connection to a final state (Brozowski mentions |
305 `sink' state from which there is no connection to a final state (Brozowski |
296 this side-condition in connection with state complexity |
306 mentions this side-condition in connection with state complexity |
297 \cite{Brozowski10}). Such side-conditions mean that if we define a regular |
307 \cite{Brozowski10}). Such side-conditions mean that if we define a regular |
298 language as one for which there exists \emph{any} finite automaton, then we |
308 language as one for which there exists \emph{a} finite automaton that |
299 need a lemma which ensures that another equivalent can be found satisfying the |
309 recognises all its strings (Def.~\ref{baddef}), then we need a lemma which |
300 side-condition. Unfortunately, such `little' and `obvious' lemmas make |
310 ensures that another equivalent can be found satisfying the |
301 formalisations of results in automata theory hair-pulling experiences. |
311 side-condition. Unfortunately, such `little' and `obvious' lemmas make |
302 |
312 formalisations of automata theory hair-pulling experiences. |
303 |
313 |
|
314 |
304 In this paper, we will not attempt to formalise automata theory in |
315 In this paper, we will not attempt to formalise automata theory in |
305 Isabelle/HOL nor will we attempt to formalise automata proofs from the |
316 Isabelle/HOL nor will we attempt to formalise automata proofs from the |
306 literature, but take a different approach to regular languages than is |
317 literature, but take a different approach to regular languages than is |
307 usually taken. Instead of defining a regular language as one where there |
318 usually taken. Instead of defining a regular language as one where there |
308 exists an automaton that recognises all strings of the language, we define a |
319 exists an automaton that recognises all strings of the language, we define a |
313 strings of @{text "A"}. |
324 strings of @{text "A"}. |
314 \end{dfntn} |
325 \end{dfntn} |
315 |
326 |
316 \noindent |
327 \noindent |
317 The reason is that regular expressions, unlike graphs, matrices and |
328 The reason is that regular expressions, unlike graphs, matrices and |
318 functions, can be easily defined as an inductive datatype. No side-conditions |
329 functions, can be easily defined as an inductive datatype. A reasoning |
319 will be needed for regular expressions. Moreover, a reasoning infrastructure |
330 infrastructure (like induction and recursion) comes then for free in |
320 (like induction and recursion) comes for free in HOL-based theorem provers. |
331 HOL. Moreover, no side-conditions will be needed for regular expressions, |
321 This has recently been exploited in HOL4 with a formalisation of |
332 like we usually need for automata. This convenience of regular expressions has |
322 regular expression matching based on derivatives \cite{OwensSlind08} and |
333 recently been exploited in HOL4 with a formalisation of regular expression |
323 with an equivalence checker for regular expressions in Isabelle/HOL |
334 matching based on derivatives \cite{OwensSlind08} and with an equivalence |
324 \cite{KraussNipkow11}. The main purpose of this paper is to show that a central |
335 checker for regular expressions in Isabelle/HOL \cite{KraussNipkow11}. The |
325 result about regular languages---the Myhill-Nerode theorem---can be |
336 main purpose of this paper is to show that a central result about regular |
326 recreated by only using regular expressions. This theorem gives necessary |
337 languages---the Myhill-Nerode theorem---can be recreated by only using |
327 and sufficient conditions for when a language is regular. As a corollary of |
338 regular expressions. This theorem gives necessary and sufficient conditions |
328 this theorem we can easily establish the usual closure properties, including |
339 for when a language is regular. As a corollary of this theorem we can easily |
329 complementation, for regular languages.\medskip |
340 establish the usual closure properties, including complementation, for |
|
341 regular languages.\medskip |
330 |
342 |
331 \noindent |
343 \noindent |
332 {\bf Contributions:} There is an extensive literature on regular |
344 {\bf Contributions:} There is an extensive literature on regular languages. |
333 languages. To our best knowledge, our proof of the Myhill-Nerode theorem is |
345 To our best knowledge, our proof of the Myhill-Nerode theorem is the first |
334 the first that is based on regular expressions, only. The part of this |
346 that is based on regular expressions, only. The part of this theorem stating |
335 theorem stating that finitely many partitions imply regularity of the |
347 that finitely many partitions imply regularity of the language is proved by |
336 language is proved by an argument about solving equational sytems. This |
348 an argument about solving equational sytems. This argument appears to be |
337 argument appears to be folklore. For the other part, we give two proofs: one |
349 folklore. For the other part, we give two proofs: one direct proof using |
338 direct proof using certain tagging-functions, and another indirect proof |
350 certain tagging-functions, and another indirect proof using Antimirov's |
339 using Antimirov's partial derivatives \cite{Antimirov95}. Again to our best |
351 partial derivatives \cite{Antimirov95}. Again to our best knowledge, the |
340 knowledge, the tagging-functions have not been used before to establish the |
352 tagging-functions have not been used before to establish the Myhill-Nerode |
341 Myhill-Nerode theorem. Derivatives of regular expressions have been used |
353 theorem. Derivatives of regular expressions have been used widely in the |
342 extensively in the literature, unlike partial derivatives. However, partial |
354 literature about regular expressions. However, partial derivatives are more |
343 derivatives are more suitable in the context of the Myhill-Nerode theorem, |
355 suitable in the context of the Myhill-Nerode theorem, since it is easier to |
344 since it is easier to establish formally their finiteness result. |
356 establish formally their finiteness result. |
|
357 |
345 *} |
358 *} |
346 |
359 |
347 section {* Preliminaries *} |
360 section {* Preliminaries *} |
348 |
361 |
349 text {* |
362 text {* |
1542 \noindent |
1555 \noindent |
1543 Without the indentification, we unfortunately obtain infinitely many |
1556 Without the indentification, we unfortunately obtain infinitely many |
1544 derivations (an example is given in \cite[Page~141]{Sakarovitch09}). |
1557 derivations (an example is given in \cite[Page~141]{Sakarovitch09}). |
1545 Reasoning modulo ACI can be done, but it is very painful in a theorem prover. |
1558 Reasoning modulo ACI can be done, but it is very painful in a theorem prover. |
1546 |
1559 |
1547 |
1560 Fortunately, there is a much simpler approach using partial |
1548 in order to prove the second |
1561 derivatives introduced by Antimirov \cite{Antimirov95}. |
1549 direction of the Myhill-Nerode theorem. There he calculates the |
1562 |
1550 derivatives for regular expressions and shows that for every |
1563 \begin{center} |
1551 language there can be only finitely many of them %derivations (if |
1564 \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}} |
1552 regarded equal modulo ACI). We could have used as tagging-function |
1565 @{thm (lhs) pder.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) pder.simps(1)}\\ |
1553 the set of derivatives of a regular expression with respect to a |
1566 @{thm (lhs) pder.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) pder.simps(2)}\\ |
1554 language. Using the fact that two strings are Myhill-Nerode related |
1567 @{thm (lhs) pder.simps(3)[where c'="d"]} & @{text "\<equiv>"} & @{thm (rhs) pder.simps(3)[where c'="d"]}\\ |
1555 whenever their derivative is the same, together with the fact that |
1568 @{thm (lhs) pder.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]} |
1556 there are only finitely such derivatives would give us a similar |
1569 & @{text "\<equiv>"} & @{thm (rhs) pder.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\ |
1557 argument as ours. However it seems not so easy to calculate the set |
1570 @{thm (lhs) pder.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]} |
1558 of derivatives modulo ACI. Therefore we preferred our direct method |
1571 & @{text "\<equiv>"}\\ |
1559 of using tagging-functions. |
1572 \multicolumn{3}{@ {\hspace{20mm}}l@ {}}{@{thm (rhs) pder.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}}\\ |
1560 |
1573 @{thm (lhs) pder.simps(6)} & @{text "\<equiv>"} & @{thm (rhs) pder.simps(6)}\smallskip\\ |
1561 The problem of finiteness modulo ACI can be avoided by using partial |
1574 @{thm (lhs) pders.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) pders.simps(1)}\\ |
1562 derivatives introduced by Antimirov \cite{Antimirov}. |
1575 @{thm (lhs) pders.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) pders.simps(2)}\\ |
|
1576 \end{tabular} |
|
1577 \end{center} |
1563 |
1578 |
1564 *} |
1579 *} |
1565 |
1580 |
1566 section {* Closure Properties *} |
1581 section {* Closure Properties *} |
1567 |
1582 |