155 based on Church's Simple Theory of Types \cite{Church40}. Although many |
155 based on Church's Simple Theory of Types \cite{Church40}. Although many |
156 mathematical concepts can be conveniently expressed in HOL, there are some |
156 mathematical concepts can be conveniently expressed in HOL, there are some |
157 limitations that hurt badly, if one attempts a simple-minded formalisation |
157 limitations that hurt badly, if one attempts a simple-minded formalisation |
158 of regular languages in it. |
158 of regular languages in it. |
159 |
159 |
160 The typical approach to regular languages is to |
160 The typical approach to regular languages is to introduce finite |
161 introduce finite automata and then define everything in terms of them |
161 deterministic automata and then define everything in terms of them \cite{ |
162 \cite{ HopcroftUllman69,Kozen97}. For example, a regular language is |
162 HopcroftUllman69,Kozen97}. For example, a regular language is normally |
163 normally defined as: |
163 defined as: |
164 |
164 |
165 \begin{dfntn}\label{baddef} |
165 \begin{dfntn}\label{baddef} |
166 A language @{text A} is \emph{regular}, provided there is a |
166 A language @{text A} is \emph{regular}, provided there is a |
167 finite deterministic automaton that recognises all strings of @{text "A"}. |
167 finite deterministic automaton that recognises all strings of @{text "A"}. |
168 \end{dfntn} |
168 \end{dfntn} |
250 \end{equation} |
250 \end{equation} |
251 |
251 |
252 \noindent |
252 \noindent |
253 changes the type---the disjoint union is not a set, but a set of |
253 changes the type---the disjoint union is not a set, but a set of |
254 pairs. Using this definition for disjoint union means we do not have a |
254 pairs. Using this definition for disjoint union means we do not have a |
255 single type for the states of automata. As a result we will not be able to define a |
255 single type for the states of automata. As a result we will not be able to |
256 regular language as one for which there exists an automaton that recognises |
256 define in our fomalisation a regular language as one for which there exists |
257 all its strings (Definition~\ref{baddef}). This is because we cannot make a |
257 an automaton that recognises all its strings (Definition~\ref{baddef}). This |
258 definition in HOL that is only polymorphic in the state type and there is no type |
258 is because we cannot make a definition in HOL that is only polymorphic in |
259 quantification available in HOL (unlike in Coq, for example).\footnote{Slind |
259 the state type, but not in the predicate for regularity; and there is no |
260 already pointed out this problem in an email to the HOL4 mailing list on |
260 type quantification available in HOL (unlike in Coq, for |
261 21st April 2005.} |
261 example).\footnote{Slind already pointed out this problem in an email to the |
|
262 HOL4 mailing list on 21st April 2005.} |
262 |
263 |
263 An alternative, which provides us with a single type for states of automata, |
264 An alternative, which provides us with a single type for states of automata, |
264 is to give every state node an identity, for example a natural number, and |
265 is to give every state node an identity, for example a natural number, and |
265 then be careful to rename these identities apart whenever connecting two |
266 then be careful to rename these identities apart whenever connecting two |
266 automata. This results in clunky proofs establishing that properties are |
267 automata. This results in clunky proofs establishing that properties are |
267 invariant under renaming. Similarly, connecting two automata represented as |
268 invariant under renaming. Similarly, connecting two automata represented as |
268 matrices results in very adhoc constructions, which are not pleasant to |
269 matrices results in very adhoc constructions, which are not pleasant to |
269 reason about. |
270 reason about. |
270 |
271 |
271 Functions are much better supported in Isabelle/HOL, but they still lead to similar |
272 Functions are much better supported in Isabelle/HOL, but they still lead to |
272 problems as with graphs. Composing, for example, two non-deterministic automata in parallel |
273 similar problems as with graphs. Composing, for example, two |
273 requires also the formalisation of disjoint unions. Nipkow \cite{Nipkow98} |
274 non-deterministic automata in parallel requires also the formalisation of |
274 dismisses for this the option of using identities, because it leads according to |
275 disjoint unions. Nipkow \cite{Nipkow98} dismisses for this the option of |
275 him to ``messy proofs''. Since he does not need to define what regular |
276 using identities, because it leads according to him to ``messy |
276 languages are, Nipkow opts for a variant of \eqref{disjointunion} using bit lists, but writes |
277 proofs''. Since he does not need to define what regular languages are, |
|
278 Nipkow opts for a variant of \eqref{disjointunion} using bit lists, but |
|
279 writes\smallskip |
|
280 |
277 |
281 |
278 \begin{quote} |
282 \begin{quote} |
279 \it% |
283 \it% |
280 \begin{tabular}{@ {}l@ {}p{0.88\textwidth}@ {}} |
284 \begin{tabular}{@ {}l@ {}p{0.88\textwidth}@ {}} |
281 `` & All lemmas appear obvious given a picture of the composition of automata\ldots |
285 `` & All lemmas appear obvious given a picture of the composition of automata\ldots |
282 Yet their proofs require a painful amount of detail.'' |
286 Yet their proofs require a painful amount of detail.'' |
283 \end{tabular} |
287 \end{tabular} |
284 \end{quote} |
288 \end{quote}\smallskip |
285 |
289 |
286 \noindent |
290 \noindent |
287 and |
291 and\smallskip |
288 |
292 |
289 \begin{quote} |
293 \begin{quote} |
290 \it% |
294 \it% |
291 \begin{tabular}{@ {}l@ {}p{0.88\textwidth}@ {}} |
295 \begin{tabular}{@ {}l@ {}p{0.88\textwidth}@ {}} |
292 `` & If the reader finds the above treatment in terms of bit lists revoltingly |
296 `` & If the reader finds the above treatment in terms of bit lists revoltingly |
293 concrete, I cannot disagree. A more abstract approach is clearly desirable.'' |
297 concrete, I cannot disagree. A more abstract approach is clearly desirable.'' |
294 \end{tabular} |
298 \end{tabular} |
295 \end{quote} |
299 \end{quote}\smallskip |
296 |
300 |
297 |
301 |
298 \noindent |
302 \noindent |
299 Moreover, it is not so clear how to conveniently impose a finiteness |
303 Moreover, it is not so clear how to conveniently impose a finiteness |
300 condition upon functions in order to represent \emph{finite} automata. The |
304 condition upon functions in order to represent \emph{finite} automata. The |
1313 this string to be in @{term "A \<cdot> B"}: |
1317 this string to be in @{term "A \<cdot> B"}: |
1314 % |
1318 % |
1315 \begin{center} |
1319 \begin{center} |
1316 \begin{tabular}{c} |
1320 \begin{tabular}{c} |
1317 \scalebox{1}{ |
1321 \scalebox{1}{ |
1318 \begin{tikzpicture} |
1322 \begin{tikzpicture}[fill=gray!20] |
1319 \node[draw,minimum height=3.8ex] (x) { $\hspace{4.8em}@{text x}\hspace{4.8em}$ }; |
1323 \node[draw,minimum height=3.8ex, fill] (x) { $\hspace{4.8em}@{text x}\hspace{4.8em}$ }; |
1320 \node[draw,minimum height=3.8ex, right=-0.03em of x] (za) { $\hspace{0.6em}@{text "z\<^isub>p"}\hspace{0.6em}$ }; |
1324 \node[draw,minimum height=3.8ex, right=-0.03em of x, fill] (za) { $\hspace{0.6em}@{text "z\<^isub>p"}\hspace{0.6em}$ }; |
1321 \node[draw,minimum height=3.8ex, right=-0.03em of za] (zza) { $\hspace{2.6em}@{text "z\<^isub>s"}\hspace{2.6em}$ }; |
1325 \node[draw,minimum height=3.8ex, right=-0.03em of za, fill] (zza) { $\hspace{2.6em}@{text "z\<^isub>s"}\hspace{2.6em}$ }; |
1322 |
1326 |
1323 \draw[decoration={brace,transform={yscale=3}},decorate] |
1327 \draw[decoration={brace,transform={yscale=3}},decorate] |
1324 (x.north west) -- ($(za.north west)+(0em,0em)$) |
1328 (x.north west) -- ($(za.north west)+(0em,0em)$) |
1325 node[midway, above=0.5em]{@{text x}}; |
1329 node[midway, above=0.5em]{@{text x}}; |
1326 |
1330 |
1340 ($(zza.south east)+(0em,0ex)$) -- ($(za.south east)+(0em,0ex)$) |
1344 ($(zza.south east)+(0em,0ex)$) -- ($(za.south east)+(0em,0ex)$) |
1341 node[midway, below=0.5em]{@{text "z\<^isub>s \<in> B"}}; |
1345 node[midway, below=0.5em]{@{text "z\<^isub>s \<in> B"}}; |
1342 \end{tikzpicture}} |
1346 \end{tikzpicture}} |
1343 \\[2mm] |
1347 \\[2mm] |
1344 \scalebox{1}{ |
1348 \scalebox{1}{ |
1345 \begin{tikzpicture} |
1349 \begin{tikzpicture}[fill=gray!20] |
1346 \node[draw,minimum height=3.8ex] (xa) { $\hspace{3em}@{text "x\<^isub>p"}\hspace{3em}$ }; |
1350 \node[draw,minimum height=3.8ex, fill] (xa) { $\hspace{3em}@{text "x\<^isub>p"}\hspace{3em}$ }; |
1347 \node[draw,minimum height=3.8ex, right=-0.03em of xa] (xxa) { $\hspace{0.2em}@{text "x\<^isub>s"}\hspace{0.2em}$ }; |
1351 \node[draw,minimum height=3.8ex, right=-0.03em of xa, fill] (xxa) { $\hspace{0.2em}@{text "x\<^isub>s"}\hspace{0.2em}$ }; |
1348 \node[draw,minimum height=3.8ex, right=-0.03em of xxa] (z) { $\hspace{5em}@{text z}\hspace{5em}$ }; |
1352 \node[draw,minimum height=3.8ex, right=-0.03em of xxa, fill] (z) { $\hspace{5em}@{text z}\hspace{5em}$ }; |
1349 |
1353 |
1350 \draw[decoration={brace,transform={yscale=3}},decorate] |
1354 \draw[decoration={brace,transform={yscale=3}},decorate] |
1351 (xa.north west) -- ($(xxa.north east)+(0em,0em)$) |
1355 (xa.north west) -- ($(xxa.north east)+(0em,0em)$) |
1352 node[midway, above=0.5em]{@{text x}}; |
1356 node[midway, above=0.5em]{@{text x}}; |
1353 |
1357 |
1456 When analysing the case of @{text "x @ z"} being an element in @{term "A\<star>"} |
1460 When analysing the case of @{text "x @ z"} being an element in @{term "A\<star>"} |
1457 and @{text x} is not the empty string, we have the following picture: |
1461 and @{text x} is not the empty string, we have the following picture: |
1458 |
1462 |
1459 \begin{center} |
1463 \begin{center} |
1460 \scalebox{1}{ |
1464 \scalebox{1}{ |
1461 \begin{tikzpicture} |
1465 \begin{tikzpicture}[fill=gray!20] |
1462 \node[draw,minimum height=3.8ex] (xa) { $\hspace{4em}@{text "x\<^bsub>pmax\<^esub>"}\hspace{4em}$ }; |
1466 \node[draw,minimum height=3.8ex, fill] (xa) { $\hspace{4em}@{text "x\<^bsub>pmax\<^esub>"}\hspace{4em}$ }; |
1463 \node[draw,minimum height=3.8ex, right=-0.03em of xa] (xxa) { $\hspace{0.5em}@{text "x\<^bsub>s\<^esub>"}\hspace{0.5em}$ }; |
1467 \node[draw,minimum height=3.8ex, right=-0.03em of xa, fill] (xxa) { $\hspace{0.5em}@{text "x\<^bsub>s\<^esub>"}\hspace{0.5em}$ }; |
1464 \node[draw,minimum height=3.8ex, right=-0.03em of xxa] (za) { $\hspace{2em}@{text "z\<^isub>a"}\hspace{2em}$ }; |
1468 \node[draw,minimum height=3.8ex, right=-0.03em of xxa, fill] (za) { $\hspace{2em}@{text "z\<^isub>a"}\hspace{2em}$ }; |
1465 \node[draw,minimum height=3.8ex, right=-0.03em of za] (zb) { $\hspace{7em}@{text "z\<^isub>b"}\hspace{7em}$ }; |
1469 \node[draw,minimum height=3.8ex, right=-0.03em of za, fill] (zb) { $\hspace{7em}@{text "z\<^isub>b"}\hspace{7em}$ }; |
1466 |
1470 |
1467 \draw[decoration={brace,transform={yscale=3}},decorate] |
1471 \draw[decoration={brace,transform={yscale=3}},decorate] |
1468 (xa.north west) -- ($(xxa.north east)+(0em,0em)$) |
1472 (xa.north west) -- ($(xxa.north east)+(0em,0em)$) |
1469 node[midway, above=0.5em]{@{text x}}; |
1473 node[midway, above=0.5em]{@{text x}}; |
1470 |
1474 |
1652 boolean function @{term "nullable r"} needed in the @{const Times}-case tests |
1656 boolean function @{term "nullable r"} needed in the @{const Times}-case tests |
1653 whether a regular expression can recognise the empty string. It can be defined as |
1657 whether a regular expression can recognise the empty string. It can be defined as |
1654 follows. |
1658 follows. |
1655 |
1659 |
1656 \begin{center} |
1660 \begin{center} |
1657 \begin{tabular}{c@ {\hspace{10mm}}c} |
1661 \begin{tabular}{c} |
1658 \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}} |
1662 \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}} |
1659 @{thm (lhs) nullable.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(1)}\\ |
1663 @{thm (lhs) nullable.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(1)}\\ |
1660 @{thm (lhs) nullable.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(2)}\\ |
1664 @{thm (lhs) nullable.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(2)}\\ |
1661 @{thm (lhs) nullable.simps(3)} & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(3)}\\ |
1665 @{thm (lhs) nullable.simps(3)} & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(3)}\\ |
1662 \end{tabular} & |
|
1663 \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1.5mm}}l@ {}} |
|
1664 @{thm (lhs) nullable.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]} |
1666 @{thm (lhs) nullable.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]} |
1665 & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\ |
1667 & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\ |
1666 @{thm (lhs) nullable.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]} |
1668 @{thm (lhs) nullable.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]} |
1667 & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\ |
1669 & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\ |
1668 @{thm (lhs) nullable.simps(6)} & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(6)}\\ |
1670 @{thm (lhs) nullable.simps(6)} & @{text "\<equiv>"} & @{thm (rhs) nullable.simps(6)}\\ |
1683 \end{equation} |
1685 \end{equation} |
1684 |
1686 |
1685 \noindent |
1687 \noindent |
1686 The importance of this fact in the context of the Myhill-Nerode theorem is that |
1688 The importance of this fact in the context of the Myhill-Nerode theorem is that |
1687 we can use \eqref{mhders} and \eqref{Dersders} in order to |
1689 we can use \eqref{mhders} and \eqref{Dersders} in order to |
1688 establish that @{term "x \<approx>(lang r) y"} is equivalent to |
1690 establish that |
1689 \mbox{@{term "lang (ders x r) = lang (ders y r)"}}. Hence |
1691 |
|
1692 \begin{center} |
|
1693 @{term "x \<approx>(lang r) y"} \hspace{4mm}if and only if\hspace{4mm} |
|
1694 @{term "lang (ders x r) = lang (ders y r)"}. |
|
1695 \end{center} |
|
1696 |
|
1697 \noindent |
|
1698 holds and hence |
1690 |
1699 |
1691 \begin{equation} |
1700 \begin{equation} |
1692 @{term "x \<approx>(lang r) y"}\hspace{4mm}\mbox{provided}\hspace{4mm}@{term "ders x r = ders y r"} |
1701 @{term "x \<approx>(lang r) y"}\hspace{4mm}\mbox{provided}\hspace{4mm}@{term "ders x r = ders y r"} |
1693 \end{equation} |
1702 \end{equation} |
1694 |
1703 |
1695 |
1704 |
1696 \noindent |
1705 \noindent |
1697 which means the right-hand side (seen as a relation) refines the Myhill-Nerode |
1706 This means the right-hand side (seen as a relation) refines the Myhill-Nerode |
1698 relation. Consequently, we can use @{text |
1707 relation. Consequently, we can use @{text |
1699 "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"} as a |
1708 "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"} as a |
1700 tagging-relation. However, in order to be useful for the second part of the |
1709 tagging-relation. However, in order to be useful for the second part of the |
1701 Myhill-Nerode theorem, we have to be able to establish that for the |
1710 Myhill-Nerode theorem, we have to be able to establish that for the |
1702 corresponding language there are only finitely many derivatives---thus |
1711 corresponding language there are only finitely many derivatives---thus |
1838 These two properties confirm the observation made earlier |
1847 These two properties confirm the observation made earlier |
1839 that by using sets, partial derivatives have the @{text "ACI"}-identities |
1848 that by using sets, partial derivatives have the @{text "ACI"}-identities |
1840 of derivatives already built in. |
1849 of derivatives already built in. |
1841 |
1850 |
1842 Antimirov also proved that for every language and regular expression |
1851 Antimirov also proved that for every language and regular expression |
1843 there are only finitely many partial derivatives, whereby the partial |
1852 there are only finitely many partial derivatives, whereby the set of partial |
1844 derivatives of @{text r} w.r.t.~a language @{text A} is defined as |
1853 derivatives of @{text r} w.r.t.~a language @{text A} is defined as |
1845 |
1854 |
1846 \begin{equation}\label{Pdersdef} |
1855 \begin{equation}\label{Pdersdef} |
1847 @{thm pders_lang_def} |
1856 @{thm pders_lang_def} |
1848 \end{equation} |
1857 \end{equation} |
1883 |
1892 |
1884 \noindent |
1893 \noindent |
1885 and for all languages @{text "A"}, @{term "pders_lang A r"} is a subset of |
1894 and for all languages @{text "A"}, @{term "pders_lang A r"} is a subset of |
1886 @{term "pders_lang UNIV r"}. Since we follow Antimirov's proof quite |
1895 @{term "pders_lang UNIV r"}. Since we follow Antimirov's proof quite |
1887 closely in our formalisation (only the last two cases of |
1896 closely in our formalisation (only the last two cases of |
1888 \eqref{pdersunivone} involve some non-routine induction argument), we omit |
1897 \eqref{pdersunivone} involve some non-routine induction arguments), we omit |
1889 the details. |
1898 the details. |
1890 |
1899 |
1891 Let us now return to our proof about the second direction in the Myhill-Nerode |
1900 Let us now return to our proof for the second direction in the Myhill-Nerode |
1892 theorem. The point of the above calculations is to use |
1901 theorem. The point of the above calculations is to use |
1893 @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"} as tagging-relation. |
1902 @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"} as tagging-relation. |
1894 |
1903 |
1895 |
1904 |
1896 \begin{proof}[Proof of Theorem~\ref{myhillnerodetwo} (second version)] |
1905 \begin{proof}[Proof of Theorem~\ref{myhillnerodetwo} (second version)] |
1913 @{thm pders_lang_def[where A="UNIV", simplified]} |
1922 @{thm pders_lang_def[where A="UNIV", simplified]} |
1914 \end{center} |
1923 \end{center} |
1915 |
1924 |
1916 \noindent |
1925 \noindent |
1917 Now the range of @{term "\<lambda>x. pders x r"} is a subset of @{term "Pow (pders_lang UNIV r)"}, |
1926 Now the range of @{term "\<lambda>x. pders x r"} is a subset of @{term "Pow (pders_lang UNIV r)"}, |
1918 which we know is finite by Theorem~\ref{antimirov}. This means there |
1927 which we know is finite by Theorem~\ref{antimirov}. Consequently there |
1919 are only finitely many equivalence classes of @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"}, |
1928 are only finitely many equivalence classes of @{text "\<^raw:$\threesim$>\<^bsub>(\<lambda>x. ders x r)\<^esub>"}, |
1920 which refines @{term "\<approx>(lang r)"}, and consequently we can again conclude the |
1929 which refines @{term "\<approx>(lang r)"}, and therefore we can again conclude the |
1921 second part of the Myhill-Nerode theorem. |
1930 second part of the Myhill-Nerode theorem. |
1922 \end{proof} |
1931 \end{proof} |
1923 *} |
1932 *} |
1924 |
1933 |
1925 section {* Closure Properties of Regular Languages *} |
1934 section {* Closure Properties of Regular Languages\label{closure} *} |
1926 |
1935 |
1927 text {* |
1936 text {* |
1928 \noindent |
1937 \noindent |
1929 The beauty of regular languages is that they are closed under many set |
1938 The beauty of regular languages is that they are closed under many set |
1930 operations. Closure under union, concatenation and Kleene-star are trivial |
1939 operations. Closure under union, concatenation and Kleene-star are trivial |
2049 construct a regular expression for the complement language by direct |
2059 construct a regular expression for the complement language by direct |
2050 means. However the existence of such a regular expression can be easily |
2060 means. However the existence of such a regular expression can be easily |
2051 proved using the Myhill-Nerode theorem. |
2061 proved using the Myhill-Nerode theorem. |
2052 |
2062 |
2053 Our insistence on regular expressions for proving the Myhill-Nerode theorem |
2063 Our insistence on regular expressions for proving the Myhill-Nerode theorem |
2054 arose from the limitations of HOL, on which the popular theorem provers HOL4, |
2064 arose from the limitations of HOL, used in the popular theorem provers HOL4, |
2055 HOLlight and Isabelle/HOL are based. In order to guarantee consistency, |
2065 HOLlight and Isabelle/HOL. In order to guarantee consistency, |
2056 formalisations can extend HOL with definitions that introduce a new concept in |
2066 formalisations in HOL can only extend the logic with definitions that introduce a new concept in |
2057 only terms of already existing notions. A convenient definition for automata |
2067 terms of already existing notions. A convenient definition for automata |
2058 (based on graphs) uses a polymorphic type for the state nodes. This allows |
2068 (based on graphs) uses a polymorphic type for the state nodes. This allows |
2059 us to use the standard operation for disjoint union whenever we need to compose two |
2069 us to use the standard operation for disjoint union whenever we need to compose two |
2060 automata. Unfortunately, we cannot use such a polymorphic definition |
2070 automata. Unfortunately, we cannot use such a polymorphic definition |
2061 in HOL as part of the definition for regularity of a language (a predicate |
2071 in HOL as part of the definition for regularity of a language (a predicate |
2062 over set of strings). Consider the following attempt: |
2072 over set of strings). Consider the following attempt: |
2063 |
2073 |
2064 \begin{center} |
2074 \begin{center} |
2065 @{text "is_regular A \<equiv> \<exists>M(\<alpha>). is_finite_automata (M) \<and> \<calL>(M) = A"} |
2075 @{text "is_regular A \<equiv> \<exists>M(\<alpha>). is_dfa (M) \<and> \<calL>(M) = A"} |
2066 \end{center} |
2076 \end{center} |
2067 |
2077 |
2068 \noindent |
2078 \noindent |
2069 In this definifion, the definiens is polymorphic in the type of the automata |
2079 In this definifion, the definiens is polymorphic in the type of the automata |
2070 @{text "M"} (indicated by the @{text "\<alpha>"}), but the definiendum @{text |
2080 @{text "M"} (indicated by dependency on @{text "\<alpha>"}), but the definiendum |
2071 "is_regular"} is not. Such definitions are excluded in HOL, because they can |
2081 @{text "is_regular"} is not. Such definitions are excluded in HOL, because |
2072 lead easily to inconsistencies (see \cite{PittsHOL4} for a simple |
2082 they can lead easily to inconsistencies (see \cite{PittsHOL4} for a simple |
2073 example). Also HOL does not contain type-quantifiers which would allow us to |
2083 example). Also HOL does not contain type-quantifiers which would allow us to |
2074 get rid of the polymorphism by quantifying over the type-variable @{text |
2084 get rid of the polymorphism by quantifying over the type-variable @{text |
2075 "\<alpha>"}. Therefore when defining regularity in terms of automata, the only |
2085 "\<alpha>"}. Therefore when defining regularity in terms of automata, the only |
2076 natural way out in HOL is to resort to state nodes with an identity, for |
2086 natural way out in HOL is to resort to state nodes with an identity, for |
2077 example a natural number. Unfortunatly, the consequence is that we have to |
2087 example a natural number. Unfortunatly, the consequence is that we have to |
2078 be careful when combining two automata so that there is no clash between two |
2088 be careful when combining two automata so that there is no clash between two |
2079 such states. This makes formalisations quite fiddly and rather |
2089 such states. This makes formalisations quite fiddly and rather |
2080 unpleasant. Regular expressions proved much more convenient for reasoning in |
2090 unpleasant. Regular expressions proved much more convenient for reasoning in |
2081 HOL since they can be defined as inductive datatype and a reasoning |
2091 HOL since they can be defined as inductive datatype and a reasoning |
2082 infrastructure comes for free. We showed they can be used for establishing |
2092 infrastructure comes for free. The definition of regularity in terms of |
2083 the central result in regular language theory---the Myhill-Nerode theorem. |
2093 regular expressions poses no problem at all for HOL. We showed in this |
|
2094 paper that they can be used for establishing the central result in regular |
|
2095 language theory---the Myhill-Nerode theorem. |
2084 |
2096 |
2085 While regular expressions are convenient, they have some limitations. One is |
2097 While regular expressions are convenient, they have some limitations. One is |
2086 that there seems to be no method of calculating a minimal regular expression |
2098 that there seems to be no method of calculating a minimal regular expression |
2087 (for example in terms of length) for a regular language, like there is for |
2099 (for example in terms of length) for a regular language, like there is for |
2088 automata. On the other hand, efficient regular expression matching, without |
2100 automata. On the other hand, efficient regular expression matching, without |
2139 Theorem~\ref{antimirov}). However, since partial derivatives use sets of |
2151 Theorem~\ref{antimirov}). However, since partial derivatives use sets of |
2140 regular expressions, one needs to carefully analyse whether the resulting |
2152 regular expressions, one needs to carefully analyse whether the resulting |
2141 algorithm is still executable. Given the existing infrastructure for |
2153 algorithm is still executable. Given the existing infrastructure for |
2142 executable sets in Isabelle/HOL, it should. |
2154 executable sets in Isabelle/HOL, it should. |
2143 |
2155 |
2144 |
|
2145 Our formalisation of the Myhill-Nerode theorem consists of 780 lines of |
2156 Our formalisation of the Myhill-Nerode theorem consists of 780 lines of |
2146 Isabelle/Isar code for the first direction and 460 for the second (the one |
2157 Isabelle/Isar code for the first direction and 460 for the second (the one |
2147 based on tagging functions), plus around 300 lines of standard material |
2158 based on tagging-functions), plus around 300 lines of standard material |
2148 about regular languages. The formalisation of derivatives and partial |
2159 about regular languages. The formalisation of derivatives and partial |
2149 derivatives shown in Section~\ref{derivatives} consists of 390 lines of |
2160 derivatives shown in Section~\ref{derivatives} consists of 390 lines of |
2150 code. The algorithm for solving equational systems, which we used in the |
2161 code. The closure properties in Section~\ref{closure} can be established in |
2151 first direction, is conceptually relatively simple. Still the use of sets |
2162 190 lines of code. The algorithm for solving equational systems, which we |
2152 over which the algorithm operates means it is not as easy to formalise as |
2163 used in the first direction, is conceptually relatively simple. Still the |
2153 one might hope. However, it seems sets cannot be avoided since the `input' |
2164 use of sets over which the algorithm operates means it is not as easy to |
2154 of the algorithm consists of equivalence classes and we cannot see how to |
2165 formalise as one might hope. However, it seems sets cannot be avoided since |
2155 reformulate the theory so that we can use lists. Lists would be much easier |
2166 the `input' of the algorithm consists of equivalence classes and we cannot |
2156 to reason about, since we can define functions over them by recursion. For |
2167 see how to reformulate the theory so that we can use lists. Lists would be |
2157 sets we have to use set-comprehensions, which is slightly unwieldy. |
2168 much easier to reason about, since we can define functions over them by |
|
2169 recursion. For sets we have to use set-comprehensions, which is slightly |
|
2170 unwieldy. |
2158 |
2171 |
2159 While our formalisation might appear large, it should be seen |
2172 While our formalisation might appear large, it should be seen |
2160 in the context of the work done by Constable at al \cite{Constable00} who |
2173 in the context of the work done by Constable at al \cite{Constable00} who |
2161 formalised the Myhill-Nerode theorem in Nuprl using automata. They write |
2174 formalised the Myhill-Nerode theorem in Nuprl using automata. They write |
2162 that their four-member team needed something on the magnitude of 18 months |
2175 that their four-member team needed something on the magnitude of 18 months |