(*<*)
theory Paper
imports "../Myhill" "LaTeXsugar"
begin

declare [[show_question_marks = false]]

consts
  REL :: "(string \<times> string) \<Rightarrow> bool"
  UPLUS :: "'a set \<Rightarrow> 'a set \<Rightarrow> (nat \<times> 'a) set"

abbreviation "EClass x R \<equiv> R `` {x}"
abbreviation "append_rexp2 r_itm r \<equiv> append_rexp r r_itm"

notation (latex output)
  str_eq_rel ("\<approx>\<^bsub>_\<^esub>") and
  str_eq ("_ \<approx>\<^bsub>_\<^esub> _") and
  Seq (infixr "\<cdot>" 100) and
  Star ("_\<^bsup>\<star>\<^esup>") and
  pow ("_\<^bsup>_\<^esup>" [100, 100] 100) and
  Suc ("_+1" [100] 100) and
  quotient ("_ \<^raw:\ensuremath{\!\sslash\!}> _" [90, 90] 90) and
  REL ("\<approx>") and
  UPLUS ("_ \<^raw:\ensuremath{\uplus}> _" [90, 90] 90) and
  L ("\<^raw:\ensuremath{\cal{L}}>'(_')" [0] 101) and
  Lam ("\<lambda>'(_')" [100] 100) and
  Trn ("'(_, _')" [100, 100] 100) and
  EClass ("\<lbrakk>_\<rbrakk>\<^bsub>_\<^esub>" [100, 100] 100) and
  transition ("_ \<^raw:\ensuremath{\stackrel{\text{>_\<^raw:}}{\Longmapsto}}> _" [100, 100, 100] 100) and
  Setalt ("\<^raw:\ensuremath{\bigplus}>_" [1000] 999) and
  append_rexp2 ("_ \<^raw:\ensuremath{\triangleleft}> _" [100, 100] 100) and
  append_rhs_rexp ("_ \<^raw:\ensuremath{\triangleleft}> _" [100, 100] 50)
(*>*)

section {* Introduction *}

text {*
  Regular languages are an important and well-understood subject in Computer Science, with many beautiful theorems and many useful algorithms. There is a wide range of textbooks on this subject, many of which are aimed at students and contain very detailed ``pencil-and-paper'' proofs (e.g.~\cite{Kozen97}). It seems natural to exercise theorem provers by formalising the theorems and by formally verifying the algorithms.

  There is, however, a problem: the typical approach to regular languages is to introduce finite automata and then define everything in terms of them. For example, a regular language is normally defined as one whose strings are recognised by a finite deterministic automaton. This approach has many benefits. Among them is the fact that it is easy to convince oneself that regular languages are closed under complementation: one just has to exchange the accepting and non-accepting states in the corresponding automaton to obtain an automaton for the complement language. The problem, however, lies with formalising such reasoning in a HOL-based theorem prover, in our case Isabelle/HOL. Automata are built up from states and transitions that need to be represented as graphs, matrices or functions, none of which can be defined as an inductive datatype. In the case of graphs and matrices, this means we have to build our own reasoning infrastructure for them, as neither Isabelle/HOL nor HOL4 nor HOLlight supports them with libraries. Even worse, reasoning about graphs and matrices can be a real hassle in HOL-based theorem provers.
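  As an aside, the informal complementation argument above can be written down very directly in a functional language. The following Haskell fragment is purely illustrative and not part of our formalisation (all names in it are hypothetical); it represents an automaton by a start state, a transition function and an acceptance predicate, and complements it by negating that predicate.

\begin{verbatim}
-- illustrative sketch only; not the representation used in this paper
data DFA s = DFA
  { start     :: s                -- initial state
  , step      :: s -> Char -> s   -- total transition function
  , accepting :: s -> Bool        -- accepting states as a predicate
  }

-- run the transition function over the string, then test acceptance
accepts :: DFA s -> String -> Bool
accepts m = accepting m . foldl (step m) (start m)

-- complementation: exchange accepting and non-accepting states
complement :: DFA s -> DFA s
complement m = m { accepting = not . accepting m }
\end{verbatim}

  \noindent
  The difficulty addressed in this paper is not writing down such definitions, but reasoning about them uniformly in a HOL-based theorem prover: the state type varies from automaton to automaton, a finiteness constraint has to be imposed separately, and combining two automata leads to the problems described next.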
  Consider for example the operation of sequencing two automata, say $A_1$ and $A_2$, by connecting the accepting states of $A_1$ to the initial state of $A_2$:

  \begin{center}
  \begin{tabular}{ccc}
  \begin{tikzpicture}[scale=0.8]
  %\draw[step=2mm] (-1,-1) grid (1,1);
  \draw[rounded corners=1mm, very thick] (-1.0,-0.3) rectangle (-0.2,0.3);
  \draw[rounded corners=1mm, very thick] ( 0.2,-0.3) rectangle ( 1.0,0.3);
  \node (A) at (-1.0,0.0) [circle, very thick, draw, fill=white, inner sep=0.4mm] {};
  \node (B) at ( 0.2,0.0) [circle, very thick, draw, fill=white, inner sep=0.4mm] {};
  \node (C) at (-0.2, 0.13) [circle, very thick, draw, fill=white, inner sep=0.4mm] {};
  \node (D) at (-0.2,-0.13) [circle, very thick, draw, fill=white, inner sep=0.4mm] {};
  \node (E) at (1.0, 0.2) [circle, very thick, draw, fill=white, inner sep=0.4mm] {};
  \node (F) at (1.0,-0.0) [circle, very thick, draw, fill=white, inner sep=0.4mm] {};
  \node (G) at (1.0,-0.2) [circle, very thick, draw, fill=white, inner sep=0.4mm] {};
  \draw (-0.6,0.0) node {\footnotesize$A_1$};
  \draw ( 0.6,0.0) node {\footnotesize$A_2$};
  \end{tikzpicture}
  &
  \raisebox{1.1mm}{\bf\Large$\;\;\;\Rightarrow\,\;\;$}
  &
  \begin{tikzpicture}[scale=0.8]
  %\draw[step=2mm] (-1,-1) grid (1,1);
  \draw[rounded corners=1mm, very thick] (-1.0,-0.3) rectangle (-0.2,0.3);
  \draw[rounded corners=1mm, very thick] ( 0.2,-0.3) rectangle ( 1.0,0.3);
  \node (A) at (-1.0,0.0) [circle, very thick, draw, fill=white, inner sep=0.4mm] {};
  \node (B) at ( 0.2,0.0) [circle, very thick, draw, fill=white, inner sep=0.4mm] {};
  \node (C) at (-0.2, 0.13) [circle, very thick, draw, fill=white, inner sep=0.4mm] {};
  \node (D) at (-0.2,-0.13) [circle, very thick, draw, fill=white, inner sep=0.4mm] {};
  \node (E) at (1.0, 0.2) [circle, very thick, draw, fill=white, inner sep=0.4mm] {};
  \node (F) at (1.0,-0.0) [circle, very thick, draw, fill=white, inner sep=0.4mm] {};
  \node (G) at (1.0,-0.2) [circle, very thick, draw, fill=white, inner sep=0.4mm] {};
  \draw (C) to [very thick, bend left=45] (B);
  \draw (D) to [very thick, bend right=45] (B);
  \draw (-0.6,0.0) node {\footnotesize$A_1$};
  \draw ( 0.6,0.0) node {\footnotesize$A_2$};
  \end{tikzpicture}
  \end{tabular}
  \end{center}

  \noindent
  On ``paper'' we can define the corresponding graph in terms of the disjoint union of the state nodes. Unfortunately in HOL, the standard definition for disjoint union, namely
  %
  \begin{equation}\label{disjointunion}
  @{term "UPLUS A\<^isub>1 A\<^isub>2 \<equiv> {(1, x) | x. x \<in> A\<^isub>1} \<union> {(2, y) | y. y \<in> A\<^isub>2}"}
  \end{equation}

  \noindent
  changes the type---the disjoint union is not a set, but a set of pairs. Using this definition for disjoint unions means we do not have a single type for automata and hence will not be able to state certain properties about \emph{all} automata, since there is no type quantification available in HOL. An alternative, which provides us with a single type for automata, is to give every state node an identity, for example a natural number, and then be careful to rename these identities apart whenever connecting two automata. This results in clunky proofs establishing that properties are invariant under renaming. Similarly, connecting two automata represented as matrices results in very ad hoc constructions, which are not pleasant to reason about. Functions are much better supported in Isabelle/HOL, but they still lead to similar problems as with graphs. Composing, for example, two non-deterministic automata in parallel also requires the formalisation of disjoint unions.
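  For illustration, the tagging idea underlying \eqref{disjointunion} can be written down in a functional language as follows. This is a minimal Haskell sketch with hypothetical names, not part of the formalisation; the numeric tag keeps the two copies of the state sets apart, but the price is exactly the change of type described above: the result is a set of pairs, not a set of the original element type.

\begin{verbatim}
import qualified Data.Set as Set

-- disjoint union via tagging: the result type is Set (Int, a), not Set a
uplus :: Ord a => Set.Set a -> Set.Set a -> Set.Set (Int, a)
uplus a1 a2 = Set.union (Set.map (\x -> (1, x)) a1)
                        (Set.map (\y -> (2, y)) a2)
\end{verbatim}

  \noindent
  Any construction that connects two automata built over such tagged state sets has to thread these tags (or freshly chosen identities) through all definitions and proofs, which is the source of the clunkiness just mentioned.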
  For this purpose, Nipkow \cite{Nipkow98} dismisses the option of using identities, because according to him it leads to ``messy proofs''. He opts for a variant of \eqref{disjointunion} using bit lists, but writes

  \begin{quote}
  \it%
  \begin{tabular}{@ {}l@ {}p{0.88\textwidth}@ {}}
  `` & All lemmas appear obvious given a picture of the composition of automata\ldots Yet their proofs require a painful amount of detail.''
  \end{tabular}
  \end{quote}

  \noindent
  and

  \begin{quote}
  \it%
  \begin{tabular}{@ {}l@ {}p{0.88\textwidth}@ {}}
  `` & If the reader finds the above treatment in terms of bit lists revoltingly concrete, I cannot disagree. A more abstract approach is clearly desirable.''
  \end{tabular}
  \end{quote}

  \noindent
  Moreover, it is not so clear how to conveniently impose a finiteness condition upon functions in order to represent \emph{finite} automata. The best solution is probably to resort to more advanced reasoning frameworks, such as \emph{locales} or \emph{type classes}, which are not available in \emph{all} HOL-based theorem provers.

  Because of these problems with representing automata, there seems to be no substantial formalisation of automata theory and regular languages carried out in HOL-based theorem provers. Nipkow establishes in \cite{Nipkow98} the link between regular expressions and automata in the context of lexing. The only larger formalisations of automata theory are carried out in Nuprl \cite{Constable00} and in Coq (for example \cite{Filliatre97}).

  In this paper, we will not attempt to formalise automata theory in Isabelle/HOL, but take a completely different approach to regular languages. Instead of defining a regular language as one where there exists an automaton that recognises all strings of the language, we define a regular language as:

  \begin{definition}
  A language @{text A} is \emph{regular}, provided there is a regular expression that matches all strings of @{text "A"}.
  \end{definition}

  \noindent
  The reason is that regular expressions, unlike graphs, matrices and functions, can easily be defined as an inductive datatype. Consequently a corresponding reasoning infrastructure comes for free. This has recently been exploited in HOL4 with a formalisation of regular expression matching based on derivatives \cite{OwensSlind08} and with an equivalence checker for regular expressions in Isabelle/HOL \cite{KraussNipkow11}. The purpose of this paper is to show that a central result about regular languages---the Myhill-Nerode theorem---can be recreated by only using regular expressions. This theorem gives necessary and sufficient conditions for when a language is regular. As a corollary of this theorem we can easily establish the usual closure properties, including complementation, for regular languages.\smallskip

  \noindent
  {\bf Contributions:} There is an extensive literature on regular languages. To our knowledge, our proof of the Myhill-Nerode theorem is the first that is based solely on regular expressions. We prove the part of this theorem stating that the language of a regular expression gives rise to only finitely many equivalence classes by using certain tagging functions. Again, to the best of our knowledge, these tagging functions have not been used before to establish the Myhill-Nerode theorem.
*}

section {* Preliminaries *}

text {*
  Strings in Isabelle/HOL are lists of characters with the \emph{empty string} being represented by the empty list, written @{term "[]"}. \emph{Languages} are sets of strings. The language containing all strings is written in Isabelle/HOL as @{term "UNIV::string set"}.
  The concatenation of two languages is written @{term "A ;; B"} and a language raised to the power @{text n} is written @{term "A \<up> n"}. They are defined as usual

  \begin{center}
  @{thm Seq_def[THEN eq_reflection, where A1="A" and B1="B"]}
  \hspace{7mm}
  @{thm pow.simps(1)[THEN eq_reflection, where A1="A"]}
  \hspace{7mm}
  @{thm pow.simps(2)[THEN eq_reflection, where A1="A" and n1="n"]}
  \end{center}

  \noindent
  where @{text "@"} is the usual list-append operation. The Kleene-star of a language @{text A} is defined as the union over all powers, namely @{thm Star_def}. In the paper we will make use of the following properties of these constructions.

  \begin{proposition}\label{langprops}\mbox{}\\
  \begin{tabular}{@ {}ll}
  (i) & @{thm star_cases} \\
  (ii) & @{thm[mode=IfThen] pow_length}\\
  (iii) & @{thm seq_Union_left} \\
  \end{tabular}
  \end{proposition}

  \noindent
  In @{text "(ii)"} we use the notation @{term "length s"} for the length of a string. This property states that if @{term "[] \<notin> A"} then the strings in @{term "A \<up> (Suc n)"} must have a length greater than @{text n}. We omit the proofs for these properties, but invite the reader to consult our formalisation.\footnote{Available at ???}

  The notation in Isabelle/HOL for the quotient of a language @{text A} according to an equivalence relation @{term REL} is @{term "A // REL"}. We will write @{text "\<lbrakk>x\<rbrakk>\<^isub>\<approx>"} for the equivalence class defined as @{text "{y | y \<approx> x}"}.

  Central to our proof will be the solution of equational systems involving equivalence classes of languages. For this we will use Arden's lemma \cite{Brzozowski64}, which solves equations of the form @{term "X = A ;; X \<union> B"} provided @{term "[] \<notin> A"}. However, we will need the following ``reverse'' version of Arden's lemma.

  \begin{lemma}[Reverse Arden's Lemma]\label{arden}\mbox{}\\
  If @{thm (prem 1) arden} then @{thm (lhs) arden} has the unique solution @{thm (rhs) arden}.
  \end{lemma}

  \begin{proof}
  For the right-to-left direction we assume @{thm (rhs) arden} and show that @{thm (lhs) arden} holds. From Prop.~\ref{langprops}@{text "(i)"} we have @{term "A\<star> = {[]} \<union> A ;; A\<star>"}, which is equal to @{term "A\<star> = {[]} \<union> A\<star> ;; A"}. Concatenating @{text B} to both sides gives @{term "B ;; A\<star> = B ;; ({[]} \<union> A\<star> ;; A)"}, whose right-hand side is equal to @{term "(B ;; A\<star>) ;; A \<union> B"}. This completes this direction.

  For the other direction we assume @{thm (lhs) arden}. By a simple induction on @{text n}, we can establish the property

  \begin{center}
  @{text "(*)"}\hspace{5mm} @{thm (concl) arden_helper}
  \end{center}

  \noindent
  Using this property we can show that @{term "B ;; (A \<up> n) \<subseteq> X"} holds for all @{text n}. From this we can infer @{term "B ;; A\<star> \<subseteq> X"} using the definition of @{text "\<star>"}. For the inclusion in the other direction we assume a string @{text s} with length @{text k} is an element of @{text X}. Since @{thm (prem 1) arden}, we know by Prop.~\ref{langprops}@{text "(ii)"} that @{term "s \<notin> X ;; (A \<up> Suc k)"}, since its length is only @{text k} (the strings in @{term "X ;; (A \<up> Suc k)"} are all longer). From @{text "(*)"} it then follows that @{term s} must be an element of @{term "(\<Union>m\<in>{0..k}. B ;; (A \<up> m))"}. This in turn implies that @{term s} is in @{term "(\<Union>n. B ;; (A \<up> n))"}.
  Using Prop.~\ref{langprops}@{text "(iii)"} this is equal to @{term "B ;; A\<star>"}, as we needed to show.\qed
  \end{proof}

  \noindent
  Regular expressions are defined as the inductive datatype

  \begin{center}
  @{text r} @{text "::="}
  @{term NULL}\hspace{1.5mm}@{text"|"}\hspace{1.5mm}
  @{term EMPTY}\hspace{1.5mm}@{text"|"}\hspace{1.5mm}
  @{term "CHAR c"}\hspace{1.5mm}@{text"|"}\hspace{1.5mm}
  @{term "SEQ r r"}\hspace{1.5mm}@{text"|"}\hspace{1.5mm}
  @{term "ALT r r"}\hspace{1.5mm}@{text"|"}\hspace{1.5mm}
  @{term "STAR r"}
  \end{center}

  \noindent
  and the language matched by a regular expression is defined as

  \begin{center}
  \begin{tabular}{c@ {\hspace{10mm}}c}
  \begin{tabular}{rcl}
  @{thm (lhs) L_rexp.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) L_rexp.simps(1)}\\
  @{thm (lhs) L_rexp.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) L_rexp.simps(2)}\\
  @{thm (lhs) L_rexp.simps(3)[where c="c"]} & @{text "\<equiv>"} & @{thm (rhs) L_rexp.simps(3)[where c="c"]}\\
  \end{tabular}
  &
  \begin{tabular}{rcl}
  @{thm (lhs) L_rexp.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]} & @{text "\<equiv>"} & @{thm (rhs) L_rexp.simps(4)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\
  @{thm (lhs) L_rexp.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]} & @{text "\<equiv>"} & @{thm (rhs) L_rexp.simps(5)[where ?r1.0="r\<^isub>1" and ?r2.0="r\<^isub>2"]}\\
  @{thm (lhs) L_rexp.simps(6)[where r="r"]} & @{text "\<equiv>"} & @{thm (rhs) L_rexp.simps(6)[where r="r"]}\\
  \end{tabular}
  \end{tabular}
  \end{center}

  Given a finite set of regular expressions @{text rs}, we will make use of the operation of generating a single regular expression that matches the union of the languages of @{text rs}. We only need to know the existence of such a regular expression and therefore we use Isabelle/HOL's @{const "fold_graph"} and Hilbert's @{text "\<epsilon>"} to define @{term "\<Uplus>rs"}. This operation, roughly speaking, folds @{const ALT} over the set @{text rs} with @{const NULL} for the empty set. We can prove that for a finite set @{text rs}

  \begin{center}
  @{thm (lhs) folds_alt_simp} @{text "= \<Union> (\<calL> ` rs)"}
  \end{center}

  \noindent
  holds, where @{text "\<calL> ` rs"} stands for the image of the set @{text rs} under the function @{text "\<calL>"}.
*}

section {* The Myhill-Nerode Theorem, First Part *}

text {*
  The key definition in the Myhill-Nerode theorem is the \emph{Myhill-Nerode relation}, which states that two strings are related w.r.t.~a language, provided there is no distinguishing extension in this language. This can be defined as:

  \begin{definition}[Myhill-Nerode Relation]\mbox{}\\
  @{thm str_eq_def[simplified str_eq_rel_def Pair_Collect]}
  \end{definition}

  \noindent
  It is easy to see that @{term "\<approx>A"} is an equivalence relation, which partitions the set of all strings, @{text "UNIV"}, into a set of disjoint equivalence classes. To illustrate this quotient construction, let us give an example: consider the regular language containing just the string @{text "[c]"}. The relation @{term "\<approx>({[c]})"} partitions @{text UNIV} into three equivalence classes @{text "X\<^isub>1"}, @{text "X\<^isub>2"} and @{text "X\<^isub>3"} as follows

  \begin{center}
  @{text "X\<^isub>1 = {[]}"}\hspace{5mm}
  @{text "X\<^isub>2 = {[c]}"}\hspace{5mm}
  @{text "X\<^isub>3 = UNIV - {[], [c]}"}
  \end{center}

  One direction of the Myhill-Nerode theorem establishes that if there are finitely many equivalence classes, as in the example above, then the language is regular.
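  The Myhill-Nerode relation can also be phrased as a test that can be run on concrete examples. The following Haskell fragment is only an illustration with hypothetical names and is not part of the formalisation: since the definition above quantifies over \emph{all} extensions @{text z}, the test below necessarily checks a finite sample of extensions and is therefore merely an approximation.

\begin{verbatim}
-- x and y are related w.r.t. the language inA if no sampled extension z
-- distinguishes them; the true relation quantifies over all strings z
related :: (String -> Bool) -> [String] -> String -> String -> Bool
related inA exts x y = all (\z -> inA (x ++ z) == inA (y ++ z)) exts
\end{verbatim}

  \noindent
  For the language containing just @{text "[c]"}, encoded as a predicate that tests equality with this one string, the empty extension already separates @{text "X\<^isub>1"} from @{text "X\<^isub>2"}, whereas no extension whatsoever separates two strings taken from @{text "X\<^isub>3"}.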
  In our setting, this direction of the Myhill-Nerode theorem amounts to showing:

  \begin{theorem}\label{myhillnerodeone}
  @{thm[mode=IfThen] Myhill_Nerode1}
  \end{theorem}

  \noindent
  To prove this theorem, we first define the set @{term "finals A"} as those equivalence classes from @{term "UNIV // \<approx>A"} that contain strings of @{text A}, namely
  %
  \begin{equation}
  @{thm finals_def}
  \end{equation}

  \noindent
  In our running example, @{text "X\<^isub>2"} is the only equivalence class in @{term "finals {[c]}"}. It is straightforward to show that in general @{thm lang_is_union_of_finals} and @{thm finals_in_partitions} hold. Therefore if we know that there exists a regular expression for every equivalence class in \mbox{@{term "finals A"}} (which by assumption must be a finite set), then we can use @{text "\<bigplus>"} to obtain a regular expression that matches every string in @{text A}.

  Our proof of Thm.~\ref{myhillnerodeone} relies on a method that can calculate a regular expression for \emph{every} equivalence class, not just the ones in @{term "finals A"}. We first define the notion of \emph{one-character-transition} between two equivalence classes
  %
  \begin{equation}
  @{thm transition_def}
  \end{equation}

  \noindent
  which means that if we concatenate the character @{text c} to the end of all strings in the equivalence class @{text Y}, we obtain a subset of @{text X}. Note that we do not define an automaton here; we merely relate two sets (with respect to a character). In our concrete example we have @{term "X\<^isub>1 \<Turnstile>c\<Rightarrow> X\<^isub>2"}, @{term "X\<^isub>1 \<Turnstile>d\<Rightarrow> X\<^isub>3"} with @{text d} being any other character than @{text c}, and @{term "X\<^isub>3 \<Turnstile>d\<Rightarrow> X\<^isub>3"} for any @{text d}.

  Next we build an \emph{initial} equational system that contains an equation for each equivalence class. Suppose we have the equivalence classes @{text "X\<^isub>1,\<dots>,X\<^isub>n"}; there must be one and only one that contains the empty string @{text "[]"} (since equivalence classes are disjoint). Let us assume @{text "[] \<in> X\<^isub>1"}. We build the following equational system

  \begin{center}
  \begin{tabular}{rcl}
  @{text "X\<^isub>1"} & @{text "="} & @{text "(Y\<^isub>1\<^isub>1, CHAR c\<^isub>1\<^isub>1) + \<dots> + (Y\<^isub>1\<^isub>p, CHAR c\<^isub>1\<^isub>p) + \<lambda>(EMPTY)"} \\
  @{text "X\<^isub>2"} & @{text "="} & @{text "(Y\<^isub>2\<^isub>1, CHAR c\<^isub>2\<^isub>1) + \<dots> + (Y\<^isub>2\<^isub>o, CHAR c\<^isub>2\<^isub>o)"} \\
  & $\vdots$ \\
  @{text "X\<^isub>n"} & @{text "="} & @{text "(Y\<^isub>n\<^isub>1, CHAR c\<^isub>n\<^isub>1) + \<dots> + (Y\<^isub>n\<^isub>q, CHAR c\<^isub>n\<^isub>q)"}\\
  \end{tabular}
  \end{center}

  \noindent
  where the terms @{text "(Y\<^isub>i\<^isub>j, CHAR c\<^isub>i\<^isub>j)"} stand for all transitions @{term "Y\<^isub>i\<^isub>j \<Turnstile>c\<^isub>i\<^isub>j\<Rightarrow> X\<^isub>i"}. There can only be finitely many such terms in a right-hand side since there are only finitely many equivalence classes and only finitely many characters. The term @{text "\<lambda>(EMPTY)"} in the first equation acts as a marker for the equivalence class containing @{text "[]"}.\footnote{Note that we mark, roughly speaking, the single ``initial'' state in the equational system, which is different from the method by Brzozowski \cite{Brzozowski64}, where he marks the ``terminal'' states. We are forced to set up the equational system in our way, because the Myhill-Nerode relation determines the ``direction'' of the transitions.
  The successor ``state'' of an equivalence class @{text Y} can be reached by adding characters to the end of @{text Y}. This is also the reason why we have to use our reverse version of Arden's lemma.}
  Overloading the function @{text \<calL>} for the two kinds of terms in the equational system, we have

  \begin{center}
  @{text "\<calL>(Y, r) \<equiv>"} %
  @{thm (rhs) L_rhs_item.simps(2)[where X="Y" and r="r", THEN eq_reflection]}\hspace{10mm}
  @{thm L_rhs_item.simps(1)[where r="r", THEN eq_reflection]}
  \end{center}

  \noindent
  and we can prove for @{text "X\<^isub>2\<^isub>.\<^isub>.\<^isub>n"} that the following equations
  %
  \begin{equation}\label{inv1}
  @{text "X\<^isub>i = \<calL>(Y\<^isub>i\<^isub>1, CHAR c\<^isub>i\<^isub>1) \<union> \<dots> \<union> \<calL>(Y\<^isub>i\<^isub>q, CHAR c\<^isub>i\<^isub>q)"}
  \end{equation}

  \noindent
  hold. Similarly for @{text "X\<^isub>1"} we can show that the following equation
  %
  \begin{equation}\label{inv2}
  @{text "X\<^isub>1 = \<calL>(Y\<^isub>1\<^isub>1, CHAR c\<^isub>1\<^isub>1) \<union> \<dots> \<union> \<calL>(Y\<^isub>1\<^isub>p, CHAR c\<^isub>1\<^isub>p) \<union> \<calL>(\<lambda>(EMPTY))"}
  \end{equation}

  \noindent
  holds. The reason for adding the @{text \<lambda>}-marker to our initial equational system is to obtain this equation: it only holds with the marker, since none of the other terms contain the empty string.

  Our representation of the equations in Isabelle/HOL uses pairs, where the first component is an equivalence class and the second component is a set of terms. Given a set of equivalence classes @{text CS}, our initial equational system @{term "Init CS"} is thus formally defined as
  %
  \begin{equation}\label{initcs}
  \mbox{\begin{tabular}{rcl}
  @{thm (lhs) Init_rhs_def} & @{text "\<equiv>"} & @{text "if"}~@{term "[] \<in> X"}\\
  & & @{text "then"}~@{term "{Trn Y (CHAR c) | Y c. Y \<in> CS \<and> Y \<Turnstile>c\<Rightarrow> X} \<union> {Lam EMPTY}"}\\
  & & @{text "else"}~@{term "{Trn Y (CHAR c)| Y c. Y \<in> CS \<and> Y \<Turnstile>c\<Rightarrow> X}"}\\
  @{thm (lhs) Init_def} & @{text "\<equiv>"} & @{thm (rhs) Init_def}
  \end{tabular}}
  \end{equation}

  \noindent
  Because we use sets of terms for representing the right-hand sides of equations, we can prove \eqref{inv1} and \eqref{inv2} more concisely as
  %
  \begin{lemma}\label{inv}
  If @{thm (prem 1) test} then @{text "X = \<Union> \<calL> ` rhs"}.
  \end{lemma}

  \noindent
  Our proof of Thm.~\ref{myhillnerodeone} will proceed by transforming the initial equational system into one in \emph{solved form}, maintaining the invariant in Lemma \ref{inv}. From the solved form we will be able to read off the regular expressions.

  In order to transform an equational system into solved form, we have two operations: one that takes an equation of the form @{text "X = rhs"} and removes the recursive occurrences of @{text X} in the @{text rhs} using our variant of Arden's Lemma. The other operation takes an equation @{text "X = rhs"} and substitutes @{text X} throughout the rest of the equational system, adjusting the remaining regular expressions appropriately. To define this adjustment we define the \emph{append-operation}

  \begin{center}
  @{thm append_rexp.simps(2)[where X="Y" and r="r\<^isub>1" and rexp="r\<^isub>2", THEN eq_reflection]}\hspace{10mm}
  @{thm append_rexp.simps(1)[where r="r\<^isub>1" and rexp="r\<^isub>2", THEN eq_reflection]}
  \end{center}

  \noindent
  which we also lift to entire right-hand sides of equations, written as @{thm (lhs) append_rhs_rexp_def[where rexp="r"]}.
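  To make the shape of these definitions more concrete, here is a small Haskell sketch of the terms and of the append-operation. The names are hypothetical, the Isabelle definitions work on sets of terms rather than lists, and we assume, as required by the Arden construction below, that the appended expression is sequenced on the right.

\begin{verbatim}
data Rexp = NULL | EMPTY | CHAR Char
          | SEQ Rexp Rexp | ALT Rexp Rexp | STAR Rexp

-- a term is either a lambda-marker or a transition paired with an expression
data Trm cls = Lam Rexp | Trn cls Rexp

-- append a regular expression to the expression carried by one term
appendTrm :: Rexp -> Trm cls -> Trm cls
appendTrm r (Lam r')   = Lam (SEQ r' r)
appendTrm r (Trn y r') = Trn y (SEQ r' r)

-- lifted to an entire right-hand side (here simply a list of terms)
appendRhs :: Rexp -> [Trm cls] -> [Trm cls]
appendRhs r = map (appendTrm r)
\end{verbatim}

  \noindent
  The Isabelle definitions of the two operations described next are then manipulations of such sets of terms.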
  Using the append-operation, we can define the \emph{arden-operation} for an equation of the form @{text "X = rhs"} as:

  \begin{center}
  \begin{tabular}{rc@ {\hspace{2mm}}r@ {\hspace{1mm}}l}
  @{thm (lhs) Arden_def} & @{text "\<equiv>"}~~\mbox{} & \multicolumn{2}{@ {\hspace{-2mm}}l}{@{text "let"}}\\
  & & @{text "rhs' ="} & @{term "rhs - {Trn X r | r. Trn X r \<in> rhs}"} \\
  & & @{text "r' ="} & @{term "STAR (\<Uplus> {r. Trn X r \<in> rhs})"}\\
  & & \multicolumn{2}{@ {\hspace{-2mm}}l}{@{text "in"}~~@{term "append_rhs_rexp rhs' r'"}}\\
  \end{tabular}
  \end{center}

  \noindent
  In this definition, we first delete all terms of the form @{text "(X, r)"} from @{text rhs}; then we calculate the combined regular expression for all @{text r} coming from the deleted @{text "(X, r)"}, and take the @{const STAR} of it; finally we append this regular expression to @{text rhs'}. It can easily be seen that this operation mimics Arden's lemma on the level of equations. The \emph{substitution-operation} takes an equation of the form @{text "X = xrhs"} and substitutes it into the right-hand side @{text rhs}.

  \begin{center}
  \begin{tabular}{rc@ {\hspace{2mm}}r@ {\hspace{1mm}}l}
  @{thm (lhs) Subst_def} & @{text "\<equiv>"}~~\mbox{} & \multicolumn{2}{@ {\hspace{-2mm}}l}{@{text "let"}}\\
  & & @{text "rhs' ="} & @{term "rhs - {Trn X r | r. Trn X r \<in> rhs}"} \\
  & & @{text "r' ="} & @{term "\<Uplus> {r. Trn X r \<in> rhs}"}\\
  & & \multicolumn{2}{@ {\hspace{-2mm}}l}{@{text "in"}~~@{term "rhs' \<union> append_rhs_rexp xrhs r'"}}\\
  \end{tabular}
  \end{center}

  \noindent
  We again first delete all occurrences of @{text "(X, r)"} in @{text rhs}; we then calculate the regular expression corresponding to the deleted terms; finally we append this regular expression to @{text "xrhs"} and take the union with @{text rhs'}. When we use the substitution operation, we will arrange it so that @{text "xrhs"} does not contain any occurrence of @{text X}.

  With these two operations in place, we can define the operation that removes one equation from an equational system @{text ES}. The operation @{const Subst_all} substitutes an equation @{text "X = xrhs"} throughout an equational system @{text ES}; @{const Remove} then completely removes such an equation from @{text ES} by substituting it into the rest of the equational system, after first eliminating all recursive occurrences of @{text X} by applying @{const Arden} to @{text "xrhs"}.

  \begin{center}
  \begin{tabular}{rcl}
  @{thm (lhs) Subst_all_def} & @{text "\<equiv>"} & @{thm (rhs) Subst_all_def}\\
  @{thm (lhs) Remove_def} & @{text "\<equiv>"} & @{thm (rhs) Remove_def}
  \end{tabular}
  \end{center}

  \noindent
  Finally, we can define how an equational system should be solved. For this we will need to iterate the elimination of an equation until only one equation is left in the system. However, we do not want just any equation to be the last one, but the one for which we want to calculate the regular expression. Therefore we define the iteration step so that it chooses an equation with an equivalence class that is not @{text X}. This allows us to control which equation will be the last. We use Hilbert's choice operator, written @{text SOME}, to choose an equation to be eliminated in @{text ES}.

  \begin{center}
  \begin{tabular}{rc@ {\hspace{4mm}}r@ {\hspace{1mm}}l}
  @{thm (lhs) Iter_def} & @{text "\<equiv>"}~~\mbox{} & \multicolumn{2}{@ {\hspace{-4mm}}l}{@{text "let"}}\\
  & & @{text "(Y, yrhs) ="} & @{term "SOME (Y, yrhs). (Y, yrhs) \<in> ES \<and> X \<noteq> Y"} \\
  & & \multicolumn{2}{@ {\hspace{-4mm}}l}{@{text "in"}~~@{term "Remove ES Y yrhs"}}\\
  \end{tabular}
  \end{center}

  \noindent
  The last definition we need applies @{term Iter} over and over again until a condition @{text COND} is \emph{not} satisfied anymore. The condition states that there is more than one equation left in the equational system @{text ES}. For this we use Isabelle/HOL's @{text while}-operator as follows:

  \begin{center}
  @{thm Solve_def}
  \end{center}

  \noindent
  We are not concerned here with the definition of this operator (see \cite{BerghoferNipkow00}), but note that we eliminate in each @{const Iter}-step a single equation, and therefore have a well-founded termination order by taking the cardinality of the equational system @{text ES}. This enables us to prove properties about our definition of @{const Solve} when we ``call'' it with the equivalence class @{text X} and the initial equational system @{term "Init (UNIV // \<approx>A)"} from \eqref{initcs}:

  \begin{center}
  \begin{tabular}{l}
  @{term "invariant (Init (UNIV // \<approx>A))"} \\
  @{term "\<forall>ES. invariant ES \<and> Cond ES \<longrightarrow> invariant (Iter X ES)"}\\
  @{term "\<forall>ES. invariant ES \<and> Cond ES \<longrightarrow> card (Iter X ES) < card ES"}\\
  @{term "\<forall>ES. invariant ES \<and> \<not> Cond ES \<longrightarrow> P ES"}\\
  \hline
  \multicolumn{1}{c}{@{term "P (Solve X (Init (UNIV // \<approx>A)))"}}
  \end{tabular}
  \end{center}

  \noindent
  This principle states that given an invariant (which we will specify below) we can prove a property @{text "P"} involving @{const Solve}. For this we have to discharge the following proof obligations: first, that the initial equational system satisfies the invariant; second, that the iteration step @{text "Iter"} preserves the invariant as long as the condition @{term Cond} holds; third, that @{text "Iter"} decreases the termination order; and fourth, that once the condition does not hold anymore, the property @{text P} must hold.

  The property @{term P} in our proof will state that @{term "Solve X (Init (UNIV // \<approx>A))"} returns with a single equation @{text "X = xrhs"} for some @{text "xrhs"}, and that this equational system still satisfies the invariant. In order to get the proof through, the invariant is composed of the following six properties:

  \begin{center}
  \begin{tabular}{@ {}rcl@ {\hspace{-13mm}}l @ {}}
  @{text "invariant ES"} & @{text "\<equiv>"} & @{term "finite ES"} & @{text "(finiteness)"}\\
  & @{text "\<and>"} & @{thm (rhs) finite_rhs_def} & @{text "(finiteness rhs)"}\\
  & @{text "\<and>"} & @{text "\<forall>(X, rhs)\<in>ES. X = \<Union>\<calL> ` rhs"} & @{text "(soundness)"}\\
  & @{text "\<and>"} & @{thm (rhs) distinct_equas_def}\\
  & & & @{text "(distinctness)"}\\
  & @{text "\<and>"} & @{thm (rhs) ardenable_def} & @{text "(ardenable)"}\\
  & @{text "\<and>"} & @{thm (rhs) valid_eqs_def} & @{text "(validity)"}\\
  \end{tabular}
  \end{center}

  \noindent
  The first two ensure that the equational system is always finite (in the number of equations and the number of terms in each equation); \ldots

  It is straightforward to prove that the initial equational system satisfies the invariant.

  \begin{lemma}
  @{thm[mode=IfThen] Init_ES_satisfies_invariant}
  \end{lemma}

  \begin{proof}
  Finiteness is given by the assumption and the way we set up the initial equational system. Soundness is proved in Lem.~\ref{inv}. Distinctness follows from the fact that the equivalence classes are disjoint.
  The ardenable property also follows from the setup of the equational system, as does validity.\qed
  \end{proof}

  \begin{lemma}
  @{thm[mode=IfThen] iteration_step_invariant[where xrhs="rhs"]}
  \end{lemma}

  \begin{lemma}
  @{thm[mode=IfThen] iteration_step_measure[simplified (no_asm), where xrhs="rhs"]}
  \end{lemma}

  \begin{proof}
  By assumption we know that @{text "ES"} is finite and has more than one element. Therefore there must be an element @{term "(Y, yrhs) \<in> ES"} with @{term "(Y, yrhs) \<noteq> (X, rhs)"}. Using the distinctness property we can infer that @{term "Y \<noteq> X"}. We further know that @{text "Remove ES Y yrhs"} removes the equation @{text "Y = yrhs"} from the system, and therefore the cardinality of @{const Iter} strictly decreases.\qed
  \end{proof}

  \begin{lemma}
  If @{thm (prem 1) Solve} and @{thm (prem 2) Solve} then there exists a @{text rhs} such that @{term "Solve X (Init (UNIV // \<approx>A)) = {(X, rhs)}"} and @{term "invariant {(X, rhs)}"}.
  \end{lemma}

  \noindent
  With this lemma in place we can show that for every equivalence class in @{term "UNIV // \<approx>A"} there exists a regular expression.

  \begin{lemma}\label{every_eqcl_has_reg}
  @{thm[mode=IfThen] every_eqcl_has_reg}
  \end{lemma}

  \begin{proof}
  By the preceding lemma, we know that there exists a @{text "rhs"} such that @{term "Solve X (Init (UNIV // \<approx>A))"} returns the equation @{text "X = rhs"}, and that the invariant holds for this equation. That means we know @{text "X = \<Union>\<calL> ` rhs"}. We further know that this is equal to \mbox{@{text "\<Union>\<calL> ` (Arden X rhs)"}} using the properties in the invariant and Lem.~???. Using the validity property for the equation @{text "X = rhs"}, we can infer that @{term "rhss rhs \<subseteq> {X}"} and, because the Arden operation removes @{text X} from @{text rhs}, that @{term "rhss (Arden X rhs) = {}"}. That means @{term "Arden X rhs"} can only consist of terms of the form @{term "Lam r"}. So we can collect those (finitely many) regular expressions and have @{term "X = L (\<Uplus>rs)"}. With this we can conclude the proof.\qed
  \end{proof}

  \noindent
  Lem.~\ref{every_eqcl_has_reg} allows us to finally give a proof for the first direction of the Myhill-Nerode theorem.

  \begin{proof}[of Thm.~\ref{myhillnerodeone}]
  By Lem.~\ref{every_eqcl_has_reg} we know that there exists a regular expression for every equivalence class in @{term "UNIV // \<approx>A"}. Since @{text "finals A"} is a subset of @{term "UNIV // \<approx>A"}, we also know that for every equivalence class in @{term "finals A"} there exists a regular expression. Moreover, by assumption we know that @{term "finals A"} must be finite, and therefore there must be a finite set of regular expressions @{text "rs"} such that

  \begin{center}
  @{term "\<Union>(finals A) = L (\<Uplus>rs)"}
  \end{center}

  \noindent
  Since the left-hand side is equal to @{text A}, we can use @{term "\<Uplus>rs"} as the regular expression that solves the goal.\qed
  \end{proof}
*}

section {* Myhill-Nerode, Second Part *}

text {*
  \begin{theorem}
  If @{text "r"} is a regular expression, then @{thm rexp_imp_finite}.
  \end{theorem}

  \begin{proof}
  By induction on the structure of @{text r}.
  The cases for @{const NULL}, @{const EMPTY} and @{const CHAR} are straightforward, because we can easily establish

  \begin{center}
  \begin{tabular}{l}
  @{thm quot_null_eq}\\
  @{thm quot_empty_subset}\\
  @{thm quot_char_subset}
  \end{tabular}
  \end{center}
  \end{proof}
*}

section {* Conclusion and Related Work *}

text {*
  In this paper we took the view that a regular language is one for which there exists a regular expression that matches all of its strings. For us it was important to find out how far we can push this point of view. Having formalised the Myhill-Nerode theorem shows that it can be pushed very far. With the Myhill-Nerode theorem in place, we can formalise many of the textbook results in this subject.
*}

(*<*)
end
(*>*)