| author | urbanc | 
| Wed, 16 Feb 2011 06:51:58 +0000 | |
| changeset 108 | 212bfa431fa5 | 
| parent 99 | 54aa3b6dd71c | 
| permissions | -rw-r--r-- | 
| 31 
b6815473ee2e
1. Myhill.thy is proved to correct some typos. 2. Some sections are added to the first direction. 3. The small theory of list difference is now separated into Prefix_subtract.thy and the ROOT.ML
 zhang parents: 
30diff
changeset | 1 | theory Myhill | 
| 62 
d94209ad2880
Copy Myhill.thy into Myhill_2.thyand put Myhill_2.thy into logic ListP. Myhill.thy is now almost empty ready to be extended by new manuscripts.
 zhang parents: 
57diff
changeset | 2 | imports Myhill_2 | 
| 31 
b6815473ee2e
1. Myhill.thy is proved to correct some typos. 2. Some sections are added to the first direction. 3. The small theory of list difference is now separated into Prefix_subtract.thy and the ROOT.ML
 zhang parents: 
30diff
changeset | 3 | begin | 
| 
b6815473ee2e
1. Myhill.thy is proved to correct some typos. 2. Some sections are added to the first direction. 3. The small theory of list difference is now separated into Prefix_subtract.thy and the ROOT.ML
 zhang parents: 
30diff
changeset | 4 | |
| 99 | 5 | section {* Preliminaries \label{sec_prelim}*}
 | 
| 73 | 6 | |
| 7 | subsection {* Finite automata and \mht  \label{sec_fa_mh} *}
 | |
| 31 
b6815473ee2e
1. Myhill.thy is proved to correct some typos. 2. Some sections are added to the first direction. 3. The small theory of list difference is now separated into Prefix_subtract.thy and the ROOT.ML
 zhang parents: 
30diff
changeset | 8 | |
| 
b6815473ee2e
1. Myhill.thy is proved to correct some typos. 2. Some sections are added to the first direction. 3. The small theory of list difference is now separated into Prefix_subtract.thy and the ROOT.ML
 zhang parents: 
30diff
changeset | 9 | text {*
 | 
| 64 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 10 | |
| 69 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 11 | A {\em determinisitc finite automata (DFA)} $M$ is a 5-tuple 
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 12 | $(Q, \Sigma, \delta, s, F)$, where: | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 13 | \begin{enumerate}
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 14 |   \item $Q$ is a finite set of {\em states}, also denoted $Q_M$.
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 15 |   \item $\Sigma$ is a finite set of {\em alphabets}, also denoted $\Sigma_M$.
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 16 |   \item $\delta$ is a {\em transition function} of type @{text "Q \<times> \<Sigma> \<Rightarrow> Q"} (a total function),
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 17 | also denoted $\delta_M$. | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 18 |   \item @{text "s \<in> Q"} is a state called {\em initial state}, also denoted $s_M$.
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 19 |   \item @{text "F \<subseteq> Q"} is a set of states named {\em accepting states}, also denoted $F_M$.
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 20 | \end{enumerate}
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 21 | Therefore, we have $M = (Q_M, \Sigma_M, \delta_M, s_M, F_M)$. Every DFA $M$ can be interpreted as | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 22 | a function assigning states to strings, denoted $\dfa{M}$, the  definition of which is as the following:
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 23 | \begin{equation}
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 24 | \begin{aligned}
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 25 |   \dfa{M}([]) &\equiv s_M \\
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 26 |    \dfa{M}(xa) &\equiv  \delta_M(\dfa{M}(x), a)
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 27 | \end{aligned}
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 28 | \end{equation}
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 29 | A string @{text "x"} is said to be {\em accepted} (or {\em recognized}) by a DFA $M$ if 
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 30 | $\dfa{M}(x) \in F_M$. The language recoginzed by DFA $M$, denoted
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 31 | $L(M)$, is defined as: | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 32 | \begin{equation}
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 33 |   L(M) \equiv \{x~|~\dfa{M}(x) \in F_M\}
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 34 | \end{equation}
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 35 | The standard way of specifying a laugage $\Lang$ as {\em regular} is by stipulating that:
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 36 | $\Lang = L(M)$ for some DFA $M$. | 
| 64 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 37 | |
| 69 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 38 | For any DFA $M$, the DFA obtained by changing initial state to another $p \in Q_M$ is denoted $M_p$, | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 39 | which is defined as: | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 40 | \begin{equation}
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 41 | M_p \ \equiv\ (Q_M, \Sigma_M, \delta_M, p, F_M) | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 42 | \end{equation}
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 43 | Two states $p, q \in Q_M$ are said to be {\em equivalent}, denoted $p \approx_M q$, iff.
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 44 | \begin{equation}\label{m_eq_def}
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 45 | L(M_p) = L(M_q) | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 46 | \end{equation}
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 47 | It is obvious that $\approx_M$ is an equivalent relation over $Q_M$. and | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 48 | the partition induced by $\approx_M$ has $|Q_M|$ equivalent classes. | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 49 | By overloading $\approx_M$, an equivalent relation over strings can be defined: | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 50 | \begin{equation}
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 51 |   x \approx_M y ~ ~ \equiv ~ ~ \dfa{M}(x) \approx_M \dfa{M}(y)
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 52 | \end{equation}
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 53 | It can be proved that the the partition induced by $\approx_M$ also has $|Q_M|$ equivalent classes. | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 54 | It is also easy to show that: if $x \approx_M y$, then $x \approx_{L(M)} y$, and this means
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 55 | $\approx_M$ is a more refined equivalent relation than $\approx_{L(M)}$. Since partition induced by
 | 
| 73 | 56 | $\approx_M$ is finite, the one induced by $\approx_{L(M)}$ must also be finite, and this is 
 | 
| 57 | one of the two directions of \mht: | |
| 58 | \begin{Lem}[\mht , Direction two]
 | |
| 69 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 59 | If a language $\Lang$ is regular (i.e. $\Lang = L(M)$ for some DFA $M$), then | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 60 | the partition induced by $\approx_\Lang$ is finite. | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 61 | \end{Lem}
 | 
| 68 
dceb84acdee8
Check in Myhill.thy before trying another way to explain DFA
 zhang parents: 
64diff
changeset | 62 | |
| 69 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 63 | The other direction is: | 
| 73 | 64 | \begin{Lem}[\mht , Direction one]\label{auto_mh_d1}
 | 
| 69 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 65 | If the partition induced by $\approx_\Lang$ is finite, then | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 66 | $\Lang$ is regular (i.e. $\Lang = L(M)$ for some DFA $M$). | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 67 | \end{Lem}
 | 
| 73 | 68 | The $M$ we are seeking when prove lemma \ref{auto_mh_d2} can be constructed out of $\approx_\Lang$,
 | 
| 69 | denoted $M_\Lang$ and defined as the following: | |
| 69 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 70 | \begin{subequations}
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 71 | \begin{eqnarray}
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 72 |   Q_{M_\Lang} ~ & \equiv & ~ \{ \cls{x}{\approx_\Lang}~|~ x \in \Sigma^* \}\\
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 73 |   \Sigma_{M_\Lang} ~ & \equiv & ~ \Sigma_M \\
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 74 |   \delta_{M_\Lang} ~ & \equiv & ~ \left (\lambda (\cls{x}{\approx_\Lang}, a).  \cls{xa}{\approx_\Lang} \right) \\
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 75 |   s_{M_\Lang} ~ & \equiv & ~ \cls{[]}{\approx_\Lang} \\
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 76 |   F_{M_\Lang} ~ & \equiv & ~ \{ \cls{x}{\approx_\Lang}~|~ x \in \Lang \}
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 77 | \end{eqnarray}
 | 
| 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 78 | \end{subequations}
 | 
| 73 | 79 | It can be proved that $Q_{M_\Lang}$ is indeed finite and $\Lang = L(M_\Lang)$, so lemma \ref{auto_mh_d1} holds.
 | 
| 80 | It can also be proved that $M_\Lang$ is the minimal DFA (therefore unique) which recoginzes $\Lang$. | |
| 81 | ||
| 82 | ||
| 68 
dceb84acdee8
Check in Myhill.thy before trying another way to explain DFA
 zhang parents: 
64diff
changeset | 83 | |
| 69 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 84 | *} | 
| 68 
dceb84acdee8
Check in Myhill.thy before trying another way to explain DFA
 zhang parents: 
64diff
changeset | 85 | |
| 73 | 86 | subsection {* The objective and the underlying intuition *}
 | 
| 87 | ||
| 69 
ecf6c61a4541
A formal presentation of automaton and Myhill-Nerrode theorem added to serve as a basis for further discussion
 zhang parents: 
68diff
changeset | 88 | text {*
 | 
| 73 | 89 |   It is now obvious from section \ref{sec_fa_mh} that \mht\ can be established easily when
 | 
| 90 |   {\em reglar languages} are defined as ones recognized by finite automata. 
 | |
| 91 | Under the context where the use of finite automata is forbiden, the situation is quite different. | |
| 92 | The theorem now has to be expressed as: | |
| 93 |   \begin{Thm}[\mht , Regular expression version]
 | |
| 94 |       A language $\Lang$ is regular (i.e. $\Lang = L(\re)$ for some {\em regular expression} $\re$)
 | |
| 95 | iff. the partition induced by $\approx_\Lang$ is finite. | |
| 96 |   \end{Thm}
 | |
| 97 | The proof of this version consists of two directions (if the use of automata are not allowed): | |
| 98 |   \begin{description}
 | |
| 99 | \item [Direction one:] | |
| 100 | generating a regular expression $\re$ out of the finite partition induced by $\approx_\Lang$, | |
| 101 | such that $\Lang = L(\re)$. | |
| 102 | \item [Direction two:] | |
| 103 | showing the finiteness of the partition induced by $\approx_\Lang$, under the assmption | |
| 104 | that $\Lang$ is recognized by some regular expression $\re$ (i.e. $\Lang = L(\re)$). | |
| 105 |   \end{description}
 | |
| 106 | The development of these two directions consititutes the body of this paper. | |
| 64 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 107 | |
| 73 | 108 | *} | 
| 109 | ||
| 110 | section {* Direction @{text "regular language \<Rightarrow>finite partition"} *}
 | |
| 111 | ||
| 112 | text {*
 | |
| 113 | Although not used explicitly, the notion of finite autotmata and its relationship with | |
| 114 |   language partition, as outlined in section \ref{sec_fa_mh}, still servers as important intuitive 
 | |
| 115 | guides in the development of this paper. | |
| 116 |   For example, {\em Direction one} follows the {\em Brzozowski algebraic method}
 | |
| 117 | used to convert finite autotmata to regular expressions, under the intuition that every | |
| 118 |   partition member $\cls{x}{\approx_\Lang}$ is a state in the DFA $M_\Lang$  constructed to prove 
 | |
| 119 |   lemma \ref{auto_mh_d1} of section \ref{sec_fa_mh}.
 | |
| 120 | ||
| 78 | 121 | The basic idea of Brzozowski method is to extract an equational system out of the | 
| 122 | transition relationship of the automaton in question. In the equational system, every | |
| 123 | automaton state is represented by an unknown, the solution of which is expected to be | |
| 124 | a regular expresion characterizing the state in a certain sense. There are two choices of | |
| 125 | how a automaton state can be characterized. The first is to characterize by the set of | |
| 126 | strings leading from the state in question into accepting states. | |
| 127 | The other choice is to characterize by the set of strings leading from initial state | |
| 128 | into the state in question. For the second choice, the language recognized the automaton | |
| 129 | can be characterized by the solution of initial state, while for the second choice, | |
| 130 | the language recoginzed by the automaton can be characterized by | |
| 131 |   combining solutions of all accepting states by @{text "+"}. Because of the automaton 
 | |
| 132 | used as our intuitive guide, the $M_\Lang$, the states of which are | |
| 133 | sets of strings leading from initial state, the second choice is used in this paper. | |
| 73 | 134 | |
| 78 | 135 |   Supposing the automaton in Fig \ref{fig_auto_part_rel} is the $M_\Lang$ for some language $\Lang$, 
 | 
| 136 |   and suppose $\Sigma = \{a, b, c, d, e\}$. Under the second choice, the equational system extracted is:
 | |
| 137 |   \begin{subequations}
 | |
| 138 |   \begin{eqnarray}
 | |
| 139 |     X_0 & = & X_1 \cdot c + X_2 \cdot d + \lambda \label{x_0_def_o} \\
 | |
| 140 |     X_1 & = & X_0 \cdot a + X_1 \cdot b + X_2 \cdot d \label{x_1_def_o} \\
 | |
| 141 | X_2 & = & X_0 \cdot b + X_1 \cdot d + X_2 \cdot a \\ | |
| 142 |     X_3 & = & \begin{aligned}
 | |
| 143 | & X_0 \cdot (c + d + e) + X_1 \cdot (a + e) + X_2 \cdot (b + e) + \\ | |
| 144 | & X_3 \cdot (a + b + c + d + e) | |
| 145 |               \end{aligned}
 | |
| 146 |   \end{eqnarray}
 | |
| 147 |   \end{subequations} 
 | |
| 148 | ||
| 64 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 149 | \begin{figure}[h!]
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 150 | \centering | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 151 | \begin{minipage}{0.5\textwidth}
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 152 | \scalebox{0.8}{
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 153 | \begin{tikzpicture}[ultra thick,>=latex]
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 154 |   \node[draw,circle,initial] (start) {$X_0$};
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 155 |   \node[draw,circle,accepting] at ($(start) + (3.5cm,2cm)$) (ac1) {$X_1$};
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 156 |   \node[draw,circle,accepting] at ($(start) + (3.5cm,-2cm)$) (ac2) {$X_2$};
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 157 |   \node[draw,circle] at ($(start) + (6.5cm,0cm)$) (ab) {$X_3$};
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 158 | |
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 159 |   \path[->] (start) edge [bend left] node [midway, above] {$a$} (ac1);
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 160 |   \path[->] (start) edge [bend right] node [midway, below] {$b$} (ac2);
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 161 |   \path[->] (ac1) edge [loop above] node [midway, above] {$b$} (ac1);
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 162 |   \path[->] (ac2) edge [loop below] node [midway, below] {$a$} (ac2);
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 163 |   \path[->] (ac1) edge [bend right] node [midway, left] {$c$} (ac2);
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 164 |   \path[->] (ac2) edge [bend right] node [midway, right] {$c$} (ac1);
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 165 |   \path[->] (ac1) edge node [midway, sloped, above] {$d$} (start);
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 166 |   \path[->] (ac2) edge node [midway, sloped, above] {$d$} (start);
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 167 | |
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 168 | \path [draw, rounded corners,->,dashed] (start) -- ($(start) + (0cm, 3.7cm)$) | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 169 |          -- ($(ab) + (0cm, 3.7cm)$)  node[midway,above,sloped]{$\Sigma - \{a, b\}$} -- (ab);
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 170 |   \path[->,dashed] (ac1) edge node [midway, above, sloped] {$\Sigma - \{b,c,d\}$} (ab);
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 171 |   \path[->,dashed] (ac2) edge node [midway, below, sloped] {$\Sigma - \{a,c,d\}$} (ab);
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 172 |   \path[->,dashed] (ab) edge [loop right] node [midway, right] {$\Sigma$} (ab);
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 173 | \end{tikzpicture}}
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 174 | \end{minipage}
 | 
| 78 | 175 | \caption{An example automaton}\label{fig_auto_part_rel}
 | 
| 64 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 176 | \end{figure}
 | 
| 
b69d4e04e64a
Added something to Myhill.thy, trying to explain the relationship between finite automata and language partition.
 zhang parents: 
62diff
changeset | 177 | |
| 78 | 178 | Every $\cdot$-item on the right side of equations describes some state transtions, except | 
| 179 |   the $\lambda$ in \eqref{x_0_def_o}, which represents empty string @{text "[]"}. 
 | |
| 180 | The reason is that: every state is characterized by the | |
| 181 | set of incoming strings leading from initial state. For non-initial state, every such | |
| 182 | string can be splitted into a prefix leading into a preceding state and a single character suffix | |
| 183 | transiting into from the preceding state. The exception happens at | |
| 184 | initial state, where the empty string is a incoming string which can not be splitted. The $\lambda$ | |
| 185 |   in \eqref{x_0_def_o} is introduce to repsent this indivisible string. There is one and only one
 | |
| 186 | $\lambda$ in every equational system such obtained, becasue $[]$ can only be contaied in one | |
| 187 | equivalent class (the intial state in $M_\Lang$) and equivalent classes are disjoint. | |
| 188 | ||
| 189 | Suppose all unknowns ($X_0, X_1, X_2, X_3$) are solvable, the regular expression charactering | |
| 190 | laugnage $\Lang$ is $X_1 + X_2$. This paper gives a procedure | |
| 191 | by which arbitrarily picked unknown can be solved. The basic idea to solve $X_i$ is by | |
| 192 | eliminating all variables other than $X_i$ from the equational system. If | |
| 193 | $X_0$ is the one picked to be solved, variables $X_1, X_2, X_3$ have to be removed one by | |
| 194 | one. The order to remove does not matter as long as the remaing equations are kept valid. | |
| 195 | Suppose $X_1$ is the first one to remove, the action is to replace all occurences of $X_1$ | |
| 196 | in remaining equations by the right hand side of its characterizing equation, i.e. | |
| 197 |   the $ X_0 \cdot a + X_1 \cdot b + X_2 \cdot d$ in \eqref{x_1_def_o}. However, because
 | |
| 198 | of the recursive occurence of $X_1$, this replacement does not really removed $X_1$. Arden's | |
| 199 |   lemma is invoked to transform recursive equations like \eqref{x_1_def_o} into non-recursive 
 | |
| 200 |   ones. For example, the recursive equation \eqref{x_1_def_o} is transformed into the follwing
 | |
| 201 | non-recursive one: | |
| 202 |   \begin{equation}
 | |
| 99 | 203 | X_1 = (X_0 \cdot a + X_2 \cdot d) \cdot b^* = X_0 \cdot (a \cdot b^*) + X_2 \cdot (d \cdot b^*) | 
| 78 | 204 |   \end{equation}
 | 
| 99 | 205 | which, by Arden's lemma, still characterizes $X_1$ correctly. By subsituting | 
| 206 |   $(X_0 \cdot a + X_2 \cdot d) \cdot b^*$ for all $X_1$ and removing  \eqref{x_1_def_o},
 | |
| 78 | 207 | we get: | 
| 208 |   \begin{subequations}
 | |
| 209 |   \begin{eqnarray}
 | |
| 99 | 210 |     X_0 & = & \begin{aligned}
 | 
| 211 | & (X_0 \cdot (a \cdot b^*) + X_2 \cdot (d \cdot b^*)) \cdot c + | |
| 212 | X_2 \cdot d + \lambda = \\ | |
| 213 | & X_0 \cdot (a \cdot b^* \cdot c) + X_2 \cdot (d \cdot b^* \cdot c) + | |
| 214 | X_2 \cdot d + \lambda = \\ | |
| 215 | & X_0 \cdot (a \cdot b^* \cdot c) + X_2 \cdot (d \cdot b^* \cdot c + d) + \lambda | |
| 216 |               \end{aligned} \label{x_0_def_1} \\
 | |
| 217 |     X_2 & = & \begin{aligned}
 | |
| 218 | & X_0 \cdot b + (X_0 \cdot (a \cdot b^*) + X_2 \cdot (d \cdot b^*)) \cdot d + X_2 \cdot a = \\ | |
| 219 | & X_0 \cdot b + X_0 \cdot (a \cdot b^* \cdot d) + X_2 \cdot (d \cdot b^* \cdot d) + X_2 \cdot a = \\ | |
| 220 | & X_0 \cdot (b + a \cdot b^* \cdot d) + X_2 \cdot (d \cdot b^* \cdot d + a) | |
| 221 |               \end{aligned} \\
 | |
| 78 | 222 |     X_3 & = & \begin{aligned}
 | 
| 223 | & X_0 \cdot (c + d + e) + ((X_0 \cdot a + X_2 \cdot d) \cdot b^*) \cdot (a + e)\\ | |
| 99 | 224 |                  & + X_2 \cdot (b + e) + X_3 \cdot (a + b + c + d + e) \label{x_3_def_1}
 | 
| 78 | 225 |               \end{aligned}
 | 
| 226 |   \end{eqnarray}
 | |
| 227 |   \end{subequations}  
 | |
| 99 | 228 | Suppose $X_3$ is the one to remove next, since $X_3$ dose not appear in either $X_0$ or $X_2$, | 
| 229 | the removal of equation \eqref{x_3_def_1} changes nothing in the rest equations. Therefore, we get:
 | |
| 230 |  \begin{subequations}
 | |
| 231 |   \begin{eqnarray}
 | |
| 232 |     X_0 & = & X_0 \cdot (a \cdot b^* \cdot c) + X_2 \cdot (d \cdot b^* \cdot c + d) + \lambda  \label{x_0_def_2} \\
 | |
| 233 |     X_2 & = & X_0 \cdot (b + a \cdot b^* \cdot d) + X_2 \cdot (d \cdot b^* \cdot d + a) \label{x_2_def_2}
 | |
| 234 |   \end{eqnarray}
 | |
| 235 |   \end{subequations}  
 | |
| 236 | Actually, since absorbing state like $X_3$ contributes nothing to the language $\Lang$, it could have been removed | |
| 237 | at the very beginning of this precedure without affecting the final result. Now, the last unknown to remove | |
| 238 | is $X_2$ and the Arden's transformaton of \eqref{x_2_def_2} is:
 | |
| 239 | \begin{equation} \label{x_2_ardened}
 | |
| 240 | X_2 ~ = ~ (X_0 \cdot (b + a \cdot b^* \cdot d)) \cdot (d \cdot b^* \cdot d + a)^* = | |
| 241 | X_0 \cdot ((b + a \cdot b^* \cdot d) \cdot (d \cdot b^* \cdot d + a)^*) | |
| 242 | \end{equation}
 | |
| 243 | By substituting the right hand side of \eqref{x_2_ardened} into \eqref{x_0_def_2}, we get:
 | |
| 244 | \begin{equation}
 | |
| 245 | \begin{aligned}
 | |
| 246 | X_0 & = && X_0 \cdot (a \cdot b^* \cdot c) + \\ | |
| 247 | & && X_0 \cdot ((b + a \cdot b^* \cdot d) \cdot (d \cdot b^* \cdot d + a)^*) \cdot | |
| 248 | (d \cdot b^* \cdot c + d) + \lambda \\ | |
| 249 | & = && X_0 \cdot ((a \cdot b^* \cdot c) + \\ | |
| 250 |        &   && \hspace{3em} ((b + a \cdot b^* \cdot d) \cdot (d \cdot b^* \cdot d + a)^*) \cdot 
 | |
| 251 | (d \cdot b^* \cdot c + d)) + \lambda | |
| 252 | \end{aligned}
 | |
| 253 | \end{equation}
 | |
| 254 | By applying Arden's transformation to this, we get the solution of $X_0$ as: | |
| 255 | \begin{equation}
 | |
| 256 | \begin{aligned}
 | |
| 257 | X_0 = ((a \cdot b^* \cdot c) + | |
| 258 | ((b + a \cdot b^* \cdot d) \cdot (d \cdot b^* \cdot d + a)^*) \cdot | |
| 259 | (d \cdot b^* \cdot c + d))^* | |
| 260 | \end{aligned}
 | |
| 261 | \end{equation}
 | |
| 262 | Using the same method, solutions for $X_1$ and $X_2$ can be obtained as well and the | |
| 263 | regular expressoin for $\Lang$ is just $X_1 + X_2$. The formalization of this procedure | |
| 264 | consititues the first direction of the {\em regular expression} verion of
 | |
| 265 | \mht. Detailed explaination are given in {\bf paper.pdf} and more details and comments
 | |
| 266 | can be found in the formal scripts. | |
| 78 | 267 | *} | 
| 73 | 268 | |
| 99 | 269 | section {* Direction @{text "finite partition \<Rightarrow> regular language"} *}
 | 
| 270 | ||
| 271 | text {*
 | |
| 272 | It is well known in the theory of regular languages that | |
| 273 | the existence of finite language partition amounts to the existence of | |
| 274 |   a minimal automaton, i.e. the $M_\Lang$ constructed in section \ref{sec_prelim}, which recoginzes the 
 | |
| 275 | same language $\Lang$. The standard way to prove the existence of finite language partition | |
| 276 | is to construct a automaton out of the regular expression which recoginzes the same language, from | |
| 277 | which the existence of finite language partition follows immediately. As discussed in the introducton of | |
| 278 |   {\bf paper.pdf} as well as in [5], the problem for this approach happens when automata 
 | |
| 279 | of sub regular expressions are combined to form the automaton of the mother regular expression, | |
| 280 | no matter what kind of representation is used, the formalization is cubersome, as said | |
| 281 |   by Nipkow in [5]: `{\em a more abstract mathod is clearly desirable}'. In this section, 
 | |
| 282 |   an {\em intrinsically abstract} method is given, which completely avoid the mentioning of automata,
 | |
| 283 | let along any particular representations. | |
| 284 | *} | |
| 285 | ||
| 286 | text {*
 | |
| 287 | The main proof structure is a structural induction on regular expressions, | |
| 288 |   where base cases (cases for @{const "NULL"}, @{const "EMPTY"}, @{const "CHAR"}) are quite straightforward to
 | |
| 289 | proof. Real difficulty lies in inductive cases. By inductive hypothesis, languages defined by | |
| 290 | sub-expressions induce finite partitiions. Under such hypothsis, we need to prove that the language | |
| 291 | defined by the composite regular expression gives rise to finite partion. | |
| 292 |   The basic idea is to attach a tag @{text "tag(x)"} to every string 
 | |
| 293 |   @{text "x"}. The tagging fuction @{text "tag"} is carefully devised, which returns tags 
 | |
| 294 | made of equivalent classes of the partitions induced by subexpressoins, and therefore has a finite | |
| 295 |   range. Let @{text "Lang"} be the composite language, it is proved that:
 | |
| 296 |   \begin{quote}
 | |
| 297 |   If strings with the same tag are equivalent with respect to @{text "Lang"}, expressed as:
 | |
| 298 | \[ | |
| 299 |   @{text "tag(x) = tag(y) \<Longrightarrow> x \<approx>Lang y"}
 | |
| 300 | \] | |
| 301 |   then the partition induced by @{text "Lang"} must be finite.
 | |
| 302 |   \end{quote}
 | |
| 303 | There are two arguments for this. The first goes as the following: | |
| 304 |   \begin{enumerate}
 | |
| 305 |     \item First, the tagging function @{text "tag"} induces an equivalent relation @{text "(=tag=)"} 
 | |
| 306 |           (defiintion of @{text "f_eq_rel"} and lemma @{text "equiv_f_eq_rel"}).
 | |
| 307 |     \item It is shown that: if the range of @{text "tag"} (denoted @{text "range(tag)"}) is finite, 
 | |
| 308 |            the partition given rise by @{text "(=tag=)"} is finite (lemma @{text "finite_eq_f_rel"}).
 | |
| 309 | Since tags are made from equivalent classes from component partitions, and the inductive | |
| 310 | hypothesis ensures the finiteness of these partitions, it is not difficult to prove | |
| 311 |            the finiteness of @{text "range(tag)"}.
 | |
| 312 |     \item It is proved that if equivalent relation @{text "R1"} is more refined than @{text "R2"}
 | |
| 313 |            (expressed as @{text "R1 \<subseteq> R2"}),
 | |
| 314 |            and the partition induced by @{text "R1"} is finite, then the partition induced by @{text "R2"}
 | |
| 315 |            is finite as well (lemma @{text "refined_partition_finite"}).
 | |
| 316 |     \item The injectivity assumption @{text "tag(x) = tag(y) \<Longrightarrow> x \<approx>Lang y"} implies that
 | |
| 317 |             @{text "(=tag=)"} is more refined than @{text "(\<approx>Lang)"}.
 | |
| 318 |     \item Combining the points above, we have: the partition induced by language @{text "Lang"}
 | |
| 319 |           is finite (lemma @{text "tag_finite_imageD"}).
 | |
| 320 |   \end{enumerate}
 | |
| 321 | ||
| 322 | We could have followed another approach given in appendix II of Brzozowski's paper [?], where | |
| 323 | the set of derivatives of any regular expression can be proved to be finite. | |
| 324 | Since it is easy to show that strings with same derivative are equivalent with respect to the | |
| 325 | language, then the second direction follows. We believe that our | |
| 326 | apporoach is easy to formalize, with no need to do complicated derivation calculations | |
| 327 | and countings as in [???]. | |
| 328 | *} | |
| 329 | ||
| 330 | ||
| 31 
b6815473ee2e
1. Myhill.thy is proved to correct some typos. 2. Some sections are added to the first direction. 3. The small theory of list difference is now separated into Prefix_subtract.thy and the ROOT.ML
 zhang parents: 
30diff
changeset | 331 | end |