# HG changeset patch # User zhang # Date 1297166488 0 # Node ID 77583805123d93e1b79debb6fdbf51f1e556a05f # Parent 63bc9f9d96ba90beb3a04bb2602811c3fe32017f More explaination on equational system diff -r 63bc9f9d96ba -r 77583805123d Myhill.thy --- a/Myhill.thy Tue Feb 08 09:51:49 2011 +0000 +++ b/Myhill.thy Tue Feb 08 12:01:28 2011 +0000 @@ -118,20 +118,34 @@ partition member $\cls{x}{\approx_\Lang}$ is a state in the DFA $M_\Lang$ constructed to prove lemma \ref{auto_mh_d1} of section \ref{sec_fa_mh}. - The basic idea of Brzozowski method is to set aside an unknown for every DFA state and - describe the state-trasition relationship by charateristic equations. - By solving the equational system such obtained, regular expressions - characterizing DFA states are obtained. There are choices of - how DFA states can be characterized. The first is to characterize a DFA state by the set of - striings leading from the state in question into accepting states. - The other choice is to characterize a DFA state by the set of strings leading from initial state - into the state in question. For the first choice, the lauguage recognized by a DFA can - be characterized by the regular expression characterizing initial state, while - in the second choice, the languaged of the DFA can be characterized by the summation of - regular expressions of all accepting states. -*} + The basic idea of Brzozowski method is to extract an equational system out of the + transition relationship of the automaton in question. In the equational system, every + automaton state is represented by an unknown, the solution of which is expected to be + a regular expresion characterizing the state in a certain sense. There are two choices of + how a automaton state can be characterized. The first is to characterize by the set of + strings leading from the state in question into accepting states. + The other choice is to characterize by the set of strings leading from initial state + into the state in question. For the second choice, the language recognized the automaton + can be characterized by the solution of initial state, while for the second choice, + the language recoginzed by the automaton can be characterized by + combining solutions of all accepting states by @{text "+"}. Because of the automaton + used as our intuitive guide, the $M_\Lang$, the states of which are + sets of strings leading from initial state, the second choice is used in this paper. -text {* + Supposing the automaton in Fig \ref{fig_auto_part_rel} is the $M_\Lang$ for some language $\Lang$, + and suppose $\Sigma = \{a, b, c, d, e\}$. Under the second choice, the equational system extracted is: + \begin{subequations} + \begin{eqnarray} + X_0 & = & X_1 \cdot c + X_2 \cdot d + \lambda \label{x_0_def_o} \\ + X_1 & = & X_0 \cdot a + X_1 \cdot b + X_2 \cdot d \label{x_1_def_o} \\ + X_2 & = & X_0 \cdot b + X_1 \cdot d + X_2 \cdot a \\ + X_3 & = & \begin{aligned} + & X_0 \cdot (c + d + e) + X_1 \cdot (a + e) + X_2 \cdot (b + e) + \\ + & X_3 \cdot (a + b + c + d + e) + \end{aligned} + \end{eqnarray} + \end{subequations} + \begin{figure}[h!] \centering \begin{minipage}{0.5\textwidth} @@ -158,10 +172,51 @@ \path[->,dashed] (ab) edge [loop right] node [midway, right] {$\Sigma$} (ab); \end{tikzpicture}} \end{minipage} -\caption{The relationship between automata and finite partition}\label{fig_auto_part_rel} +\caption{An example automaton}\label{fig_auto_part_rel} \end{figure} - *} + Every $\cdot$-item on the right side of equations describes some state transtions, except + the $\lambda$ in \eqref{x_0_def_o}, which represents empty string @{text "[]"}. + The reason is that: every state is characterized by the + set of incoming strings leading from initial state. For non-initial state, every such + string can be splitted into a prefix leading into a preceding state and a single character suffix + transiting into from the preceding state. The exception happens at + initial state, where the empty string is a incoming string which can not be splitted. The $\lambda$ + in \eqref{x_0_def_o} is introduce to repsent this indivisible string. There is one and only one + $\lambda$ in every equational system such obtained, becasue $[]$ can only be contaied in one + equivalent class (the intial state in $M_\Lang$) and equivalent classes are disjoint. + + Suppose all unknowns ($X_0, X_1, X_2, X_3$) are solvable, the regular expression charactering + laugnage $\Lang$ is $X_1 + X_2$. This paper gives a procedure + by which arbitrarily picked unknown can be solved. The basic idea to solve $X_i$ is by + eliminating all variables other than $X_i$ from the equational system. If + $X_0$ is the one picked to be solved, variables $X_1, X_2, X_3$ have to be removed one by + one. The order to remove does not matter as long as the remaing equations are kept valid. + Suppose $X_1$ is the first one to remove, the action is to replace all occurences of $X_1$ + in remaining equations by the right hand side of its characterizing equation, i.e. + the $ X_0 \cdot a + X_1 \cdot b + X_2 \cdot d$ in \eqref{x_1_def_o}. However, because + of the recursive occurence of $X_1$, this replacement does not really removed $X_1$. Arden's + lemma is invoked to transform recursive equations like \eqref{x_1_def_o} into non-recursive + ones. For example, the recursive equation \eqref{x_1_def_o} is transformed into the follwing + non-recursive one: + \begin{equation} + X_1 = (X_0 \cdot a + X_2 \cdot d) \cdot b^* + \end{equation} + which, by Arden's lemma, still characterizes $X_1$ correctly. By subsitute + $(X_0 \cdot a + X_2 \cdot d) \cdot b^*$ for all $X_1$ and remove \eqref{x_1_def_o}, + we get: + \begin{subequations} + \begin{eqnarray} + X_0 & = & ((X_0 \cdot a + X_2 \cdot d) \cdot b^*) \cdot c + X_2 \cdot d + \lambda \label{x_0_def_1} \\ + X_2 & = & X_0 \cdot b + ((X_0 \cdot a + X_2 \cdot d) \cdot b^*) \cdot d + X_2 \cdot a \\ + X_3 & = & \begin{aligned} + & X_0 \cdot (c + d + e) + ((X_0 \cdot a + X_2 \cdot d) \cdot b^*) \cdot (a + e)\\ + & + X_2 \cdot (b + e) + X_3 \cdot (a + b + c + d + e) + \end{aligned} + \end{eqnarray} + \end{subequations} + +*} end