Myhill.thy
changeset 78 77583805123d
parent 73 79de7de104c8
child 99 54aa3b6dd71c
equal deleted inserted replaced
77:63bc9f9d96ba 78:77583805123d
   116   For example, {\em Direction one} follows the {\em Brzozowski algebraic method}
   116   For example, {\em Direction one} follows the {\em Brzozowski algebraic method}
   117   used to convert finite autotmata to regular expressions, under the intuition that every 
   117   used to convert finite autotmata to regular expressions, under the intuition that every 
   118   partition member $\cls{x}{\approx_\Lang}$ is a state in the DFA $M_\Lang$  constructed to prove 
   118   partition member $\cls{x}{\approx_\Lang}$ is a state in the DFA $M_\Lang$  constructed to prove 
   119   lemma \ref{auto_mh_d1} of section \ref{sec_fa_mh}.
   119   lemma \ref{auto_mh_d1} of section \ref{sec_fa_mh}.
   120 
   120 
   121   The basic idea of Brzozowski method is to set aside an unknown for every DFA state and 
   121   The basic idea of Brzozowski method is to extract an equational system out of the 
   122   describe the state-trasition relationship by charateristic equations. 
   122   transition relationship of the automaton in question. In the equational system, every
   123   By solving the equational system such obtained, regular expressions 
   123   automaton state is represented by an unknown, the solution of which is expected to be 
   124   characterizing DFA states are obtained. There are choices of 
   124   a regular expresion characterizing the state in a certain sense. There are two choices of 
   125   how DFA states can be characterized.  The first is to characterize a DFA state by the set of 
   125   how a automaton state can be characterized.  The first is to characterize by the set of 
   126   striings leading from the state in question into accepting states. 
   126   strings leading from the state in question into accepting states. 
   127   The other choice is to characterize a DFA state by the set of strings leading from initial state 
   127   The other choice is to characterize by the set of strings leading from initial state 
   128   into the state in question. For the first choice, the lauguage recognized by a DFA can 
   128   into the state in question. For the second choice, the language recognized the automaton 
   129   be characterized by the regular expression characterizing initial state, while 
   129   can be characterized by the solution of initial state, while for the second choice, 
   130   in the second choice, the languaged of the DFA can be characterized by the summation of
   130   the language recoginzed by the automaton can be characterized by 
   131   regular expressions of all accepting states.
   131   combining solutions of all accepting states by @{text "+"}. Because of the automaton 
   132 *}
   132   used as our intuitive guide, the $M_\Lang$, the states of which are 
   133 
   133   sets of strings leading from initial state, the second choice is used in this paper.
   134 text {*
   134 
       
   135   Supposing the automaton in Fig \ref{fig_auto_part_rel} is the $M_\Lang$ for some language $\Lang$, 
       
   136   and suppose $\Sigma = \{a, b, c, d, e\}$. Under the second choice, the equational system extracted is:
       
   137   \begin{subequations}
       
   138   \begin{eqnarray}
       
   139     X_0 & = & X_1 \cdot c + X_2 \cdot d + \lambda \label{x_0_def_o} \\
       
   140     X_1 & = & X_0 \cdot a + X_1 \cdot b + X_2 \cdot d \label{x_1_def_o} \\
       
   141     X_2 & = & X_0 \cdot b + X_1 \cdot d + X_2 \cdot a \\
       
   142     X_3 & = & \begin{aligned}
       
   143                  & X_0 \cdot (c + d + e) + X_1 \cdot (a + e) + X_2 \cdot (b + e) + \\
       
   144                  & X_3 \cdot (a + b + c + d + e)
       
   145               \end{aligned}
       
   146   \end{eqnarray}
       
   147   \end{subequations} 
       
   148 
   135 \begin{figure}[h!]
   149 \begin{figure}[h!]
   136 \centering
   150 \centering
   137 \begin{minipage}{0.5\textwidth}
   151 \begin{minipage}{0.5\textwidth}
   138 \scalebox{0.8}{
   152 \scalebox{0.8}{
   139 \begin{tikzpicture}[ultra thick,>=latex]
   153 \begin{tikzpicture}[ultra thick,>=latex]
   156   \path[->,dashed] (ac1) edge node [midway, above, sloped] {$\Sigma - \{b,c,d\}$} (ab);
   170   \path[->,dashed] (ac1) edge node [midway, above, sloped] {$\Sigma - \{b,c,d\}$} (ab);
   157   \path[->,dashed] (ac2) edge node [midway, below, sloped] {$\Sigma - \{a,c,d\}$} (ab);
   171   \path[->,dashed] (ac2) edge node [midway, below, sloped] {$\Sigma - \{a,c,d\}$} (ab);
   158   \path[->,dashed] (ab) edge [loop right] node [midway, right] {$\Sigma$} (ab);
   172   \path[->,dashed] (ab) edge [loop right] node [midway, right] {$\Sigma$} (ab);
   159 \end{tikzpicture}}
   173 \end{tikzpicture}}
   160 \end{minipage}
   174 \end{minipage}
   161 \caption{The relationship between automata and finite partition}\label{fig_auto_part_rel}
   175 \caption{An example automaton}\label{fig_auto_part_rel}
   162 \end{figure}
   176 \end{figure}
   163 
   177 
   164   *}
   178   Every $\cdot$-item on the right side of equations describes some state transtions, except 
   165 
   179   the $\lambda$ in \eqref{x_0_def_o}, which represents empty string @{text "[]"}. 
       
   180   The reason is that: every state is characterized by the
       
   181   set of incoming strings leading from initial state. For non-initial state, every such
       
   182   string can be splitted into a prefix leading into a preceding state and a single character suffix 
       
   183   transiting into from the preceding state. The exception happens at
       
   184   initial state, where the empty string is a incoming string which can not be splitted. The $\lambda$
       
   185   in \eqref{x_0_def_o} is introduce to repsent this indivisible string. There is one and only one
       
   186   $\lambda$ in every equational system such obtained, becasue $[]$ can only be contaied in one
       
   187   equivalent class (the intial state in $M_\Lang$) and equivalent classes are disjoint. 
       
   188   
       
   189   Suppose all unknowns ($X_0, X_1, X_2, X_3$) are  solvable, the regular expression charactering 
       
   190   laugnage $\Lang$ is $X_1 + X_2$. This paper gives a procedure
       
   191   by which arbitrarily picked unknown can be solved. The basic idea to solve $X_i$ is by 
       
   192   eliminating all variables other than $X_i$ from the equational system. If
       
   193   $X_0$ is the one picked to be solved,  variables $X_1, X_2, X_3$ have to be removed one by 
       
   194   one.  The order to remove does not matter as long as the remaing equations are kept valid.
       
   195   Suppose $X_1$ is the first one to remove, the action is to replace all occurences of $X_1$ 
       
   196   in remaining equations by the right hand side of its characterizing equation, i.e. 
       
   197   the $ X_0 \cdot a + X_1 \cdot b + X_2 \cdot d$ in \eqref{x_1_def_o}. However, because
       
   198   of the recursive occurence of $X_1$, this replacement does not really removed $X_1$. Arden's
       
   199   lemma is invoked to transform recursive equations like \eqref{x_1_def_o} into non-recursive 
       
   200   ones. For example, the recursive equation \eqref{x_1_def_o} is transformed into the follwing
       
   201   non-recursive one:
       
   202   \begin{equation}
       
   203      X_1  =  (X_0 \cdot a + X_2 \cdot d) \cdot b^*
       
   204   \end{equation}
       
   205   which, by Arden's lemma, still characterizes $X_1$ correctly. By subsitute 
       
   206   $(X_0 \cdot a + X_2 \cdot d) \cdot b^*$ for all $X_1$ and remove  \eqref{x_1_def_o},
       
   207   we get:
       
   208 
       
   209   \begin{subequations}
       
   210   \begin{eqnarray}
       
   211     X_0 & = & ((X_0 \cdot a + X_2 \cdot d) \cdot b^*) \cdot c + X_2 \cdot d + \lambda \label{x_0_def_1} \\
       
   212     X_2 & = & X_0 \cdot b + ((X_0 \cdot a + X_2 \cdot d) \cdot b^*) \cdot d + X_2 \cdot a \\
       
   213     X_3 & = & \begin{aligned}
       
   214                  & X_0 \cdot (c + d + e) + ((X_0 \cdot a + X_2 \cdot d) \cdot b^*) \cdot (a + e)\\
       
   215                  & + X_2 \cdot (b + e) + X_3 \cdot (a + b + c + d + e)
       
   216               \end{aligned}
       
   217   \end{eqnarray}
       
   218   \end{subequations}  
       
   219  
       
   220 *}
   166 
   221 
   167 end
   222 end