116 For example, {\em Direction one} follows the {\em Brzozowski algebraic method} |
116 For example, {\em Direction one} follows the {\em Brzozowski algebraic method} |
117 used to convert finite autotmata to regular expressions, under the intuition that every |
117 used to convert finite autotmata to regular expressions, under the intuition that every |
118 partition member $\cls{x}{\approx_\Lang}$ is a state in the DFA $M_\Lang$ constructed to prove |
118 partition member $\cls{x}{\approx_\Lang}$ is a state in the DFA $M_\Lang$ constructed to prove |
119 lemma \ref{auto_mh_d1} of section \ref{sec_fa_mh}. |
119 lemma \ref{auto_mh_d1} of section \ref{sec_fa_mh}. |
120 |
120 |
121 The basic idea of Brzozowski method is to set aside an unknown for every DFA state and |
121 The basic idea of Brzozowski method is to extract an equational system out of the |
122 describe the state-trasition relationship by charateristic equations. |
122 transition relationship of the automaton in question. In the equational system, every |
123 By solving the equational system such obtained, regular expressions |
123 automaton state is represented by an unknown, the solution of which is expected to be |
124 characterizing DFA states are obtained. There are choices of |
124 a regular expresion characterizing the state in a certain sense. There are two choices of |
125 how DFA states can be characterized. The first is to characterize a DFA state by the set of |
125 how a automaton state can be characterized. The first is to characterize by the set of |
126 striings leading from the state in question into accepting states. |
126 strings leading from the state in question into accepting states. |
127 The other choice is to characterize a DFA state by the set of strings leading from initial state |
127 The other choice is to characterize by the set of strings leading from initial state |
128 into the state in question. For the first choice, the lauguage recognized by a DFA can |
128 into the state in question. For the second choice, the language recognized the automaton |
129 be characterized by the regular expression characterizing initial state, while |
129 can be characterized by the solution of initial state, while for the second choice, |
130 in the second choice, the languaged of the DFA can be characterized by the summation of |
130 the language recoginzed by the automaton can be characterized by |
131 regular expressions of all accepting states. |
131 combining solutions of all accepting states by @{text "+"}. Because of the automaton |
132 *} |
132 used as our intuitive guide, the $M_\Lang$, the states of which are |
133 |
133 sets of strings leading from initial state, the second choice is used in this paper. |
134 text {* |
134 |
|
135 Supposing the automaton in Fig \ref{fig_auto_part_rel} is the $M_\Lang$ for some language $\Lang$, |
|
136 and suppose $\Sigma = \{a, b, c, d, e\}$. Under the second choice, the equational system extracted is: |
|
137 \begin{subequations} |
|
138 \begin{eqnarray} |
|
139 X_0 & = & X_1 \cdot c + X_2 \cdot d + \lambda \label{x_0_def_o} \\ |
|
140 X_1 & = & X_0 \cdot a + X_1 \cdot b + X_2 \cdot d \label{x_1_def_o} \\ |
|
141 X_2 & = & X_0 \cdot b + X_1 \cdot d + X_2 \cdot a \\ |
|
142 X_3 & = & \begin{aligned} |
|
143 & X_0 \cdot (c + d + e) + X_1 \cdot (a + e) + X_2 \cdot (b + e) + \\ |
|
144 & X_3 \cdot (a + b + c + d + e) |
|
145 \end{aligned} |
|
146 \end{eqnarray} |
|
147 \end{subequations} |
|
148 |
135 \begin{figure}[h!] |
149 \begin{figure}[h!] |
136 \centering |
150 \centering |
137 \begin{minipage}{0.5\textwidth} |
151 \begin{minipage}{0.5\textwidth} |
138 \scalebox{0.8}{ |
152 \scalebox{0.8}{ |
139 \begin{tikzpicture}[ultra thick,>=latex] |
153 \begin{tikzpicture}[ultra thick,>=latex] |
156 \path[->,dashed] (ac1) edge node [midway, above, sloped] {$\Sigma - \{b,c,d\}$} (ab); |
170 \path[->,dashed] (ac1) edge node [midway, above, sloped] {$\Sigma - \{b,c,d\}$} (ab); |
157 \path[->,dashed] (ac2) edge node [midway, below, sloped] {$\Sigma - \{a,c,d\}$} (ab); |
171 \path[->,dashed] (ac2) edge node [midway, below, sloped] {$\Sigma - \{a,c,d\}$} (ab); |
158 \path[->,dashed] (ab) edge [loop right] node [midway, right] {$\Sigma$} (ab); |
172 \path[->,dashed] (ab) edge [loop right] node [midway, right] {$\Sigma$} (ab); |
159 \end{tikzpicture}} |
173 \end{tikzpicture}} |
160 \end{minipage} |
174 \end{minipage} |
161 \caption{The relationship between automata and finite partition}\label{fig_auto_part_rel} |
175 \caption{An example automaton}\label{fig_auto_part_rel} |
162 \end{figure} |
176 \end{figure} |
163 |
177 |
164 *} |
178 Every $\cdot$-item on the right side of equations describes some state transtions, except |
165 |
179 the $\lambda$ in \eqref{x_0_def_o}, which represents empty string @{text "[]"}. |
|
180 The reason is that: every state is characterized by the |
|
181 set of incoming strings leading from initial state. For non-initial state, every such |
|
182 string can be splitted into a prefix leading into a preceding state and a single character suffix |
|
183 transiting into from the preceding state. The exception happens at |
|
184 initial state, where the empty string is a incoming string which can not be splitted. The $\lambda$ |
|
185 in \eqref{x_0_def_o} is introduce to repsent this indivisible string. There is one and only one |
|
186 $\lambda$ in every equational system such obtained, becasue $[]$ can only be contaied in one |
|
187 equivalent class (the intial state in $M_\Lang$) and equivalent classes are disjoint. |
|
188 |
|
189 Suppose all unknowns ($X_0, X_1, X_2, X_3$) are solvable, the regular expression charactering |
|
190 laugnage $\Lang$ is $X_1 + X_2$. This paper gives a procedure |
|
191 by which arbitrarily picked unknown can be solved. The basic idea to solve $X_i$ is by |
|
192 eliminating all variables other than $X_i$ from the equational system. If |
|
193 $X_0$ is the one picked to be solved, variables $X_1, X_2, X_3$ have to be removed one by |
|
194 one. The order to remove does not matter as long as the remaing equations are kept valid. |
|
195 Suppose $X_1$ is the first one to remove, the action is to replace all occurences of $X_1$ |
|
196 in remaining equations by the right hand side of its characterizing equation, i.e. |
|
197 the $ X_0 \cdot a + X_1 \cdot b + X_2 \cdot d$ in \eqref{x_1_def_o}. However, because |
|
198 of the recursive occurence of $X_1$, this replacement does not really removed $X_1$. Arden's |
|
199 lemma is invoked to transform recursive equations like \eqref{x_1_def_o} into non-recursive |
|
200 ones. For example, the recursive equation \eqref{x_1_def_o} is transformed into the follwing |
|
201 non-recursive one: |
|
202 \begin{equation} |
|
203 X_1 = (X_0 \cdot a + X_2 \cdot d) \cdot b^* |
|
204 \end{equation} |
|
205 which, by Arden's lemma, still characterizes $X_1$ correctly. By subsitute |
|
206 $(X_0 \cdot a + X_2 \cdot d) \cdot b^*$ for all $X_1$ and remove \eqref{x_1_def_o}, |
|
207 we get: |
|
208 |
|
209 \begin{subequations} |
|
210 \begin{eqnarray} |
|
211 X_0 & = & ((X_0 \cdot a + X_2 \cdot d) \cdot b^*) \cdot c + X_2 \cdot d + \lambda \label{x_0_def_1} \\ |
|
212 X_2 & = & X_0 \cdot b + ((X_0 \cdot a + X_2 \cdot d) \cdot b^*) \cdot d + X_2 \cdot a \\ |
|
213 X_3 & = & \begin{aligned} |
|
214 & X_0 \cdot (c + d + e) + ((X_0 \cdot a + X_2 \cdot d) \cdot b^*) \cdot (a + e)\\ |
|
215 & + X_2 \cdot (b + e) + X_3 \cdot (a + b + c + d + e) |
|
216 \end{aligned} |
|
217 \end{eqnarray} |
|
218 \end{subequations} |
|
219 |
|
220 *} |
166 |
221 |
167 end |
222 end |