regexp: comparison Myhill.thy

equal deleted inserted replaced

-:63bc9f9d96ba
+:77583805123d
 For example, {\em Direction one} follows the {\em Brzozowski algebraic method}
 used to convert finite autotmata to regular expressions, under the intuition that every
 partition member $\cls{x}{\approx_\Lang}$ is a state in the DFA $M_\Lang$  constructed to prove
 lemma \ref{auto_mh_d1} of section \ref{sec_fa_mh}.
-The basic idea of Brzozowski method is to set aside an unknown for every DFA state and
+The basic idea of Brzozowski method is to extract an equational system out of the
-describe the state-trasition relationship by charateristic equations.
+transition relationship of the automaton in question. In the equational system, every
-By solving the equational system such obtained, regular expressions
+automaton state is represented by an unknown, the solution of which is expected to be
-characterizing DFA states are obtained. There are choices of
+a regular expresion characterizing the state in a certain sense. There are two choices of
-how DFA states can be characterized.  The first is to characterize a DFA state by the set of
+how a automaton state can be characterized.  The first is to characterize by the set of
-striings leading from the state in question into accepting states.
+strings leading from the state in question into accepting states.
-The other choice is to characterize a DFA state by the set of strings leading from initial state
+The other choice is to characterize by the set of strings leading from initial state
-into the state in question. For the first choice, the lauguage recognized by a DFA can
+into the state in question. For the second choice, the language recognized the automaton
-be characterized by the regular expression characterizing initial state, while
+can be characterized by the solution of initial state, while for the second choice,
-in the second choice, the languaged of the DFA can be characterized by the summation of
+the language recoginzed by the automaton can be characterized by
-regular expressions of all accepting states.
+combining solutions of all accepting states by @{text "+"}. Because of the automaton
-*}
+used as our intuitive guide, the $M_\Lang$, the states of which are
+sets of strings leading from initial state, the second choice is used in this paper.
-text {*
+Supposing the automaton in Fig \ref{fig_auto_part_rel} is the $M_\Lang$ for some language $\Lang$,
+and suppose $\Sigma = \{a, b, c, d, e\}$. Under the second choice, the equational system extracted is:
+\begin{subequations}
+\begin{eqnarray}
+X_0 & = & X_1 \cdot c + X_2 \cdot d + \lambda \label{x_0_def_o} \\
+X_1 & = & X_0 \cdot a + X_1 \cdot b + X_2 \cdot d \label{x_1_def_o} \\
+X_2 & = & X_0 \cdot b + X_1 \cdot d + X_2 \cdot a \\
+X_3 & = & \begin{aligned}
+& X_0 \cdot (c + d + e) + X_1 \cdot (a + e) + X_2 \cdot (b + e) + \\
+& X_3 \cdot (a + b + c + d + e)
+\end{aligned}
+\end{eqnarray}
+\end{subequations}
 \begin{figure}[h!]
 \centering
 \begin{minipage}{0.5\textwidth}
 \scalebox{0.8}{
 \begin{tikzpicture}[ultra thick,>=latex]
 \path[->,dashed] (ac1) edge node [midway, above, sloped] {$\Sigma - \{b,c,d\}$} (ab);
 \path[->,dashed] (ac2) edge node [midway, below, sloped] {$\Sigma - \{a,c,d\}$} (ab);
 \path[->,dashed] (ab) edge [loop right] node [midway, right] {$\Sigma$} (ab);
 \end{tikzpicture}}
 \end{minipage}
-\caption{The relationship between automata and finite partition}\label{fig_auto_part_rel}
+\caption{An example automaton}\label{fig_auto_part_rel}
 \end{figure}
-*}
+Every $\cdot$-item on the right side of equations describes some state transtions, except
+the $\lambda$ in \eqref{x_0_def_o}, which represents empty string @{text "[]"}.
+The reason is that: every state is characterized by the
+set of incoming strings leading from initial state. For non-initial state, every such
+string can be splitted into a prefix leading into a preceding state and a single character suffix
+transiting into from the preceding state. The exception happens at
+initial state, where the empty string is a incoming string which can not be splitted. The $\lambda$
+in \eqref{x_0_def_o} is introduce to repsent this indivisible string. There is one and only one
+$\lambda$ in every equational system such obtained, becasue $[]$ can only be contaied in one
+equivalent class (the intial state in $M_\Lang$) and equivalent classes are disjoint.
+Suppose all unknowns ($X_0, X_1, X_2, X_3$) are  solvable, the regular expression charactering
+laugnage $\Lang$ is $X_1 + X_2$. This paper gives a procedure
+by which arbitrarily picked unknown can be solved. The basic idea to solve $X_i$ is by
+eliminating all variables other than $X_i$ from the equational system. If
+$X_0$ is the one picked to be solved,  variables $X_1, X_2, X_3$ have to be removed one by
+one.  The order to remove does not matter as long as the remaing equations are kept valid.
+Suppose $X_1$ is the first one to remove, the action is to replace all occurences of $X_1$
+in remaining equations by the right hand side of its characterizing equation, i.e.
+the $ X_0 \cdot a + X_1 \cdot b + X_2 \cdot d$ in \eqref{x_1_def_o}. However, because
+of the recursive occurence of $X_1$, this replacement does not really removed $X_1$. Arden's
+lemma is invoked to transform recursive equations like \eqref{x_1_def_o} into non-recursive
+ones. For example, the recursive equation \eqref{x_1_def_o} is transformed into the follwing
+non-recursive one:
+\begin{equation}
+X_1  =  (X_0 \cdot a + X_2 \cdot d) \cdot b^*
+\end{equation}
+which, by Arden's lemma, still characterizes $X_1$ correctly. By subsitute
+$(X_0 \cdot a + X_2 \cdot d) \cdot b^*$ for all $X_1$ and remove  \eqref{x_1_def_o},
+we get:
+\begin{subequations}
+\begin{eqnarray}
+X_0 & = & ((X_0 \cdot a + X_2 \cdot d) \cdot b^*) \cdot c + X_2 \cdot d + \lambda \label{x_0_def_1} \\
+X_2 & = & X_0 \cdot b + ((X_0 \cdot a + X_2 \cdot d) \cdot b^*) \cdot d + X_2 \cdot a \\
+X_3 & = & \begin{aligned}
+& X_0 \cdot (c + d + e) + ((X_0 \cdot a + X_2 \cdot d) \cdot b^*) \cdot (a + e)\\
+& + X_2 \cdot (b + e) + X_3 \cdot (a + b + c + d + e)
+\end{aligned}
+\end{eqnarray}
+\end{subequations}
+*}
 end

changeset 78	77583805123d
parent 73	79de7de104c8
child 99	54aa3b6dd71c