Myhill.thy
changeset 99 54aa3b6dd71c
parent 78 77583805123d
equal deleted inserted replaced
98:36f9d19be0e6 99:54aa3b6dd71c
     1 theory Myhill
     1 theory Myhill
     2   imports Myhill_2
     2   imports Myhill_2
     3 begin
     3 begin
     4 
     4 
     5 section {* Preliminaries *}
     5 section {* Preliminaries \label{sec_prelim}*}
     6 
     6 
     7 subsection {* Finite automata and \mht  \label{sec_fa_mh} *}
     7 subsection {* Finite automata and \mht  \label{sec_fa_mh} *}
     8 
     8 
     9 text {*
     9 text {*
    10 
    10 
   198   of the recursive occurence of $X_1$, this replacement does not really removed $X_1$. Arden's
   198   of the recursive occurence of $X_1$, this replacement does not really removed $X_1$. Arden's
   199   lemma is invoked to transform recursive equations like \eqref{x_1_def_o} into non-recursive 
   199   lemma is invoked to transform recursive equations like \eqref{x_1_def_o} into non-recursive 
   200   ones. For example, the recursive equation \eqref{x_1_def_o} is transformed into the follwing
   200   ones. For example, the recursive equation \eqref{x_1_def_o} is transformed into the follwing
   201   non-recursive one:
   201   non-recursive one:
   202   \begin{equation}
   202   \begin{equation}
   203      X_1  =  (X_0 \cdot a + X_2 \cdot d) \cdot b^*
   203      X_1  =  (X_0 \cdot a + X_2 \cdot d) \cdot b^* = X_0 \cdot (a \cdot b^*) + X_2 \cdot (d \cdot b^*)
   204   \end{equation}
   204   \end{equation}
   205   which, by Arden's lemma, still characterizes $X_1$ correctly. By subsitute 
   205   which, by Arden's lemma, still characterizes $X_1$ correctly. By subsituting
   206   $(X_0 \cdot a + X_2 \cdot d) \cdot b^*$ for all $X_1$ and remove  \eqref{x_1_def_o},
   206   $(X_0 \cdot a + X_2 \cdot d) \cdot b^*$ for all $X_1$ and removing  \eqref{x_1_def_o},
   207   we get:
   207   we get:
   208 
       
   209   \begin{subequations}
   208   \begin{subequations}
   210   \begin{eqnarray}
   209   \begin{eqnarray}
   211     X_0 & = & ((X_0 \cdot a + X_2 \cdot d) \cdot b^*) \cdot c + X_2 \cdot d + \lambda \label{x_0_def_1} \\
   210     X_0 & = & \begin{aligned}
   212     X_2 & = & X_0 \cdot b + ((X_0 \cdot a + X_2 \cdot d) \cdot b^*) \cdot d + X_2 \cdot a \\
   211               & (X_0 \cdot (a \cdot b^*) + X_2 \cdot (d \cdot b^*)) \cdot c + 
       
   212                 X_2 \cdot d + \lambda = \\
       
   213               & X_0 \cdot (a \cdot b^* \cdot c) + X_2 \cdot (d \cdot b^* \cdot c) + 
       
   214                 X_2 \cdot d + \lambda = \\
       
   215               &  X_0 \cdot (a \cdot b^* \cdot c) + X_2 \cdot (d \cdot b^* \cdot c + d) + \lambda 
       
   216               \end{aligned} \label{x_0_def_1} \\
       
   217     X_2 & = & \begin{aligned}
       
   218                  & X_0 \cdot b + (X_0 \cdot (a \cdot b^*) + X_2 \cdot (d \cdot b^*)) \cdot d + X_2 \cdot a = \\
       
   219                  & X_0 \cdot b + X_0 \cdot (a \cdot b^* \cdot d) + X_2 \cdot (d \cdot b^* \cdot d) + X_2 \cdot a = \\
       
   220                  & X_0 \cdot (b + a \cdot b^* \cdot d) + X_2 \cdot (d \cdot b^* \cdot d + a)
       
   221               \end{aligned} \\
   213     X_3 & = & \begin{aligned}
   222     X_3 & = & \begin{aligned}
   214                  & X_0 \cdot (c + d + e) + ((X_0 \cdot a + X_2 \cdot d) \cdot b^*) \cdot (a + e)\\
   223                  & X_0 \cdot (c + d + e) + ((X_0 \cdot a + X_2 \cdot d) \cdot b^*) \cdot (a + e)\\
   215                  & + X_2 \cdot (b + e) + X_3 \cdot (a + b + c + d + e)
   224                  & + X_2 \cdot (b + e) + X_3 \cdot (a + b + c + d + e) \label{x_3_def_1}
   216               \end{aligned}
   225               \end{aligned}
   217   \end{eqnarray}
   226   \end{eqnarray}
   218   \end{subequations}  
   227   \end{subequations}  
   219  
   228 Suppose $X_3$ is the one to remove next, since $X_3$ dose not appear in either $X_0$ or $X_2$, 
       
   229 the removal of equation \eqref{x_3_def_1} changes nothing in the rest equations. Therefore, we get:
       
   230  \begin{subequations}
       
   231   \begin{eqnarray}
       
   232     X_0 & = & X_0 \cdot (a \cdot b^* \cdot c) + X_2 \cdot (d \cdot b^* \cdot c + d) + \lambda  \label{x_0_def_2} \\
       
   233     X_2 & = & X_0 \cdot (b + a \cdot b^* \cdot d) + X_2 \cdot (d \cdot b^* \cdot d + a) \label{x_2_def_2}
       
   234   \end{eqnarray}
       
   235   \end{subequations}  
       
   236 Actually, since absorbing state like $X_3$ contributes nothing to the language $\Lang$, it could have been removed
       
   237 at the very beginning of this precedure without affecting the final result. Now, the last unknown to remove 
       
   238 is $X_2$ and the Arden's transformaton of \eqref{x_2_def_2} is:
       
   239 \begin{equation} \label{x_2_ardened}
       
   240    X_2 ~ = ~  (X_0 \cdot (b + a \cdot b^* \cdot d)) \cdot (d \cdot b^* \cdot d + a)^* =
       
   241               X_0 \cdot ((b + a \cdot b^* \cdot d) \cdot (d \cdot b^* \cdot d + a)^*)
       
   242 \end{equation}
       
   243 By substituting the right hand side of \eqref{x_2_ardened} into \eqref{x_0_def_2}, we get:
       
   244 \begin{equation}
       
   245 \begin{aligned}
       
   246   X_0  & = && X_0 \cdot (a \cdot b^* \cdot c) + \\
       
   247        &   && X_0 \cdot ((b + a \cdot b^* \cdot d) \cdot (d \cdot b^* \cdot d + a)^*) \cdot 
       
   248            (d \cdot b^* \cdot c + d) + \lambda \\
       
   249        & =  && X_0 \cdot ((a \cdot b^* \cdot c) + \\
       
   250        &   && \hspace{3em} ((b + a \cdot b^* \cdot d) \cdot (d \cdot b^* \cdot d + a)^*) \cdot 
       
   251            (d \cdot b^* \cdot c + d)) + \lambda 
       
   252 \end{aligned}
       
   253 \end{equation}
       
   254 By applying Arden's transformation to this, we get the solution of $X_0$ as:
       
   255 \begin{equation}
       
   256 \begin{aligned}
       
   257   X_0  =  ((a \cdot b^* \cdot c) + 
       
   258                 ((b + a \cdot b^* \cdot d) \cdot (d \cdot b^* \cdot d + a)^*) \cdot 
       
   259                         (d \cdot b^* \cdot c + d))^*
       
   260 \end{aligned}
       
   261 \end{equation}
       
   262 Using the same method, solutions for $X_1$ and $X_2$ can be obtained as well and the
       
   263 regular expressoin for $\Lang$ is just $X_1 + X_2$. The formalization of this procedure
       
   264 consititues the first direction of the {\em regular expression} verion of
       
   265 \mht. Detailed explaination are given in {\bf paper.pdf} and more details and comments
       
   266 can be found in the formal scripts.
   220 *}
   267 *}
   221 
   268 
       
   269 section {* Direction @{text "finite partition \<Rightarrow> regular language"} *}
       
   270 
       
   271 text {*
       
   272   It is well known in the theory of regular languages that
       
   273   the existence of finite  language partition amounts to the existence of 
       
   274   a minimal automaton, i.e. the $M_\Lang$ constructed in section \ref{sec_prelim}, which recoginzes the 
       
   275   same language $\Lang$. The standard way to prove the existence of finite language partition 
       
   276   is to construct a automaton out of the regular expression which recoginzes the same language, from
       
   277   which the existence of finite language partition follows immediately. As discussed in the introducton of 
       
   278   {\bf paper.pdf} as well as in [5], the problem for this approach happens when automata 
       
   279   of sub regular expressions are combined to form the automaton of the mother regular expression, 
       
   280   no matter what kind of representation is used, the formalization is cubersome, as said 
       
   281   by Nipkow in [5]: `{\em a more abstract mathod is clearly desirable}'. In this section, 
       
   282   an {\em intrinsically abstract} method is given, which completely avoid the mentioning of automata,
       
   283   let along any particular representations.
       
   284   *}
       
   285 
       
   286 text {*
       
   287   The main proof structure is a structural induction on regular expressions,
       
   288   where base cases (cases for @{const "NULL"}, @{const "EMPTY"}, @{const "CHAR"}) are quite straightforward to
       
   289   proof. Real difficulty lies in inductive cases.  By inductive hypothesis, languages defined by
       
   290   sub-expressions induce finite partitiions. Under such hypothsis, we need to prove that the language 
       
   291   defined by the composite regular expression gives rise to finite partion.  
       
   292   The basic idea is to attach a tag @{text "tag(x)"} to every string 
       
   293   @{text "x"}. The tagging fuction @{text "tag"} is carefully devised, which returns tags 
       
   294   made of equivalent classes of the partitions induced by subexpressoins, and therefore has a finite
       
   295   range. Let @{text "Lang"} be the composite language, it is proved that:
       
   296   \begin{quote}
       
   297   If strings with the same tag are equivalent with respect to @{text "Lang"}, expressed as:
       
   298   \[
       
   299   @{text "tag(x) = tag(y) \<Longrightarrow> x \<approx>Lang y"}
       
   300   \]
       
   301   then the partition induced by @{text "Lang"} must be finite.
       
   302   \end{quote}
       
   303   There are two arguments for this. The first goes as the following:
       
   304   \begin{enumerate}
       
   305     \item First, the tagging function @{text "tag"} induces an equivalent relation @{text "(=tag=)"} 
       
   306           (defiintion of @{text "f_eq_rel"} and lemma @{text "equiv_f_eq_rel"}).
       
   307     \item It is shown that: if the range of @{text "tag"} (denoted @{text "range(tag)"}) is finite, 
       
   308            the partition given rise by @{text "(=tag=)"} is finite (lemma @{text "finite_eq_f_rel"}).
       
   309            Since tags are made from equivalent classes from component partitions, and the inductive
       
   310            hypothesis ensures the finiteness of these partitions, it is not difficult to prove
       
   311            the finiteness of @{text "range(tag)"}.
       
   312     \item It is proved that if equivalent relation @{text "R1"} is more refined than @{text "R2"}
       
   313            (expressed as @{text "R1 \<subseteq> R2"}),
       
   314            and the partition induced by @{text "R1"} is finite, then the partition induced by @{text "R2"}
       
   315            is finite as well (lemma @{text "refined_partition_finite"}).
       
   316     \item The injectivity assumption @{text "tag(x) = tag(y) \<Longrightarrow> x \<approx>Lang y"} implies that
       
   317             @{text "(=tag=)"} is more refined than @{text "(\<approx>Lang)"}.
       
   318     \item Combining the points above, we have: the partition induced by language @{text "Lang"}
       
   319           is finite (lemma @{text "tag_finite_imageD"}).
       
   320   \end{enumerate}
       
   321 
       
   322 We could have followed another approach given in  appendix II of Brzozowski's paper [?], where
       
   323 the set of derivatives of any regular expression can be proved to be finite. 
       
   324 Since it is easy to show that strings with same derivative are equivalent with respect to the 
       
   325 language, then the second direction follows. We believe that our
       
   326 apporoach is easy to formalize, with no need to do complicated derivation calculations
       
   327 and countings as in [???].
       
   328 *}
       
   329 
       
   330 
   222 end
   331 end