diff -r 92ca56c1a199 -r 288637d9dcde Journal/Paper.thy --- a/Journal/Paper.thy Thu Sep 12 17:20:48 2013 +0100 +++ b/Journal/Paper.thy Sat Sep 14 14:08:19 2013 +0100 @@ -341,10 +341,10 @@ not in the predicate for regularity; and there is no type quantification available in HOL.\footnote{Slind already pointed out this problem in an email to the HOL4 mailing list on 21st April - 2005.} Coq, for example, has quantification over types and thus can - state such a definition. This has been recently exploited in a - slick formalisation of the Myhill-Nerode theorem in Coq by - \citeN{XXX}. + 2005.} Systems such as Coq permit quantification over types and thus can + state such a definition. This has been recently exploited in an + elegant and short formalisation of the Myhill-Nerode theorem in Coq by + \citeN{Smolka13}, but as said this is not avaible to us working in Isabelle/HOL. An alternative, which provides us with a single type for states of automata, is to give every state node an identity, for example a natural number, and @@ -470,6 +470,9 @@ ``{\it \ldots [b]ut is there a rigorous way to even state that?}'' \end{quote} + \noindent + We shall give an answer to these questions with our paper. + %Moreover, no side-conditions will be needed for regular expressions, %like we need for automata. @@ -773,14 +776,7 @@ equivalence classes and only finitely many characters. The term @{text "\(ONE)"} in the first equation acts as a marker for the initial state, that is the equivalence class - containing the empty string @{text "[]"}.\footnote{Note that we mark, roughly speaking, the - single `initial' state in the equational system, which is different from - the method by \citeN{Brzozowski64}, where he marks the - `terminal' states. We are forced to set up the equational system in our - way, because the Myhill-Nerode Relation determines the `direction' of the - transitions---the successor `state' of an equivalence class @{text Y} can - be reached by adding a character to the end of @{text Y}. This is also the - reason why we have to use our reversed version of Arden's Lemma.} + containing the empty string @{text "[]"}. In our running example we have the initial equational system % \begin{equation}\label{exmpcs} @@ -793,7 +789,17 @@ \end{tabular}} \end{equation} - \noindent + Note that we mark, roughly speaking, the + single `initial' state in the equational system, which is different from + the method by \citeN{Brzozowski64}, where he marks the + `terminal' states. We are forced to set up the equational system in our + way, because the Myhill-Nerode Relation determines the `direction' of the + transitions---the successor `state' of an equivalence class @{text Y} can + be reached by adding a character to the end of @{text Y}. If we had defined + our equations the `Brzozowski-way', then our variables do not correspond to + the equivalence classes generated by the Myhill-Nerode relation. This need of reverse marking is also the + reason why we have to use our reversed version of Arden's Lemma. + Overloading the function @{text \} for the two kinds of terms in the equational system, we have @@ -818,10 +824,12 @@ \end{equation} \noindent - holds. The reason for adding the @{text \}-marker to our initial equational system is + holds. Again note that the reason for adding the @{text \}-marker to our initial equational system is to obtain this equation: it only holds with the marker, since none of - the other terms contain the empty string. The point of the initial equational system is - that solving it means we will be able to extract a regular expression for every equivalence class. + the other terms contain the empty string. + The point of the initial equational system is + that solving it means we will be able to extract a regular expression for every equivalence class + generated by the Myhill-Nerode relation. Our representation for the equations in Isabelle/HOL are pairs, where the first component is an equivalence class (a set of strings) @@ -1958,8 +1966,11 @@ \noindent Carrying this idea through, we must not consider the set of all derivatives, but the one modulo @{text "ACI"}. In principle, this can be done formally, - but it is very painful in a theorem prover (since there is no - direct characterisation of the set of dissimilar derivatives). + but it is very painful in a theorem prover---since there is no + direct characterisation of the set of dissimilar derivatives. Therefore + \citeN{CoquandSiles12} who follow through this idea had to `stand on their heads' and + first formalise the quite intricate notion of \emph{inductively finite sets} in + order to formalise this property. Fortunately, there is a much simpler approach using \emph{partial @@ -2668,7 +2679,8 @@ Our future work will focus on formalising the regular expression matchers developed by \citeN{Sulzmann12} which generate variable assignments for - regular expression submatching.\smallskip + regular expression sub-matching. For this they use both, derivatives and + partial derivatives of regular expressions.\smallskip \noindent {\bf Acknowledgements:}