updated
authorChristian Urban <christian dot urban at kcl dot ac dot uk>
Tue, 08 Mar 2016 11:09:12 +0000
changeset 136 fa0d8aa5d7de
parent 135 fee5641c5994
child 137 4178b7e71809
updated
thys/Paper/Paper.thy
thys/paper.pdf
--- a/thys/Paper/Paper.thy	Tue Mar 08 10:46:43 2016 +0000
+++ b/thys/Paper/Paper.thy	Tue Mar 08 11:09:12 2016 +0000
@@ -368,12 +368,12 @@
   \begin{center}
   \begin{tabular}{c}
   @{thm[mode=Axiom] Prf.intros(4)} \qquad
-  @{thm[mode=Axiom] Prf.intros(5)[of "c"]}\medskip\\
+  @{thm[mode=Axiom] Prf.intros(5)[of "c"]}\smallskip\\
   @{thm[mode=Rule] Prf.intros(2)[of "v\<^sub>1" "r\<^sub>1" "r\<^sub>2"]} \qquad 
-  @{thm[mode=Rule] Prf.intros(3)[of "v\<^sub>2" "r\<^sub>1" "r\<^sub>2"]}\medskip\\
-  @{thm[mode=Rule] Prf.intros(1)[of "v\<^sub>1" "r\<^sub>1" "v\<^sub>2" "r\<^sub>2"]}\medskip\\ 
+  @{thm[mode=Rule] Prf.intros(3)[of "v\<^sub>2" "r\<^sub>1" "r\<^sub>2"]}\smallskip\\
+  @{thm[mode=Rule] Prf.intros(1)[of "v\<^sub>1" "r\<^sub>1" "v\<^sub>2" "r\<^sub>2"]}\smallskip\\ 
   @{thm[mode=Axiom] Prf.intros(6)[of "r"]} \qquad  
-  @{thm[mode=Rule] Prf.intros(7)[of "v" "r" "vs"]}\medskip\\
+  @{thm[mode=Rule] Prf.intros(7)[of "v" "r" "vs"]}
   \end{tabular}
   \end{center}
 
@@ -402,7 +402,7 @@
   whether the resulting derivative regular expression @{term "r\<^sub>4"}
   can match the empty string. If yes, we call the function @{const mkeps}
   that produces a value @{term "v\<^sub>4"} for how @{term "r\<^sub>4"} can
-  match the empty string (taking into account the POSIX rules in case
+  match the empty string (taking into account the POSIX constraints in case
   there are several ways). This functions is defined by the clauses:
 
 \begin{figure}[t]
@@ -430,12 +430,12 @@
 \end{center}
 \caption{The two phases of the algorithm by Sulzmann \& Lu \cite{Sulzmann2014},
 matching the string @{term "[a,b,c]"}. The first phase (the arrows from 
-left to right) is \Brz's matcher building succesive derivatives. If the 
+left to right) is \Brz's matcher building successive derivatives. If the 
 last regular expression is @{term nullable}, then the functions of the 
 second phase are called (the top-down and right-to-left arrows): first 
 @{term mkeps} calculates a value witnessing
 how the empty string has been recognised by @{term "r\<^sub>4"}. After
-that the function @{term inj} `injects back' the characters of the string into
+that the function @{term inj} ``injects back'' the characters of the string into
 the values.
 \label{Sulz}}
 \end{figure} 
@@ -465,7 +465,7 @@
   string @{term "[a,b,c]"} from the value how the last derivative, @{term
   "r\<^sub>4"} in Fig~\ref{Sulz}, can match the empty string. Sulzmann and
   Lu achieve this by stepwise ``injecting back'' the characters into the
-  values thus inverting the operation of building derivatives on the level
+  values thus inverting the operation of building derivatives, but on the level
   of values. The corresponding function, called @{term inj}, takes three
   arguments, a regular expression, a character and a value. For example in
   the first (or right-most) @{term inj}-step in Fig~\ref{Sulz} the regular
@@ -473,11 +473,10 @@
   derivative step and @{term "v\<^sub>4"}, which is the value corresponding
   to the derivative regular expression @{term "r\<^sub>4"}. The result is
   the new value @{term "v\<^sub>3"}. The final result of the algorithm is
-  the value @{term "v\<^sub>1"} corresponding to the input regular
-  expression. The @{term inj} function is by recursion on the regular
+  the value @{term "v\<^sub>1"}. The @{term inj} function is defined by recursion on regular
   expressions and by analysing the shape of values (corresponding to 
   the derivative regular expressions).
-
+  %
   \begin{center}
   \begin{tabular}{l@ {\hspace{5mm}}lcl}
   (1) & @{thm (lhs) injval.simps(1)} & $\dn$ & @{thm (rhs) injval.simps(1)}\\
@@ -507,10 +506,10 @@
   @{thm (lhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]} $\dn$ @{thm (rhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]}
   \end{center}
 
-  \noindent Consider first the else-branch where the derivative is @{term
+  \noindent Consider first the @{text "else"}-branch where the derivative is @{term
   "SEQ (der c r\<^sub>1) r\<^sub>2"}. The corresponding value must therefore
   be the form @{term "Seq v\<^sub>1 v\<^sub>2"}, which matches the left-hand
-  side in clause (4) of @{term inj}. In the if-branch the derivative is an
+  side in clause~(4) of @{term inj}. In the @{text "if"}-branch the derivative is an
   alternative, namely @{term "ALT (SEQ (der c r\<^sub>1) r\<^sub>2) (der c
   r\<^sub>2)"}. This means we either have to consider a @{text Left}- or
   @{text Right}-value. In case of the @{text Left}-value we know further it
@@ -523,7 +522,7 @@
   call @{const mkeps} in order to construct a value for how @{term "r\<^sub>1"}
   can match this empty string. A similar argument applies for why we can
   expect in the left-hand side of clause (7) that the value is of the form
-  @{term "Seq v (Stars vs)"}---the derivative of a star is @{term "SEQ r
+  @{term "Seq v (Stars vs)"}---the derivative of a star is @{term "SEQ (der c r)
   (STAR r)"}. Finally, the reason for why we can ignore the second argument
   in clause (1) of @{term inj} is that it will only ever be called in cases
   where @{term "c=d"}, but the usual linearity restrictions in patterns do
@@ -535,7 +534,7 @@
   The idea of the @{term inj}-function to ``inject'' a character, say
   @{term c}, into a value can be made precise by the first part of the
   following lemma, which shows that the underlying string of an injected
-  value has a prepend character @{term c}; the second part shows that the
+  value has a prepended character @{term c}; the second part shows that the
   underlying string of an @{const mkeps}-value is always the empty string
   (given the regular expression is nullable since otherwise @{text mkeps}
   might not be defined).
@@ -569,12 +568,12 @@
   \noindent If the regular expression does not match the string, @{const None} is
   returned, indicating an error is raised. If the regular expression \emph{does}
   match the string, then @{const Some} value is returned. One important
-  virtue of this algorithm is that it can be implemented with ease in a
+  virtue of this algorithm is that it can be implemented with ease in any
   functional programming language and also in Isabelle/HOL. In the remaining
   part of this section we prove that this algorithm is correct.
 
   The well-known idea of POSIX matching is informally defined by the longest
-  match and priority rule; as correctly argued in \cite{Sulzmann2014}, this
+  match and priority rule (see Introduction); as correctly argued in \cite{Sulzmann2014}, this
   needs formal specification. Sulzmann and Lu define a \emph{dominance}
   relation\footnote{Sulzmann and Lu call it an ordering relation, but
   without giving evidence that it is transitive.} between values and argue
@@ -623,14 +622,14 @@
   respectively. Consider now the third premise and note that the POSIX value
   of this rule should match the string @{term "s\<^sub>1 @ s\<^sub>2"}. According to the
   longest match rule, we want that the @{term "s\<^sub>1"} is the longest initial
-  split of @{term "s\<^sub>1 @ s\<^sub>2"} such that @{term "s\<^sub>2"} is still recognised
+  split of \mbox{@{term "s\<^sub>1 @ s\<^sub>2"}} such that @{term "s\<^sub>2"} is still recognised
   by @{term "r\<^sub>2"}. Let us assume, contrary to the third premise, that there
   \emph{exist} an @{term "s\<^sub>3"} and @{term "s\<^sub>4"} such that @{term "s\<^sub>2"}
-  can be split up into a non-empty string @{term "s\<^sub>3"} and possibly empty
+  can be split up into a non-empty string @{term "s\<^sub>3"} and a possibly empty
   string @{term "s\<^sub>4"}. Moreover the longer string @{term "s\<^sub>1 @ s\<^sub>3"} can be
   matched by @{text "r\<^sub>1"} and the shorter @{term "s\<^sub>4"} can still be
-  matched by @{term "r\<^sub>2"}. In this case @{term "s\<^sub>1"} would not be the
-  longest initial split of @{term "s\<^sub>1 @ s\<^sub>2"} and therefore @{term "Seq v\<^sub>1
+  matched by @{term "r\<^sub>2"}. In this case @{term "s\<^sub>1"} would \emph{not} be the
+  longest initial split of \mbox{@{term "s\<^sub>1 @ s\<^sub>2"}} and therefore @{term "Seq v\<^sub>1
   v\<^sub>2"} cannot be a POSIX value for @{term "(s\<^sub>1 @ s\<^sub>2, SEQ r\<^sub>1 r\<^sub>2)"}. 
   The main point is that this side-condition ensures the longest 
   match rule is satisfied.
Binary file thys/paper.pdf has changed