# HG changeset patch # User Christian Urban # Date 1457435352 0 # Node ID fa0d8aa5d7de2275126b9ede9c9977780e3a8d7f # Parent fee5641c5994cf8bb3f30f7408b806610ddb6931 updated diff -r fee5641c5994 -r fa0d8aa5d7de thys/Paper/Paper.thy --- a/thys/Paper/Paper.thy Tue Mar 08 10:46:43 2016 +0000 +++ b/thys/Paper/Paper.thy Tue Mar 08 11:09:12 2016 +0000 @@ -368,12 +368,12 @@ \begin{center} \begin{tabular}{c} @{thm[mode=Axiom] Prf.intros(4)} \qquad - @{thm[mode=Axiom] Prf.intros(5)[of "c"]}\medskip\\ + @{thm[mode=Axiom] Prf.intros(5)[of "c"]}\smallskip\\ @{thm[mode=Rule] Prf.intros(2)[of "v\<^sub>1" "r\<^sub>1" "r\<^sub>2"]} \qquad - @{thm[mode=Rule] Prf.intros(3)[of "v\<^sub>2" "r\<^sub>1" "r\<^sub>2"]}\medskip\\ - @{thm[mode=Rule] Prf.intros(1)[of "v\<^sub>1" "r\<^sub>1" "v\<^sub>2" "r\<^sub>2"]}\medskip\\ + @{thm[mode=Rule] Prf.intros(3)[of "v\<^sub>2" "r\<^sub>1" "r\<^sub>2"]}\smallskip\\ + @{thm[mode=Rule] Prf.intros(1)[of "v\<^sub>1" "r\<^sub>1" "v\<^sub>2" "r\<^sub>2"]}\smallskip\\ @{thm[mode=Axiom] Prf.intros(6)[of "r"]} \qquad - @{thm[mode=Rule] Prf.intros(7)[of "v" "r" "vs"]}\medskip\\ + @{thm[mode=Rule] Prf.intros(7)[of "v" "r" "vs"]} \end{tabular} \end{center} @@ -402,7 +402,7 @@ whether the resulting derivative regular expression @{term "r\<^sub>4"} can match the empty string. If yes, we call the function @{const mkeps} that produces a value @{term "v\<^sub>4"} for how @{term "r\<^sub>4"} can - match the empty string (taking into account the POSIX rules in case + match the empty string (taking into account the POSIX constraints in case there are several ways). This functions is defined by the clauses: \begin{figure}[t] @@ -430,12 +430,12 @@ \end{center} \caption{The two phases of the algorithm by Sulzmann \& Lu \cite{Sulzmann2014}, matching the string @{term "[a,b,c]"}. The first phase (the arrows from -left to right) is \Brz's matcher building succesive derivatives. If the +left to right) is \Brz's matcher building successive derivatives. If the last regular expression is @{term nullable}, then the functions of the second phase are called (the top-down and right-to-left arrows): first @{term mkeps} calculates a value witnessing how the empty string has been recognised by @{term "r\<^sub>4"}. After -that the function @{term inj} `injects back' the characters of the string into +that the function @{term inj} ``injects back'' the characters of the string into the values. \label{Sulz}} \end{figure} @@ -465,7 +465,7 @@ string @{term "[a,b,c]"} from the value how the last derivative, @{term "r\<^sub>4"} in Fig~\ref{Sulz}, can match the empty string. Sulzmann and Lu achieve this by stepwise ``injecting back'' the characters into the - values thus inverting the operation of building derivatives on the level + values thus inverting the operation of building derivatives, but on the level of values. The corresponding function, called @{term inj}, takes three arguments, a regular expression, a character and a value. For example in the first (or right-most) @{term inj}-step in Fig~\ref{Sulz} the regular @@ -473,11 +473,10 @@ derivative step and @{term "v\<^sub>4"}, which is the value corresponding to the derivative regular expression @{term "r\<^sub>4"}. The result is the new value @{term "v\<^sub>3"}. The final result of the algorithm is - the value @{term "v\<^sub>1"} corresponding to the input regular - expression. The @{term inj} function is by recursion on the regular + the value @{term "v\<^sub>1"}. The @{term inj} function is defined by recursion on regular expressions and by analysing the shape of values (corresponding to the derivative regular expressions). - + % \begin{center} \begin{tabular}{l@ {\hspace{5mm}}lcl} (1) & @{thm (lhs) injval.simps(1)} & $\dn$ & @{thm (rhs) injval.simps(1)}\\ @@ -507,10 +506,10 @@ @{thm (lhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]} $\dn$ @{thm (rhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]} \end{center} - \noindent Consider first the else-branch where the derivative is @{term + \noindent Consider first the @{text "else"}-branch where the derivative is @{term "SEQ (der c r\<^sub>1) r\<^sub>2"}. The corresponding value must therefore be the form @{term "Seq v\<^sub>1 v\<^sub>2"}, which matches the left-hand - side in clause (4) of @{term inj}. In the if-branch the derivative is an + side in clause~(4) of @{term inj}. In the @{text "if"}-branch the derivative is an alternative, namely @{term "ALT (SEQ (der c r\<^sub>1) r\<^sub>2) (der c r\<^sub>2)"}. This means we either have to consider a @{text Left}- or @{text Right}-value. In case of the @{text Left}-value we know further it @@ -523,7 +522,7 @@ call @{const mkeps} in order to construct a value for how @{term "r\<^sub>1"} can match this empty string. A similar argument applies for why we can expect in the left-hand side of clause (7) that the value is of the form - @{term "Seq v (Stars vs)"}---the derivative of a star is @{term "SEQ r + @{term "Seq v (Stars vs)"}---the derivative of a star is @{term "SEQ (der c r) (STAR r)"}. Finally, the reason for why we can ignore the second argument in clause (1) of @{term inj} is that it will only ever be called in cases where @{term "c=d"}, but the usual linearity restrictions in patterns do @@ -535,7 +534,7 @@ The idea of the @{term inj}-function to ``inject'' a character, say @{term c}, into a value can be made precise by the first part of the following lemma, which shows that the underlying string of an injected - value has a prepend character @{term c}; the second part shows that the + value has a prepended character @{term c}; the second part shows that the underlying string of an @{const mkeps}-value is always the empty string (given the regular expression is nullable since otherwise @{text mkeps} might not be defined). @@ -569,12 +568,12 @@ \noindent If the regular expression does not match the string, @{const None} is returned, indicating an error is raised. If the regular expression \emph{does} match the string, then @{const Some} value is returned. One important - virtue of this algorithm is that it can be implemented with ease in a + virtue of this algorithm is that it can be implemented with ease in any functional programming language and also in Isabelle/HOL. In the remaining part of this section we prove that this algorithm is correct. The well-known idea of POSIX matching is informally defined by the longest - match and priority rule; as correctly argued in \cite{Sulzmann2014}, this + match and priority rule (see Introduction); as correctly argued in \cite{Sulzmann2014}, this needs formal specification. Sulzmann and Lu define a \emph{dominance} relation\footnote{Sulzmann and Lu call it an ordering relation, but without giving evidence that it is transitive.} between values and argue @@ -623,14 +622,14 @@ respectively. Consider now the third premise and note that the POSIX value of this rule should match the string @{term "s\<^sub>1 @ s\<^sub>2"}. According to the longest match rule, we want that the @{term "s\<^sub>1"} is the longest initial - split of @{term "s\<^sub>1 @ s\<^sub>2"} such that @{term "s\<^sub>2"} is still recognised + split of \mbox{@{term "s\<^sub>1 @ s\<^sub>2"}} such that @{term "s\<^sub>2"} is still recognised by @{term "r\<^sub>2"}. Let us assume, contrary to the third premise, that there \emph{exist} an @{term "s\<^sub>3"} and @{term "s\<^sub>4"} such that @{term "s\<^sub>2"} - can be split up into a non-empty string @{term "s\<^sub>3"} and possibly empty + can be split up into a non-empty string @{term "s\<^sub>3"} and a possibly empty string @{term "s\<^sub>4"}. Moreover the longer string @{term "s\<^sub>1 @ s\<^sub>3"} can be matched by @{text "r\<^sub>1"} and the shorter @{term "s\<^sub>4"} can still be - matched by @{term "r\<^sub>2"}. In this case @{term "s\<^sub>1"} would not be the - longest initial split of @{term "s\<^sub>1 @ s\<^sub>2"} and therefore @{term "Seq v\<^sub>1 + matched by @{term "r\<^sub>2"}. In this case @{term "s\<^sub>1"} would \emph{not} be the + longest initial split of \mbox{@{term "s\<^sub>1 @ s\<^sub>2"}} and therefore @{term "Seq v\<^sub>1 v\<^sub>2"} cannot be a POSIX value for @{term "(s\<^sub>1 @ s\<^sub>2, SEQ r\<^sub>1 r\<^sub>2)"}. The main point is that this side-condition ensures the longest match rule is satisfied. diff -r fee5641c5994 -r fa0d8aa5d7de thys/paper.pdf Binary file thys/paper.pdf has changed