diff -r fc22ca36325c -r f1d800062d4f thys/Paper/Paper.thy --- a/thys/Paper/Paper.thy Mon May 09 15:01:31 2016 +0100 +++ b/thys/Paper/Paper.thy Wed May 11 12:20:16 2016 +0100 @@ -102,50 +102,6 @@ of the work by Krauss and Nipkow \cite{Krauss2011}. And another one in Coq is given by Coquand and Siles \cite{Coquand2012}. -One limitation of Brzozowski's matcher is that it only generates a -YES/NO answer for whether a string is being matched by a regular -expression. Sulzmann and Lu \cite{Sulzmann2014} extended this matcher -to allow generation not just of a YES/NO answer but of an actual -matching, called a [lexical] {\em value}. They give a simple algorithm -to calculate a value that appears to be the value associated with -POSIX matching. The challenge then is to specify that value, in an -algorithm-independent fashion, and to show that Sulzmann and Lu's -derivative-based algorithm does indeed calculate a value that is -correct according to the specification. - -The answer given by Sulzmann and Lu \cite{Sulzmann2014} is to define a -relation (called an ``order relation'') on the set of values of @{term -r}, and to show that (once a string to be matched is chosen) there is -a maximum element and that it is computed by their derivative-based -algorithm. This proof idea is inspired by work of Frisch and Cardelli -\cite{Frisch2004} on a GREEDY regular expression matching -algorithm. However, we were not able to establish transitivity and -totality for the ``order relation'' by Sulzmann and Lu. In Section -\ref{argu} we identify some inherent problems with their approach (of -which some of the proofs are not published in \cite{Sulzmann2014}); -perhaps more importantly, we give a simple inductive (and -algorithm-independent) definition of what we call being a {\em POSIX -value} for a regular expression @{term r} and a string @{term s}; we -show that the algorithm computes such a value and that such a value is -unique. Proofs are both done by hand and checked in Isabelle/HOL. The -experience of doing our proofs has been that this mechanical checking -was absolutely essential: this subject area has hidden snares. This -was also noted by Kuklewicz \cite{Kuklewicz} who found that nearly all -POSIX matching implementations are ``buggy'' \cite[Page -203]{Sulzmann2014}. - -%\footnote{The relation @{text "\\<^bsub>r\<^esub>"} defined by Sulzmann and Lu \cite{Sulzmann2014} -%is a relation on the -%values for the regular expression @{term r}; but it only holds between -%@{term "v\<^sub>1"} and @{term "v\<^sub>2"} in cases where @{term "v\<^sub>1"} and @{term "v\<^sub>2"} have -%the same flattening (underlying string). So a counterexample to totality is -%given by taking two values @{term "v\<^sub>1"} and @{term "v\<^sub>2"} for @{term r} that -%have different flattenings (see Section~\ref{posixsec}). A different -%relation @{text "\\<^bsub>r,s\<^esub>"} on the set of values for @{term r} -%with flattening @{term s} is definable by the same approach, and is indeed -%total; but that is not what Proposition 1 of \cite{Sulzmann2014} does.} - - If a regular expression matches a string, then in general there is more than one way of how the string is matched. 
There are two commonly used disambiguation strategies to generate a unique answer: one is called GREEDY @@ -170,26 +126,76 @@ expressions: \begin{itemize} -\item[$\bullet$] \underline{The Longest Match Rule (or ``maximal munch rule''):}\smallskip - +\item[$\bullet$] \emph{The Longest Match Rule} (or \emph{``maximal munch rule''}): The longest initial substring matched by any regular expression is taken as next token.\smallskip -\item[$\bullet$] \underline{Priority Rule:}\smallskip - +\item[$\bullet$] \emph{Priority Rule:} For a particular longest initial substring, the first regular expression that can match determines the token. \end{itemize} - -\noindent Consider for example @{text "r\<^bsub>key\<^esub>"} recognising keywords such as -@{text "if"}, @{text "then"} and so on; and @{text "r\<^bsub>id\<^esub>"} recognising -identifiers (say, a single character followed by characters or numbers). -Then we can form the regular expression @{text "(r\<^bsub>key\<^esub> + r\<^bsub>id\<^esub>)\<^sup>\"} and use -POSIX matching to tokenise strings, say @{text "iffoo"} and @{text "if"}. -For @{text "iffoo"} we obtain by the longest match rule a single identifier -token, not a keyword followed by an identifier. For @{text "if"} we obtain by -the priority rule a keyword token, not an identifier token---even if @{text -"r\<^bsub>id\<^esub>"} matches also.\bigskip + +\noindent Consider for example @{text "r\<^bsub>key\<^esub>"} recognising keywords +such as @{text "if"}, @{text "then"} and so on; and @{text "r\<^bsub>id\<^esub>"} +recognising identifiers (say, a single character followed by +characters or numbers). Then we can form the regular expression +@{text "(r\<^bsub>key\<^esub> + r\<^bsub>id\<^esub>)\<^sup>\"} and use POSIX matching to tokenise strings, +say @{text "iffoo"} and @{text "if"}. For @{text "iffoo"} we obtain +by the Longest Match Rule a single identifier token, not a keyword +followed by an identifier. For @{text "if"} we obtain by the Priority +Rule a keyword token, not an identifier token---even if @{text "r\<^bsub>id\<^esub>"} +also matches. + +One limitation of Brzozowski's matcher is that it only generates a +YES/NO answer for whether a string is being matched by a regular +expression. Sulzmann and Lu~\cite{Sulzmann2014} extended this matcher +to allow generation not just of a YES/NO answer but of an actual +matching, called a [lexical] {\em value}. They give a simple algorithm +to calculate a value that appears to be the value associated with +POSIX matching. The challenge then is to specify that value, in an +algorithm-independent fashion, and to show that Sulzmann and Lu's +derivative-based algorithm does indeed calculate a value that is +correct according to the specification. + +The answer given by Sulzmann and Lu \cite{Sulzmann2014} is to define a +relation (called an ``order relation'') on the set of values of @{term +r}, and to show that (once a string to be matched is chosen) there is +a maximum element and that it is computed by their derivative-based +algorithm. This proof idea is inspired by the work of Frisch and Cardelli +\cite{Frisch2004} on a GREEDY regular expression matching +algorithm. However, we were not able to establish transitivity and +totality for the ``order relation'' of Sulzmann and Lu.
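
\noindent To keep the comparison concrete, the following small Haskell sketch shows one way the matcher referred to above, consisting of repeated derivatives followed by a nullability test, can be written. The datatype and function names (\texttt{Rexp}, \texttt{nullable}, \texttt{deriv}, \texttt{matches}) are ours and the code is not part of the Isabelle formalisation; it is only meant to illustrate that the answer produced is a plain Boolean, not a value.

\begin{verbatim}
data Rexp = Zero | One | Chr Char
          | Alt Rexp Rexp | Seq Rexp Rexp | Star Rexp

-- nullable r holds iff r can match the empty string
nullable :: Rexp -> Bool
nullable Zero        = False
nullable One         = True
nullable (Chr _)     = False
nullable (Alt r1 r2) = nullable r1 || nullable r2
nullable (Seq r1 r2) = nullable r1 && nullable r2
nullable (Star _)    = True

-- deriv c r is the derivative of r with respect to the character c
deriv :: Char -> Rexp -> Rexp
deriv _ Zero        = Zero
deriv _ One         = Zero
deriv c (Chr d)     = if c == d then One else Zero
deriv c (Alt r1 r2) = Alt (deriv c r1) (deriv c r2)
deriv c (Seq r1 r2)
  | nullable r1     = Alt (Seq (deriv c r1) r2) (deriv c r2)
  | otherwise       = Seq (deriv c r1) r2
deriv c (Star r)    = Seq (deriv c r) (Star r)

-- matches: derive by every character of s, then test nullability;
-- the result is only a YES/NO answer
matches :: Rexp -> String -> Bool
matches r s = nullable (foldl (flip deriv) r s)
\end{verbatim}

\noindent Sulzmann and Lu's extension keeps this structure of repeated derivatives, but in addition constructs a value that records \emph{how} the regular expression matched the string.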
In Section +\ref{argu} we identify some inherent problems with their approach (some of +whose proofs are not published in \cite{Sulzmann2014}); +perhaps more importantly, we give a simple inductive (and +algorithm-independent) definition of what we call being a {\em POSIX +value} for a regular expression @{term r} and a string @{term s}; we +show that the algorithm computes such a value and that such a value is +unique. Our proofs are both done by hand and checked in Isabelle/HOL. The +experience of doing our proofs has been that this mechanical checking +was absolutely essential: this subject area has hidden snares. This +was also noted by Kuklewicz \cite{Kuklewicz} who found that nearly all +POSIX matching implementations are ``buggy'' \cite[Page +203]{Sulzmann2014} and by Grathwohl et al.~\cite[Page 36]{CrashCourse2014} +who wrote: + +\begin{quote} +\it{}``The POSIX strategy is more complicated than the greedy because of +the dependence on information about the length of matched strings in the +various subexpressions.'' +\end{quote} + +%\footnote{The relation @{text "\\<^bsub>r\<^esub>"} defined by Sulzmann and Lu \cite{Sulzmann2014} +%is a relation on the +%values for the regular expression @{term r}; but it only holds between +%@{term "v\<^sub>1"} and @{term "v\<^sub>2"} in cases where @{term "v\<^sub>1"} and @{term "v\<^sub>2"} have +%the same flattening (underlying string). So a counterexample to totality is +%given by taking two values @{term "v\<^sub>1"} and @{term "v\<^sub>2"} for @{term r} that +%have different flattenings (see Section~\ref{posixsec}). A different +%relation @{text "\\<^bsub>r,s\<^esub>"} on the set of values for @{term r} +%with flattening @{term s} is definable by the same approach, and is indeed +%total; but that is not what Proposition 1 of \cite{Sulzmann2014} does.} + \noindent {\bf Contributions:} We have implemented in Isabelle/HOL the derivative-based regular expression matching algorithm of @@ -239,10 +245,13 @@ recursive function @{term L} with the clauses: \begin{center} - \begin{tabular}{l@ {\hspace{5mm}}rcl} + \begin{tabular}{l@ {\hspace{3mm}}rcl} (1) & @{thm (lhs) L.simps(1)} & $\dn$ & @{thm (rhs) L.simps(1)}\\ (2) & @{thm (lhs) L.simps(2)} & $\dn$ & @{thm (rhs) L.simps(2)}\\ (3) & @{thm (lhs) L.simps(3)} & $\dn$ & @{thm (rhs) L.simps(3)}\\ + \end{tabular} + \hspace{14mm} + \begin{tabular}{l@ {\hspace{3mm}}rcl} (4) & @{thm (lhs) L.simps(4)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) L.simps(4)[of "r\<^sub>1" "r\<^sub>2"]}\\ (5) & @{thm (lhs) L.simps(5)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) L.simps(5)[of "r\<^sub>1" "r\<^sub>2"]}\\ (6) & @{thm (lhs) L.simps(6)} & $\dn$ & @{thm (rhs) L.simps(6)}\\ @@ -355,11 +364,11 @@ text {* - The clever idea by Sulzmann and Lu \cite{Sulzmann2014} is to introduce values for encoding - \emph{how} a regular expression matches a string and then define a - function on values that mirrors (but inverts) the construction of the - derivative on regular expressions. \emph{Values} are defined as the - inductive datatype + The clever idea by Sulzmann and Lu \cite{Sulzmann2014} is to define + values for encoding \emph{how} a regular expression matches a string + and then define a function on values that mirrors (but inverts) the + construction of the derivative on regular expressions.
\emph{Values} + are defined as the inductive datatype \begin{center} @{text "v :="} @@ -379,11 +388,13 @@ @{term "flat DUMMY"} and defined as: \begin{center} - \begin{tabular}{lcl} + \begin{tabular}[t]{lcl} @{thm (lhs) flat.simps(1)} & $\dn$ & @{thm (rhs) flat.simps(1)}\\ @{thm (lhs) flat.simps(2)} & $\dn$ & @{thm (rhs) flat.simps(2)}\\ @{thm (lhs) flat.simps(3)} & $\dn$ & @{thm (rhs) flat.simps(3)}\\ - @{thm (lhs) flat.simps(4)} & $\dn$ & @{thm (rhs) flat.simps(4)}\\ + @{thm (lhs) flat.simps(4)} & $\dn$ & @{thm (rhs) flat.simps(4)} + \end{tabular}\hspace{14mm} + \begin{tabular}[t]{lcl} @{thm (lhs) flat.simps(5)[of "v\<^sub>1" "v\<^sub>2"]} & $\dn$ & @{thm (rhs) flat.simps(5)[of "v\<^sub>1" "v\<^sub>2"]}\\ @{thm (lhs) flat.simps(6)} & $\dn$ & @{thm (rhs) flat.simps(6)}\\ @{thm (lhs) flat.simps(7)} & $\dn$ & @{thm (rhs) flat.simps(7)}\\ @@ -395,11 +406,12 @@ \begin{center} \begin{tabular}{c} + \\[-8mm] @{thm[mode=Axiom] Prf.intros(4)} \qquad - @{thm[mode=Axiom] Prf.intros(5)[of "c"]}\smallskip\\ + @{thm[mode=Axiom] Prf.intros(5)[of "c"]}\\[3mm] @{thm[mode=Rule] Prf.intros(2)[of "v\<^sub>1" "r\<^sub>1" "r\<^sub>2"]} \qquad - @{thm[mode=Rule] Prf.intros(3)[of "v\<^sub>2" "r\<^sub>1" "r\<^sub>2"]}\smallskip\\ - @{thm[mode=Rule] Prf.intros(1)[of "v\<^sub>1" "r\<^sub>1" "v\<^sub>2" "r\<^sub>2"]}\smallskip\\ + @{thm[mode=Rule] Prf.intros(3)[of "v\<^sub>2" "r\<^sub>1" "r\<^sub>2"]}\\[3mm] + @{thm[mode=Rule] Prf.intros(1)[of "v\<^sub>1" "r\<^sub>1" "v\<^sub>2" "r\<^sub>2"]}\\[3mm] @{thm[mode=Axiom] Prf.intros(6)[of "r"]} \qquad @{thm[mode=Rule] Prf.intros(7)[of "v" "r" "vs"]} \end{tabular}
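
\noindent As an aside, the definitions above can be transcribed almost directly into a functional programming language. The following Haskell sketch is only an illustration: the names (\texttt{Rexp}, \texttt{Val}, \texttt{flat}, \texttt{inhab}) are ours and not part of the formalisation, and \texttt{inhab} is one possible reading of the inhabitation relation as a recursive check.

\begin{verbatim}
-- regular expressions as in the earlier sketch
data Rexp = Zero | One | Chr Char
          | Alt Rexp Rexp | Seq Rexp Rexp | Star Rexp

-- values recording how a regular expression matched a string
data Val = Void | Chrv Char | Seqv Val Val
         | Leftv Val | Rightv Val | Stars [Val]

-- flat v extracts the string underlying the value v
flat :: Val -> String
flat Void             = []
flat (Chrv c)         = [c]
flat (Leftv v)        = flat v
flat (Rightv v)       = flat v
flat (Seqv v1 v2)     = flat v1 ++ flat v2
flat (Stars [])       = []
flat (Stars (v : vs)) = flat v ++ flat (Stars vs)

-- inhab v r reads the inhabitation relation |- v : r as a check
inhab :: Val -> Rexp -> Bool
inhab Void         One         = True
inhab (Chrv c)     (Chr d)     = c == d
inhab (Leftv v)    (Alt r1 _)  = inhab v r1
inhab (Rightv v)   (Alt _ r2)  = inhab v r2
inhab (Seqv v1 v2) (Seq r1 r2) = inhab v1 r1 && inhab v2 r2
inhab (Stars vs)   (Star r)    = all (\v -> inhab v r) vs
inhab _            _           = False
\end{verbatim}

\noindent For example, \texttt{inhab (Seqv (Chrv 'a') Void) (Seq (Chr 'a') One)} evaluates to \texttt{True}, and \texttt{flat} applied to this value yields the string \texttt{"a"}.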