updated
authorChristian Urban <christian dot urban at kcl dot ac dot uk>
Wed, 16 Mar 2016 07:15:12 +0000
changeset 152 e3eb82ea2244
parent 151 5a1196466a9c
child 153 69ec99b14ac9
updated
thys/Paper/Paper.thy
thys/Paper/document/root.tex
thys/paper.pdf
--- a/thys/Paper/Paper.thy	Tue Mar 15 01:10:38 2016 +0000
+++ b/thys/Paper/Paper.thy	Wed Mar 16 07:15:12 2016 +0000
@@ -83,8 +83,8 @@
 and easily definable and reasoned about in theorem provers---the definitions
 just consist of inductive datatypes and simple recursive functions. A
 completely formalised correctness proof of this matcher in for example HOL4
-has been mentioned in~\cite{Owens2008}. Another one in Isabelle/HOL is part
-of the work in \cite{Krauss2011}.
+has been mentioned by Owens and Slind~\cite{Owens2008}. Another one in Isabelle/HOL is part
+of the work by Krauss and Nipkow \cite{Krauss2011}.
 
 One limitation of Brzozowski's matcher is that it only generates a YES/NO
 answer for whether a string is being matched by a regular expression.
@@ -106,11 +106,11 @@
 observations that, without evidence that it is transitive, it cannot be
 called an ``order relation'', and that the relation is called a ``total
 order'' despite being evidently not total\footnote{The relation @{text
-"\<ge>\<^bsub>r\<^esub>"} defined in \cite{Sulzmann2014} is a relation on the
+"\<ge>\<^bsub>r\<^esub>"} defined by Sulzmann and Lu \cite{Sulzmann2014} is a relation on the
 values for the regular expression @{term r}; but it only holds between
-@{term v} and @{term "v'"} in cases where @{term v} and @{term "v'"} have
+@{term "v\<^sub>1"} and @{term "v\<^sub>2"} in cases where @{term "v\<^sub>1"} and @{term "v\<^sub>2"} have
 the same flattening (underlying string). So a counterexample to totality is
-given by taking two values @{term v} and @{term "v'"} for @{term r} that
+given by taking two values @{term "v\<^sub>1"} and @{term "v\<^sub>2"} for @{term r} that
 have different flattenings (see Section~\ref{posixsec}). A different
 relation @{text "\<ge>\<^bsub>r,s\<^esub>"} on the set of values for @{term r}
 with flattening @{term s} is definable by the same approach, and is indeed
@@ -123,7 +123,7 @@
 value is unique. Proofs are both done by hand and checked in Isabelle/HOL.
 The experience of doing our proofs has been that this mechanical checking
 was absolutely essential: this subject area has hidden snares. This was also
-noted by Kuklewitz \cite{Kuklewicz} who found that nearly all POSIX matching
+noted by Kuklewicz \cite{Kuklewicz} who found that nearly all POSIX matching
 implementations are ``buggy'' \cite[Page 203]{Sulzmann2014}.
 
 If a regular expression matches a string, then in general there is more than
@@ -150,12 +150,12 @@
 expressions:
 
 \begin{itemize} 
-\item[$\bullet$] \underline{The Longest Match Rule (or ``maximal munch rule''):}
+\item[$\bullet$] \underline{The Longest Match Rule (or ``maximal munch rule''):}\smallskip
 
 The longest initial substring matched by any regular expression is taken as
 next token.\smallskip
 
-\item[$\bullet$] \underline{Priority Rule:}
+\item[$\bullet$] \underline{Priority Rule:}\smallskip
 
 For a particular longest initial substring, the first regular expression
 that can match determines the token.
@@ -333,7 +333,7 @@
 
 text {* 
 
-  The clever idea in \cite{Sulzmann2014} is to introduce values for encoding
+  The clever idea by Sulzmann and Lu \cite{Sulzmann2014} is to introduce values for encoding
   \emph{how} a regular expression matches a string and then define a
   function on values that mirrors (but inverts) the construction of the
   derivative on regular expressions. \emph{Values} are defined as the
@@ -349,11 +349,12 @@
   @{term "Stars vs"} 
   \end{center}  
 
-  \noindent where we use @{term vs} to stand for a list of values. (This is
-  similar to the approach taken by Frisch and Cardelli for GREEDY matching
-  \cite{Frisch2004}, and Sulzmann and Lu \cite{Sulzmann2014} for POSIX
-  matching). The string underlying a value can be calculated by the @{const
-  flat} function, written @{term "flat DUMMY"} and defined as:
+  \noindent where we use @{term vs} to stand for a list of
+  values. (This is similar to the approach taken by Frisch and
+  Cardelli for GREEDY matching \cite{Frisch2004}, and Sulzmann and Lu
+  for POSIX matching \cite{Sulzmann2014}). The string underlying a
+  value can be calculated by the @{const flat} function, written
+  @{term "flat DUMMY"} and defined as:
 
   \begin{center}
   \begin{tabular}{lcl}
@@ -586,7 +587,7 @@
   In contrast, we shall introduce a simple inductive definition that
   specifies directly what a \emph{POSIX value} is, incorporating the
   POSIX-specific choices into the side-conditions of our rules. Our
-  definition is inspired by the matching relation given in
+  definition is inspired by the matching relation given by Vansummeren
   \cite{Vansummeren2006}. The relation we define is ternary and written as
   \mbox{@{term "s \<in> r \<rightarrow> v"}}, relating strings, regular expressions and
   values.
@@ -1096,12 +1097,13 @@
 
 text {*
 
-  We have implemented the POSIX value calculation algorithm introduced in
+  We have implemented the POSIX value calculation algorithm introduced by
+  Sulzmann and Lu
   \cite{Sulzmann2014}. Our implementation is nearly identical to the
   original and all modifications we introduced are harmless (like our char-clause for
   @{text inj}). We have proved this algorithm to be correct, but correct
   according to our own specification of what POSIX values are. Our
-  specification (inspired from work in \cite{Vansummeren2006}) appears to be
+  specification (inspired from work by Vansummeren \cite{Vansummeren2006}) appears to be
   much simpler than in \cite{Sulzmann2014} and our proofs are nearly always
   straightforward. We have attempted to formalise the original proof
   by Sulzmann and Lu \cite{Sulzmann2014}, but we believe it contains
@@ -1129,15 +1131,15 @@
   computational. The algorithm by Sulzmann and Lu, in contrast, can be
   implemented with easy in any functional language. A bespoke lexer for the
   Imp-language is formalised in Coq as part of the Software Foundations book
-  \cite{Pierce2015}. The disadvantage of such bespoke lexers is that they
+  by Pierce et al\cite{Pierce2015}. The disadvantage of such bespoke lexers is that they
   do not generalise easily to more advanced features.
   Our formalisation is available from
-  \url{http://www.inf.kcl.ac.uk/staff/urbanc/lex}.
+  \url{http://www.inf.kcl.ac.uk/staff/urbanc/lex}.\medskip
 
-  %\noindent
-  %{\bf Acknowledgements:}
-  %We are grateful for the comments we received from anonymous
-  %referees.
+  \noindent
+  {\bf Acknowledgements:}
+  We are very grateful to Martin Sulzmann for his comments on our work and 
+  also patiently explaining to us the details in \cite{Sulzmann2014}.
   \small
   \bibliographystyle{plain}
   \bibliography{root}
--- a/thys/Paper/document/root.tex	Tue Mar 15 01:10:38 2016 +0000
+++ b/thys/Paper/document/root.tex	Wed Mar 16 07:15:12 2016 +0000
@@ -26,7 +26,7 @@
 \renewcommand{\isasymequiv}{$\dn$}
 \renewcommand{\isasymemptyset}{$\varnothing$}
 \renewcommand{\isacharunderscore}{\mbox{$\_\!\_$}}
-\renewcommand{\isacharprime}{\mbox{$\mbox{}\!\!\mbox{$^\prime$}$}}
+%%\renewcommand{\isacharprime}{\makebox[0mm]{$\mbox{}\mbox{$\,^\prime$}$}}
 
 \definecolor{mygrey}{rgb}{.80,.80,.80}
 \def\Brz{Brzozowski}
Binary file thys/paper.pdf has changed