# HG changeset patch # User Christian Urban # Date 1457433635 0 # Node ID 2f043f8be9a9c64ebc20fa64d45f7fbb0ccf9723 # Parent 23e68b81a908e8fee55341f6c66f03facbab78f8 updated diff -r 23e68b81a908 -r 2f043f8be9a9 thys/Paper/Paper.thy --- a/thys/Paper/Paper.thy Tue Mar 08 10:21:18 2016 +0000 +++ b/thys/Paper/Paper.thy Tue Mar 08 10:40:35 2016 +0000 @@ -141,7 +141,8 @@ These building blocks are often specified by some regular expressions, say @{text "r\<^bsub>key\<^esub>"} and @{text "r\<^bsub>id\<^esub>"} for recognising keywords and identifiers, respectively. There are two underlying (informal) rules behind -tokenising a string in a POSIX fashion: +tokenising a string in a POSIX fashion according to a collection of regular +expressions: \begin{itemize} \item[$\bullet$] \underline{The Longest Match Rule (or ``maximal munch rule''):} @@ -166,7 +167,7 @@ "r\<^bsub>id\<^esub>"} matches also.\bigskip \noindent {\bf Contributions:} We have implemented in Isabelle/HOL the -derivative-based regular expression matching algorithm as described by +derivative-based regular expression matching algorithm of Sulzmann and Lu \cite{Sulzmann2014}. We have proved the correctness of this algorithm according to our specification of what a POSIX value is. Sulzmann and Lu sketch in \cite{Sulzmann2014} an informal correctness proof: but to @@ -1135,10 +1136,7 @@ \cite{Pierce2015}. The disadvantage of such bespoke lexers is that they do not generalise easily to more advanced features. Our formalisation is available from - - \begin{center} - \url{http://www.inf.kcl.ac.uk/staff/urbanc/lex} - \end{center} + \url{http://www.inf.kcl.ac.uk/staff/urbanc/lex}. %\noindent %{\bf Acknowledgements:} diff -r 23e68b81a908 -r 2f043f8be9a9 thys/paper.pdf Binary file thys/paper.pdf has changed