lexing: comparison thys/Paper/Paper.thy


-:23e68b81a908
+:2f043f8be9a9
 sequence of tokens, POSIX is the more natural disambiguation strategy for
 what programmers consider basic syntactic building blocks in their programs.
 These building blocks are often specified by some regular expressions, say
 @{text "r\<^bsub>key\<^esub>"} and @{text "r\<^bsub>id\<^esub>"} for recognising keywords and
 identifiers, respectively. There are two underlying (informal) rules behind
-tokenising a string in a POSIX fashion:
+tokenising a string in a POSIX fashion according to a collection of regular
+expressions:
 \begin{itemize}
 \item[$\bullet$] \underline{The Longest Match Rule (or ``maximal munch rule''):}
 The longest initial substring matched by any regular expression is taken as
 token, not a keyword followed by an identifier. For @{text "if"} we obtain by
 the priority rule a keyword token, not an identifier token---even if @{text
 "r\<^bsub>id\<^esub>"} also matches.\bigskip
 \noindent {\bf Contributions:} We have implemented in Isabelle/HOL the
-derivative-based regular expression matching algorithm as described by
+derivative-based regular expression matching algorithm of
 Sulzmann and Lu \cite{Sulzmann2014}. We have proved the correctness of this
 algorithm according to our specification of what a POSIX value is. Sulzmann
 and Lu sketch in \cite{Sulzmann2014} an informal correctness proof, but to
 us it contains unfillable gaps.\footnote{An extended version of
 \cite{Sulzmann2014} is available at the website of its first author; this
 implemented easily in functional languages. A bespoke lexer for the
 Imp-Language is formalised in Coq as part of the Software Foundations book
 \cite{Pierce2015}. The disadvantage of such bespoke lexers is that they
 do not generalise easily to more advanced features.
 Our formalisation is available from
+\url{http://www.inf.kcl.ac.uk/staff/urbanc/lex}.
-\begin{center}
-\url{http://www.inf.kcl.ac.uk/staff/urbanc/lex}
-\end{center}
 %\noindent
 %{\bf Acknowledgements:}
 %We are grateful for the comments we received from anonymous
 %referees.

changeset 134	2f043f8be9a9
parent 133	23e68b81a908
child 135	fee5641c5994
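
The two informal rules quoted in the first hunk (longest match, then priority) can be illustrated with a small tokeniser. This is only a sketch under stated assumptions: it uses Python's backtracking `re` module rather than the derivative-based algorithm of Sulzmann and Lu that the paper formalises, and the token classes, the rule table `TOKEN_RULES`, and the function names are hypothetical choices made for the example. It also assumes that each individual regular expression returns its own longest match, which holds for the simple expressions used here but not for Python regular expressions in general.

```python
import re

# Hypothetical token classes, listed in priority order: when two rules
# match a substring of the same (longest) length, the earlier rule wins.
TOKEN_RULES = [
    ("KEYWORD", re.compile(r"if|then|else")),
    ("IDENT", re.compile(r"[a-z][a-z0-9]*")),
    ("SPACE", re.compile(r"[ ]+")),
]


def next_token(s, pos):
    """Return (class name, length) of the next token at pos, or None."""
    best = None
    for name, rx in TOKEN_RULES:
        m = rx.match(s, pos)
        if m and m.end() > pos:
            length = m.end() - pos
            # Longest match rule: a strictly longer match replaces the
            # current candidate. Priority rule: on a tie, the earlier
            # rule is kept because the comparison is strict.
            if best is None or length > best[1]:
                best = (name, length)
    return best


def tokenise(s):
    """Split s into (class name, lexeme) pairs, or raise ValueError."""
    pos, tokens = 0, []
    while pos < len(s):
        tok = next_token(s, pos)
        if tok is None:
            raise ValueError(f"no token matches at position {pos}")
        name, length = tok
        tokens.append((name, s[pos:pos + length]))
        pos += length
    return tokens
```

With these rules, `tokenise("iffoo")` yields a single identifier token (longest match), whereas `tokenise("if")` yields a keyword token (priority), matching the behaviour described in the diffed text.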