diff -r e2828c4a1e23 -r d36be1e356c0 thys/Journal/document/root.tex --- a/thys/Journal/document/root.tex Thu Jul 06 16:05:33 2017 +0100 +++ b/thys/Journal/document/root.tex Tue Jul 18 18:39:20 2017 +0100 @@ -33,8 +33,17 @@ \newtheorem{conject}{Conjecture} \begin{document} +\renewcommand{\thefootnote}{$\star$} \footnotetext[1]{This paper is a + revised and expanded version of \cite{AusafDyckhoffUrban2016}. + Compared with that paper we give a second definition for POSIX + values and prove that it is equivalent to the original one. This + definition is based on an ordering of values and very similar to the + definition given by Sulzmann and Lu~\cite{Sulzmann2014}. We also + extend our results to additional constructors of regular + expressions.} \renewcommand{\thefootnote}{\arabic{footnote}} -\title{POSIX {L}exing with {D}erivatives of {R}egular {E}xpressions (Proof Pearl)} + +\title{POSIX {L}exing with {D}erivatives of {R}egular {E}xpressions} \author{Fahad Ausaf\inst{1} \and Roy Dyckhoff\inst{2} \and Christian Urban\inst{3}} \institute{King's College London\\ \email{fahad.ausaf@icloud.com} @@ -50,20 +59,19 @@ expressions. They can be used for a very simple regular expression matching algorithm. Sulzmann and Lu cleverly extended this algorithm in order to deal with POSIX matching, which is the underlying -disambiguation strategy for regular expressions needed in lexers. -Sulzmann and Lu have made available on-line what they call a -``rigorous proof'' of the correctness of their algorithm w.r.t.\ their -specification; regrettably, it appears to us to have unfillable gaps. -In the first part of this paper we give our inductive definition of -what a POSIX value is and show $(i)$ that such a value is unique (for -given regular expression and string being matched) and $(ii)$ that -Sulzmann and Lu's algorithm always generates such a value (provided -that the regular expression matches the string). We also prove the -correctness of an optimised version of the POSIX matching -algorithm. Our definitions and proof are much simpler than those by -Sulzmann and Lu and can be easily formalised in Isabelle/HOL. In the -second part we analyse the correctness argument by Sulzmann and Lu and -explain why the gaps in this argument cannot be filled easily.\smallskip +disambiguation strategy for regular expressions needed in lexers. In +the first part of this paper we give our inductive definition of what +a POSIX value is and show $(i)$ that such a value is unique (for given +regular expression and string being matched) and $(ii)$ that Sulzmann +and Lu's algorithm always generates such a value (provided that the +regular expression matches the string). We also prove the correctness +of an optimised version of the POSIX matching algorithm. In the +second part we show that $(iii)$ our inductive definition of a POSIX +value is equivalent to an alternative definition by Okui and Suzuki +which identifies a POSIX value as least element according to an +ordering of values. The advantage of the definition based on the +ordering is that it implements more directly the POSIX +longest-leftmost matching semantics.\smallskip {\bf Keywords:} POSIX matching, Derivatives of Regular Expressions, Isabelle/HOL