31 \def\der{\backslash} |
31 \def\der{\backslash} |
32 \newtheorem{falsehood}{Falsehood} |
32 \newtheorem{falsehood}{Falsehood} |
33 \newtheorem{conject}{Conjecture} |
33 \newtheorem{conject}{Conjecture} |
34 |
34 |
35 \begin{document} |
35 \begin{document} |
|
36 \renewcommand{\thefootnote}{$\star$} \footnotetext[1]{This paper is a |
|
37 revised and expanded version of \cite{AusafDyckhoffUrban2016}. |
|
38 Compared with that paper we give a second definition for POSIX |
|
39 values and prove that it is equivalent to the original one. This |
|
40 definition is based on an ordering of values and very similar to the |
|
41 definition given by Sulzmann and Lu~\cite{Sulzmann2014}. We also |
|
42 extend our results to additional constructors of regular |
|
43 expressions.} \renewcommand{\thefootnote}{\arabic{footnote}} |
36 |
44 |
37 \title{POSIX {L}exing with {D}erivatives of {R}egular {E}xpressions (Proof Pearl)} |
45 |
|
46 \title{POSIX {L}exing with {D}erivatives of {R}egular {E}xpressions} |
38 \author{Fahad Ausaf\inst{1} \and Roy Dyckhoff\inst{2} \and Christian Urban\inst{3}} |
47 \author{Fahad Ausaf\inst{1} \and Roy Dyckhoff\inst{2} \and Christian Urban\inst{3}} |
39 \institute{King's College London\\ |
48 \institute{King's College London\\ |
40 \email{fahad.ausaf@icloud.com} |
49 \email{fahad.ausaf@icloud.com} |
41 \and University of St Andrews\\ |
50 \and University of St Andrews\\ |
42 \email{roy.dyckhoff@st-andrews.ac.uk} |
51 \email{roy.dyckhoff@st-andrews.ac.uk} |
48 |
57 |
49 Brzozowski introduced the notion of derivatives for regular |
58 Brzozowski introduced the notion of derivatives for regular |
50 expressions. They can be used for a very simple regular expression |
59 expressions. They can be used for a very simple regular expression |
51 matching algorithm. Sulzmann and Lu cleverly extended this algorithm |
60 matching algorithm. Sulzmann and Lu cleverly extended this algorithm |
52 in order to deal with POSIX matching, which is the underlying |
61 in order to deal with POSIX matching, which is the underlying |
53 disambiguation strategy for regular expressions needed in lexers. |
62 disambiguation strategy for regular expressions needed in lexers. In |
54 Sulzmann and Lu have made available on-line what they call a |
63 the first part of this paper we give our inductive definition of what |
55 ``rigorous proof'' of the correctness of their algorithm w.r.t.\ their |
64 a POSIX value is and show $(i)$ that such a value is unique (for given |
56 specification; regrettably, it appears to us to have unfillable gaps. |
65 regular expression and string being matched) and $(ii)$ that Sulzmann |
57 In the first part of this paper we give our inductive definition of |
66 and Lu's algorithm always generates such a value (provided that the |
58 what a POSIX value is and show $(i)$ that such a value is unique (for |
67 regular expression matches the string). We also prove the correctness |
59 given regular expression and string being matched) and $(ii)$ that |
68 of an optimised version of the POSIX matching algorithm. In the |
60 Sulzmann and Lu's algorithm always generates such a value (provided |
69 second part we show that $(iii)$ our inductive definition of a POSIX |
61 that the regular expression matches the string). We also prove the |
70 value is equivalent to an alternative definition by Okui and Suzuki |
62 correctness of an optimised version of the POSIX matching |
71 which identifies a POSIX value as least element according to an |
63 algorithm. Our definitions and proof are much simpler than those by |
72 ordering of values. The advantage of the definition based on the |
64 Sulzmann and Lu and can be easily formalised in Isabelle/HOL. In the |
73 ordering is that it implements more directly the POSIX |
65 second part we analyse the correctness argument by Sulzmann and Lu and |
74 longest-leftmost matching semantics.\smallskip |
66 explain why the gaps in this argument cannot be filled easily.\smallskip |
|
67 |
75 |
68 {\bf Keywords:} POSIX matching, Derivatives of Regular Expressions, |
76 {\bf Keywords:} POSIX matching, Derivatives of Regular Expressions, |
69 Isabelle/HOL |
77 Isabelle/HOL |
70 \end{abstract} |
78 \end{abstract} |
71 |
79 |