97 value, in an algorithm-independent fashion, and to show that Sulzmann and |
97 value, in an algorithm-independent fashion, and to show that Sulzmann and |
98 Lu's derivative-based algorithm does indeed calculate a value that is |
98 Lu's derivative-based algorithm does indeed calculate a value that is |
99 correct according to the specification. |
99 correct according to the specification. |
100 |
100 |
101 The answer given by Sulzmann and Lu \cite{Sulzmann2014} is to define a |
101 The answer given by Sulzmann and Lu \cite{Sulzmann2014} is to define a |
102 relation (called an ``order relation'') on the set of values of @{term r}, |
102 relation (called an ``order relation'') on the set of values of @{term |
103 and to show that (once a string to be matched is chosen) there is a maximum |
103 r}, and to show that (once a string to be matched is chosen) there is |
104 element and that it is computed by their derivative-based algorithm. This |
104 a maximum element and that it is computed by their derivative-based |
105 proof idea is inspired by work of Frisch and Cardelli \cite{Frisch2004} on a |
105 algorithm. This proof idea is inspired by work of Frisch and Cardelli |
106 GREEDY regular expression matching algorithm. However, we were not able to |
106 \cite{Frisch2004} on a GREEDY regular expression matching |
107 establish transitivity and totality for the ``order relation'' by |
107 algorithm. However, we were not able to establish transitivity and |
108 Sulzmann and Lu. In Section \ref{arg} we |
108 totality for the ``order relation'' by Sulzmann and Lu. In Section |
109 identify problems with their approach (of which some of the proofs are not |
109 \ref{argu} we identify some inherent problems with their approach (of |
110 published in \cite{Sulzmann2014}); perhaps more importantly, we give a |
110 which some of the proofs are not published in \cite{Sulzmann2014}); |
111 simple inductive (and algorithm-independent) definition of what we call |
111 perhaps more importantly, we give a simple inductive (and |
112 being a {\em POSIX value} for a regular expression @{term r} and a string |
112 algorithm-independent) definition of what we call being a {\em POSIX |
113 @{term s}; we show that the algorithm computes such a value and that such a |
113 value} for a regular expression @{term r} and a string @{term s}; we |
114 value is unique. Proofs are both done by hand and checked in Isabelle/HOL. |
114 show that the algorithm computes such a value and that such a value is |
115 The experience of doing our proofs has been that this mechanical checking |
115 unique. Proofs are both done by hand and checked in Isabelle/HOL. The |
116 was absolutely essential: this subject area has hidden snares. This was also |
116 experience of doing our proofs has been that this mechanical checking |
117 noted by Kuklewicz \cite{Kuklewicz} who found that nearly all POSIX matching |
117 was absolutely essential: this subject area has hidden snares. This |
118 implementations are ``buggy'' \cite[Page 203]{Sulzmann2014}. |
118 was also noted by Kuklewicz \cite{Kuklewicz} who found that nearly all |
|
119 POSIX matching implementations are ``buggy'' \cite[Page |
|
120 203]{Sulzmann2014}. |
119 |
121 |
120 %\footnote{The relation @{text "\<ge>\<^bsub>r\<^esub>"} defined by Sulzmann and Lu \cite{Sulzmann2014} |
122 %\footnote{The relation @{text "\<ge>\<^bsub>r\<^esub>"} defined by Sulzmann and Lu \cite{Sulzmann2014} |
121 %is a relation on the |
123 %is a relation on the |
122 %values for the regular expression @{term r}; but it only holds between |
124 %values for the regular expression @{term r}; but it only holds between |
123 %@{term "v\<^sub>1"} and @{term "v\<^sub>2"} in cases where @{term "v\<^sub>1"} and @{term "v\<^sub>2"} have |
125 %@{term "v\<^sub>1"} and @{term "v\<^sub>2"} in cases where @{term "v\<^sub>1"} and @{term "v\<^sub>2"} have |