thys/Paper/Paper.thy
changeset 173 80fe81a28a52
parent 172 cdc0bdcfba3f
child 174 4e3778f4a802
equal deleted inserted replaced
172:cdc0bdcfba3f 173:80fe81a28a52
   108 completely formalised correctness proof of this matcher in for example HOL4
   108 completely formalised correctness proof of this matcher in for example HOL4
   109 has been mentioned by Owens and Slind~\cite{Owens2008}. Another one in Isabelle/HOL is part
   109 has been mentioned by Owens and Slind~\cite{Owens2008}. Another one in Isabelle/HOL is part
   110 of the work by Krauss and Nipkow \cite{Krauss2011}. And another one in Coq is given
   110 of the work by Krauss and Nipkow \cite{Krauss2011}. And another one in Coq is given
   111 by Coquand and Siles \cite{Coquand2012}.
   111 by Coquand and Siles \cite{Coquand2012}.
   112 
   112 
   113 One limitation of Brzozowski's matcher is that it only generates a YES/NO
   113 One limitation of Brzozowski's matcher is that it only generates a
   114 answer for whether a string is being matched by a regular expression.
   114 YES/NO answer for whether a string is being matched by a regular
   115 Sulzmann and Lu \cite{Sulzmann2014} extended this matcher to allow
   115 expression.  Sulzmann and Lu \cite{Sulzmann2014} extended this matcher
   116 generation not just of a YES/NO answer but of an actual matching, called a
   116 to allow generation not just of a YES/NO answer but of an actual
   117 [lexical] {\em value}. They give a simple algorithm to calculate a value
   117 matching, called a [lexical] {\em value}. They give a simple algorithm
   118 that appears to be the value associated with POSIX matching
   118 to calculate a value that appears to be the value associated with
   119 \cite{Kuklewicz,Vansummeren2006}. The challenge then is to specify that
   119 POSIX matching. The challenge then is to specify that value, in an
   120 value, in an algorithm-independent fashion, and to show that Sulzmann and
   120 algorithm-independent fashion, and to show that Sulzmann and Lu's
   121 Lu's derivative-based algorithm does indeed calculate a value that is
   121 derivative-based algorithm does indeed calculate a value that is
   122 correct according to the specification.
   122 correct according to the specification.
   123 
   123 
   124 The answer given by Sulzmann and Lu \cite{Sulzmann2014} is to define a
   124 The answer given by Sulzmann and Lu \cite{Sulzmann2014} is to define a
   125 relation (called an ``order relation'') on the set of values of @{term
   125 relation (called an ``order relation'') on the set of values of @{term
   126 r}, and to show that (once a string to be matched is chosen) there is
   126 r}, and to show that (once a string to be matched is chosen) there is
   303   @{thm (lhs) nullable.simps(2)} & $\dn$ & @{thm (rhs) nullable.simps(2)}\\
   303   @{thm (lhs) nullable.simps(2)} & $\dn$ & @{thm (rhs) nullable.simps(2)}\\
   304   @{thm (lhs) nullable.simps(3)} & $\dn$ & @{thm (rhs) nullable.simps(3)}\\
   304   @{thm (lhs) nullable.simps(3)} & $\dn$ & @{thm (rhs) nullable.simps(3)}\\
   305   @{thm (lhs) nullable.simps(4)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) nullable.simps(4)[of "r\<^sub>1" "r\<^sub>2"]}\\
   305   @{thm (lhs) nullable.simps(4)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) nullable.simps(4)[of "r\<^sub>1" "r\<^sub>2"]}\\
   306   @{thm (lhs) nullable.simps(5)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) nullable.simps(5)[of "r\<^sub>1" "r\<^sub>2"]}\\
   306   @{thm (lhs) nullable.simps(5)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) nullable.simps(5)[of "r\<^sub>1" "r\<^sub>2"]}\\
   307   @{thm (lhs) nullable.simps(6)} & $\dn$ & @{thm (rhs) nullable.simps(6)}\medskip\\
   307   @{thm (lhs) nullable.simps(6)} & $\dn$ & @{thm (rhs) nullable.simps(6)}\medskip\\
   308   \end{tabular}
   308   
   309   \end{center}
   309   %\end{tabular}
   310 
   310   %\end{center}
   311   \begin{center}
   311 
   312   \begin{tabular}{lcl}
   312   %\begin{center}
       
   313   %\begin{tabular}{lcl}
       
   314   
   313   @{thm (lhs) der.simps(1)} & $\dn$ & @{thm (rhs) der.simps(1)}\\
   315   @{thm (lhs) der.simps(1)} & $\dn$ & @{thm (rhs) der.simps(1)}\\
   314   @{thm (lhs) der.simps(2)} & $\dn$ & @{thm (rhs) der.simps(2)}\\
   316   @{thm (lhs) der.simps(2)} & $\dn$ & @{thm (rhs) der.simps(2)}\\
   315   @{thm (lhs) der.simps(3)} & $\dn$ & @{thm (rhs) der.simps(3)}\\
   317   @{thm (lhs) der.simps(3)} & $\dn$ & @{thm (rhs) der.simps(3)}\\
   316   @{thm (lhs) der.simps(4)[of c "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) der.simps(4)[of c "r\<^sub>1" "r\<^sub>2"]}\\
   318   @{thm (lhs) der.simps(4)[of c "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) der.simps(4)[of c "r\<^sub>1" "r\<^sub>2"]}\\
   317   @{thm (lhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]}\\
   319   @{thm (lhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]}\\
   606   functional programming language and also in Isabelle/HOL. In the remaining
   608   functional programming language and also in Isabelle/HOL. In the remaining
   607   part of this section we prove that this algorithm is correct.
   609   part of this section we prove that this algorithm is correct.
   608 
   610 
   609   The well-known idea of POSIX matching is informally defined by the longest
   611   The well-known idea of POSIX matching is informally defined by the longest
   610   match and priority rule (see Introduction); as correctly argued in \cite{Sulzmann2014}, this
   612   match and priority rule (see Introduction); as correctly argued in \cite{Sulzmann2014}, this
   611   needs formal specification. Sulzmann and Lu define a \emph{dominance}
   613   needs formal specification. Sulzmann and Lu define an ``ordering
   612   relation\footnote{Sulzmann and Lu call it an ordering relation, but
   614   relation'' between values and argue
   613   without giving evidence that it is transitive.} between values and argue
       
   614   that there is a maximum value, as given by the derivative-based algorithm.
   615   that there is a maximum value, as given by the derivative-based algorithm.
   615   In contrast, we shall introduce a simple inductive definition that
   616   In contrast, we shall introduce a simple inductive definition that
   616   specifies directly what a \emph{POSIX value} is, incorporating the
   617   specifies directly what a \emph{POSIX value} is, incorporating the
   617   POSIX-specific choices into the side-conditions of our rules. Our
   618   POSIX-specific choices into the side-conditions of our rules. Our
   618   definition is inspired by the matching relation given by Vansummeren
   619   definition is inspired by the matching relation given by Vansummeren
   854 
   855 
   855   \noindent is well understood, there is an obstacle with the POSIX value
   856   \noindent is well understood, there is an obstacle with the POSIX value
   856   calculation algorithm by Sulzmann and Lu: if we build a derivative regular
   857   calculation algorithm by Sulzmann and Lu: if we build a derivative regular
   857   expression and then simplify it, we will calculate a POSIX value for this
   858   expression and then simplify it, we will calculate a POSIX value for this
   858   simplified derivative regular expression, \emph{not} for the original (unsimplified)
   859   simplified derivative regular expression, \emph{not} for the original (unsimplified)
   859   derivative regular expression. Sulzmann and Lu overcome this obstacle by
   860   derivative regular expression. Sulzmann and Lu \cite{Sulzmann2014} overcome this obstacle by
   860   not just calculating a simplified regular expression, but also calculating
   861   not just calculating a simplified regular expression, but also calculating
   861   a \emph{rectification function} that ``repairs'' the incorrect value.
   862   a \emph{rectification function} that ``repairs'' the incorrect value.
   862   
   863   
   863   The rectification functions can be (slightly clumsily) implemented  in
   864   The rectification functions can be (slightly clumsily) implemented  in
   864   Isabelle/HOL as follows using some auxiliary functions:
   865   Isabelle/HOL as follows using some auxiliary functions:
  1140   that there are more serious problems. 
  1141   that there are more serious problems. 
  1141   
  1142   
  1142   Having proved the correctness of the POSIX lexing algorithm in
  1143   Having proved the correctness of the POSIX lexing algorithm in
  1143   \cite{Sulzmann2014}, which lessons have we learned? Well, this is a
  1144   \cite{Sulzmann2014}, which lessons have we learned? Well, this is a
  1144   perfect example for the importance of the \emph{right} definitions. We
  1145   perfect example for the importance of the \emph{right} definitions. We
  1145   have (on and off) banged our heads on doors as soon as as first versions
  1146   have (on and off) banged our heads on doors as soon as first versions
  1146   of \cite{Sulzmann2014} appeared, but have made little progress with
  1147   of \cite{Sulzmann2014} appeared, but have made little progress with
  1147   turning the relatively detailed proof sketch in \cite{Sulzmann2014} into a
  1148   turning the relatively detailed proof sketch in \cite{Sulzmann2014} into a
  1148   formalisable proof. Having seen \cite{Vansummeren2006} and adapting the
  1149   formalisable proof. Having seen \cite{Vansummeren2006} and adapted the
  1149   POSIX definition given there for the algorithm by Sulzmann and Lu made all
  1150   POSIX definition given there for the algorithm by Sulzmann and Lu made all
  1150   the difference: the proofs, as said, are nearly straightforward. The
  1151   the difference: the proofs, as said, are nearly straightforward. The
  1151   question remains whether the original proof idea of \cite{Sulzmann2014},
  1152   question remains whether the original proof idea of \cite{Sulzmann2014},
  1152   potentially using our result as a stepping stone, can be made to work?
  1153   potentially using our result as a stepping stone, can be made to work?
  1153   Alas, we really do not know despite considerable effort and door banging.
  1154   Alas, we really do not know despite considerable effort and door banging.