lexing: comparison thys/Paper/Paper.thy


-:79efc0bcfc96
+:71e26f43c896
 \item[$\bullet$] \underline{The Longest Match Rule (or ``maximal munch rule''):}
 The longest initial substring matched by any regular expression is taken as
 next token.\smallskip
-\item[$\bullet$] \underline{Rule Priority:}
+\item[$\bullet$] \underline{Priority Rule:}
 For a particular longest initial substring, the first regular expression
 that can match determines the token.
 \end{itemize}
 \end{center}
 \noindent Note that no values are associated with the regular expression
 @{term ZERO}, and that the only value associated with the regular
 expression @{term ONE} is @{term Void}, pronounced (if one must) as {\em
-``Void''}. It is routine to stablish how values ``inhabiting'' a regular
+``Void''}. It is routine to establish how values ``inhabiting'' a regular
 expression correspond to the language of a regular expression, namely
 \begin{proposition}
 @{thm L_flat_Prf}
 \end{proposition}
 & & \phantom{$|$} @{term "None"}  @{text "\<Rightarrow>"} @{term None}\\
 & & $|$ @{term "Some v"} @{text "\<Rightarrow>"} @{term "Some (injval r c v)"}
 \end{tabular}
 \end{center}
+\noindent If the regular expression does not match the string, @{const None} is
+returned. If the regular expression does match the string, then a @{const
-NOT DONE YET
+Some} value is returned. Again, the virtue of this algorithm is that it
+can be implemented with ease in a functional programming language and also
-Therefore there are, for example, three
+in Isabelle/HOL. In the remaining part of this section we prove that
-cases for sequence regular expressions (for all possible shapes of the
+this algorithm is correct.
-value).
+The well-known idea of POSIX matching is informally defined by the longest
-Again the virtues of this algorithm is that it can be
+match and priority rule; as correctly argued in \cite{Sulzmann2014}, this
-implemented with ease in a functional programming language and
+needs formal specification.
-also in Isabelle/HOL.
+We use a simple inductive definition to specify this notion, incorporating
-The well-known idea of POSIX lexing is informally defined in (for example)
+the POSIX-specific choices into the side-conditions for the rules $R tl
-\cite{posix}; as correctly argued in \cite{Sulzmann2014}, this needs formal
++_2$, $R tl\circ$ and $R tl*$ (as they are now called). By contrast,
-specification. The rough idea is that, in contrast to the so-called GREEDY
+\cite{Sulzmann2014} defines a relation between values and argues that there is a
-algorithm, POSIX lexing chooses to match more deeply and using left choices
+maximum value, as given by the derivative-based algorithm yet to be spelt
-rather than a right choices. For example, note that to match the string
+out. The relation we define is ternary, relating strings, values and regular
-@{term "[a, b]"} with the regular expression $(a + \mts)\circ (b+ab)$ the matching
+expressions.
-will return $( Void, Right(ab))$ rather than $(Left\ a, Left\ b)$. [The
-regular expression $ab$ is short for $(Lit\ a) \circ (Lit\ b)$.] Similarly,
-to match {\em ``a''} with $(a+a)$ the leftmost $a$ will be chosen.
-We use a simple inductive definition to specify this notion, incorporating
-the POSIX-specific choices into the side-conditions for the rules $R tl
-+_2$, $R tl\circ$ and $R tl*$ (as they are now called). By contrast,
-\cite{Sulzmann2014} defines a relation between values and argues that there is a
-maximum value, as given by the derivative-based algorithm yet to be spelt
-out. The relation we define is ternary, relating strings, values and regular
-expressions.
 Our Posix relation @{term "s \<in> r \<rightarrow> v"}
 \begin{center}
 \begin{tabular}{c}

changeset 119	71e26f43c896
parent 118	79efc0bcfc96
child 120	d74bfa11802c