lexing: comparison thys/Paper/Paper.thy

equal deleted inserted replaced

-:15ef2af1a6f2
+:022503caa187
 abbreviation
 "ders_syn r s \<equiv> ders s r"
 notation (latex output)
 If  ("(\<^raw:\textrm{>if\<^raw:}> (_)/ \<^raw:\textrm{>then\<^raw:}> (_)/ \<^raw:\textrm{>else\<^raw:}> (_))" 10) and
-Cons ("_\<^raw:\mbox{$\,$}>::\<^raw:\mbox{$\,$}>_" [78,77] 73) and
+Cons ("_\<^raw:\mbox{$\,$}>::\<^raw:\mbox{$\,$}>_" [75,75] 73) and
-ZERO ("\<^bold>0" 80) and
+ZERO ("\<^bold>0" 78) and
-ONE ("\<^bold>1" 80) and
+ONE ("\<^bold>1" 78) and
 CHAR ("_" [1000] 80) and
 ALT ("_ + _" [77,77] 78) and
 SEQ ("_ \<cdot> _" [77,77] 78) and
 STAR ("_\<^sup>\<star>" [1000] 78) and
 val.Void ("'(')" 79) and
 val.Char ("Char _" [1000] 79) and
-val.Left ("Left _" [1000] 78) and
+val.Left ("Left _" [79] 78) and
-val.Right ("Right _" [1000] 78) and
+val.Right ("Right _" [79] 78) and
+val.Seq ("Seq _ _" [79,79] 78) and
+val.Stars ("Stars _" [79] 78) and
 L ("L'(_')" [10] 78) and
 der_syn ("_\\_" [79, 1000] 76) and
 ders_syn ("_\\_" [79, 1000] 76) and
 flat ("|_|" [75] 73) and
 Sequ ("_ @ _" [78,77] 63) and
-injval ("inj _ _ _" [78,77,78] 77) and
+injval ("inj _ _ _" [79,77,79] 76) and
+mkeps ("mkeps _" [79] 76) and
 projval ("proj _ _ _" [1000,77,1000] 77) and
 length ("len _" [78] 73) and
 Prf ("\<triangleright> _ : _" [75,75] 75) and
-PMatch ("_ : _ \<rightarrow> _" [75,75,75] 75)
+PMatch ("'(_, _') \<rightarrow> _" [63,75,75] 75)
 (* and ValOrd ("_ \<succeq>\<^bsub>_\<^esub> _" [78,77,77] 73) *)
 definition
 "match r s \<equiv> nullable (ders s r)"
 and only if \mbox{@{term "s \<in> L(der c r)"}}. The beauty of Brzozowski's
 derivatives is that they are neatly expressible in any functional language,
 and easily definable and reasoned about in theorem provers---the definitions
 just consist of inductive datatypes and simple recursive functions. A
 completely formalised correctness proof of this matcher in for example HOL4
-has been mentioned in~\cite{Owens2008}.
+has been mentioned in~\cite{Owens2008}. Another one in Isabelle/HOL is
+in \cite{Krauss2011}.
 One limitation of Brzozowski's matcher is that it only generates a YES/NO
 answer for whether a string is being matched by a regular expression.
 Sulzmann and Lu \cite{Sulzmann2014} extended this matcher to allow
 generation not just of a YES/NO answer but of an actual matching, called a
 proof idea is inspired by work of Frisch and Cardelli \cite{Frisch2004} on a
 GREEDY regular expression matching algorithm. Beginning with our
 observations that, without evidence that it is transitive, it cannot be
 called an ``order relation'', and that the relation is called a ``total
 order'' despite being evidently not total\footnote{The relation @{text
-"\<ge>\<^bsub>r\<^esub>"} defined in \cite{Sulzmann2014} is a relation on the values for
+"\<ge>\<^bsub>r\<^esub>"} defined in \cite{Sulzmann2014} is a relation on the
-the regular expression @{term r}; but it only holds between @{term v} and
+values for the regular expression @{term r}; but it only holds between
-@{term "v'"} in cases where @{term v} and @{term "v'"} have the same
+@{term v} and @{term "v'"} in cases where @{term v} and @{term "v'"} have
-flattening (underlying string). So a counterexample to totality is given by
+the same flattening (underlying string). So a counterexample to totality is
-taking two values @{term v} and @{term "v'"} for @{term r} that have
+given by taking two values @{term v} and @{term "v'"} for @{term r} that
-different flattenings. A different relation @{text "\<ge>\<^bsub>r,s\<^esub>"}
+have different flattenings (see Section~\ref{posixsec}). A different
-on the set of values for @{term r} with flattening @{term s} is definable by
+relation @{text "\<ge>\<^bsub>r,s\<^esub>"} on the set of values for @{term r}
-the same approach, and is indeed total; but that is not what Proposition 1
+with flattening @{term s} is definable by the same approach, and is indeed
-of \cite{Sulzmann2014} does.}, we identify problems with this approach (of
+total; but that is not what Proposition 1 of \cite{Sulzmann2014} does.}, we
-which some of the proofs are not published in \cite{Sulzmann2014}); perhaps
+identify problems with this approach (of which some of the proofs are not
-more importantly, we give a simple inductive (and algorithm-independent)
+published in \cite{Sulzmann2014}); perhaps more importantly, we give a
-definition of what we call being a {\em POSIX value} for a regular
+simple inductive (and algorithm-independent) definition of what we call
-expression @{term r} and a string @{term s}; we show that the algorithm
+being a {\em POSIX value} for a regular expression @{term r} and a string
-computes such a value and that such a value is unique. Proofs are both done
+@{term s}; we show that the algorithm computes such a value and that such a
-by hand and checked in Isabelle/HOL. The experience of doing our proofs has
+value is unique. Proofs are both done by hand and checked in Isabelle/HOL.
-been that this mechanical checking was absolutely essential: this subject
+The experience of doing our proofs has been that this mechanical checking
-area has hidden snares. This was also noted by Kuklewitz \cite{Kuklewicz}
+was absolutely essential: this subject area has hidden snares. This was also
-who found that nearly all POSIX matching implementations are ``buggy''
+noted by Kuklewitz \cite{Kuklewicz} who found that nearly all POSIX matching
-\cite[Page 203]{Sulzmann2014}.
+implementations are ``buggy'' \cite[Page 203]{Sulzmann2014}.
 If a regular expression matches a string, then in general there is more than
 one way of how the string is matched. There are two commonly used
 disambiguation strategies to generate a unique answer: one is called GREEDY
 matching \cite{Frisch2004} and the other is POSIX
 that can match determines the token.
 \end{itemize}
 \noindent Consider for example @{text "r\<^bsub>key\<^esub>"} recognising keywords such as
 @{text "if"}, @{text "then"} and so on; and @{text "r\<^bsub>id\<^esub>"} recognising
-identifiers (a single character followed by characters or numbers). Then we
+identifiers (say, a single character followed by characters or numbers). Then we
 can form the regular expression @{text "(r\<^bsub>key\<^esub> + r\<^bsub>id\<^esub>)\<^sup>\<star>"} and use POSIX
 matching to tokenise strings, say @{text "iffoo"} and @{text "if"}. In the
 first case we obtain by the longest match rule a single identifier token,
 not a keyword followed by an identifier. In the second case we obtain by rule
 priority a keyword token, not an identifier token---even if @{text "r\<^bsub>id\<^esub>"}
 matches also.\bigskip
-\noindent {\bf Contributions:} We have implemented in Isabelle/HOL the
+\noindent {\bf Contributions:} (NOT DONE YET) We have implemented in Isabelle/HOL the
 derivative-based regular expression matching algorithm as described by
 Sulzmann and Lu \cite{Sulzmann2014}. We have proved the correctness of this
-algorithms according to our specification what a POSIX value is. The
+algorithm according to our specification of what a POSIX value is. The
 informal correctness proof given in \cite{Sulzmann2014} is in final
 form\footnote{} and to us contains unfillable gaps. Our specification of a
 POSIX value consists of a simple inductive definition that given a string
 and a regular expression uniquely determines this value. Derivatives as
 calculated by Brzozowski's method are usually more complex regular
 list-cons being written as @{term "DUMMY # DUMMY"}. Often we use the usual
 bracket notation for lists also for strings; for example a string consisting
 of just a single character @{term c} is written @{term "[c]"}. By using the
 type @{type char} for characters we have a supply of finitely many
 characters roughly corresponding to the ASCII character set. Regular
-expressions are defined as usual as the following inductive datatype:
+expressions are defined as usual as the elements of the following inductive
+datatype:
 \begin{center}
 @{text "r :="}
 @{const "ZERO"} $\mid$
 @{const "ONE"} $\mid$
 \end{center}
 \noindent where @{const ZERO} stands for the regular expression that does
 not match any string, @{const ONE} for the regular expression that matches
 only the empty string and @{term c} for matching a character literal. The
-language of a regular expression is again defined as usual by the
+language of a regular expression is also defined as usual by the
 recursive function @{term L} with the clauses:
 \begin{center}
 \begin{tabular}{rcl}
 @{thm (lhs) L.simps(1)} & $\dn$ & @{thm (rhs) L.simps(1)}\\
 @{thm (lhs) L.simps(5)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) L.simps(5)[of "r\<^sub>1" "r\<^sub>2"]}\\
 @{thm (lhs) L.simps(6)} & $\dn$ & @{thm (rhs) L.simps(6)}\\
 \end{tabular}
 \end{center}
-\noindent In the fourth clause we use @{term "DUMMY ;; DUMMY"} for the
+\noindent In the fourth clause we use the operation @{term "DUMMY ;;
-concatenation of two languages (it is also list-append for strings). We
+DUMMY"} for the concatenation of two languages (it is also list-append for
-use the star-notation for regular expressions and languages (in the last
+strings). We use the star-notation for regular expressions and languages
-clause above). The star on languages is defined inductively by two
+(in the last clause above). The star on languages is defined inductively
-clauses: @{text "(i)"} for the empty string being in the star of a
+by two clauses: @{text "(i)"} for the empty string being in the star of a
 language and @{text "(ii)"} if @{term "s\<^sub>1"} is in a language and @{term
 "s\<^sub>2"} in the star of this language, then also @{term "s\<^sub>1 @ s\<^sub>2"} is in
 the star of this language. It will also be convenient to use the following
-notion of a \emph{semantic derivative} (or left quotient) of a language,
+notion of a \emph{semantic derivative} (or \emph{left quotient}) of a
-say @{text A}, defined as:
+language, say @{text A}, defined as:
 \begin{center}
 \begin{tabular}{lcl}
 @{thm (lhs) Der_def} & $\dn$ & @{thm (rhs) Der_def}\\
 \end{tabular}
 \end{center}
 \noindent
 For semantic derivatives we have the following equations (for example
-\cite{Krauss2011}):
+mechanically proved in \cite{Krauss2011}):
 \begin{equation}\label{SemDer}
 \begin{array}{lcl}
 @{thm (lhs) Der_null}  & \dn & @{thm (rhs) Der_null}\\
 @{thm (lhs) Der_empty}  & \dn & @{thm (rhs) Der_empty}\\
 exercise in mechanical reasoning to establish that
 \begin{proposition}\mbox{}\\
 \begin{tabular}{ll}
 @{text "(1)"} & @{thm (lhs) nullable_correctness} if and only if
-@{thm (rhs) nullable_correctness} \\
+@{thm (rhs) nullable_correctness}, and \\
-@{text "(2)"} & @{thm[mode=IfThen] der_correctness}
+@{text "(2)"} & @{thm[mode=IfThen] der_correctness}.
 \end{tabular}
 \end{proposition}
 \noindent With this in place it is also very routine to prove that the
 regular expression matcher defined as
 \begin{center}
 @{thm match_def}
 \end{center}
-\noindent gives a positive answer if and only if @{term "s \<in> L r"}. While
+\noindent gives a positive answer if and only if @{term "s \<in> L r"}.
-the matcher above calculates a provably correct a YES/NO answer for
+Consequently, this regular expression matching algorithm satisfies the
-whether a regular expression matches a string, the novel idea of Sulzmann
+usual specification. While the matcher above calculates a provably correct
-and Lu \cite{Sulzmann2014} is to append another phase to calculate a
+a YES/NO answer for whether a regular expression matches a string, the
-value. We will explain the details next.
+novel idea of Sulzmann and Lu \cite{Sulzmann2014} is to append another
+phase to this algorithm in order to calculate a [lexical] value. We will
+explain the details next.
 *}
-section {* POSIX Regular Expression Matching *}
+section {* POSIX Regular Expression Matching\label{posixsec} *}
 text {*
 The clever idea in \cite{Sulzmann2014} is to introduce values for encoding
 \emph{how} a regular expression matches a string and then define a function
 on values that mirrors (but inverts) the construction of the derivative on
-regular expressions. Values are defined as the inductive datatype
+regular expressions. \emph{Values} are defined as the inductive datatype
 \begin{center}
 @{text "v :="}
 @{const "Void"} $\mid$
 @{term "val.Char c"} $\mid$
 @{term "Right v"} $\mid$
 @{term "Seq v\<^sub>1 v\<^sub>2"} $\mid$
 @{term "Stars vs"}
 \end{center}
-\noindent where we use @{term vs} standing for a list of values. The string
+\noindent where we use @{term vs} standing for a list of values. (This is
-underlying a values can be calculated by the @{const flat} function, written
+similar to the approach taken by Frisch and Cardelli for GREEDY matching
-@{term "flat DUMMY"} and defined as:
+\cite{Frisch2014}, and Sulzmann and Lu \cite{2014} for POSIX matching).
+The string underlying a value can be calculated by the @{const flat}
+function, written @{term "flat DUMMY"} and defined as:
 \begin{center}
 \begin{tabular}{lcl}
 @{thm (lhs) flat.simps(1)} & $\dn$ & @{thm (rhs) flat.simps(1)}\\
 @{thm (lhs) flat.simps(2)} & $\dn$ & @{thm (rhs) flat.simps(2)}\\
 \noindent Sulzmann and Lu also define inductively an inhabitation relation
 that associates values to regular expressions:
 \begin{center}
 \begin{tabular}{c}
-@{thm[mode=Rule] Prf.intros(1)[of "v\<^sub>1" "r\<^sub>1" "v\<^sub>2" "r\<^sub>2"]}\medskip\\
+@{thm[mode=Axiom] Prf.intros(4)} \qquad
+@{thm[mode=Axiom] Prf.intros(5)[of "c"]}\medskip\\
 @{thm[mode=Rule] Prf.intros(2)[of "v\<^sub>1" "r\<^sub>1" "r\<^sub>2"]} \qquad
 @{thm[mode=Rule] Prf.intros(3)[of "v\<^sub>2" "r\<^sub>1" "r\<^sub>2"]}\medskip\\
-@{thm[mode=Axiom] Prf.intros(4)} \qquad
+@{thm[mode=Rule] Prf.intros(1)[of "v\<^sub>1" "r\<^sub>1" "v\<^sub>2" "r\<^sub>2"]}\medskip\\
-@{thm[mode=Axiom] Prf.intros(5)[of "c"]}\medskip\\
+@{thm[mode=Axiom] Prf.intros(6)[of "r"]} \qquad
-@{thm[mode=Axiom] Prf.intros(6)[of "r"]} \qquad
 @{thm[mode=Rule] Prf.intros(7)[of "v" "r" "vs"]}\medskip\\
 \end{tabular}
 \end{center}
 \noindent Note that no values are associated with the regular expression
 @{term ZERO}, and that the only value associated with the regular
 expression @{term ONE} is @{term Void}, pronounced (if one must) as {\em
-``Void''}. It is routine to stablish how values `inhabitated' by a regular
+``Void''}. It is routine to stablish how values ``inhabiting'' a regular
 expression correspond to the language of a regular expression, namely
 \begin{proposition}
 @{thm L_flat_Prf}
 \end{proposition}
-Graphically the algorithm by Sulzmann \& Lu can be illustrated by the
+In general there are more than one value associated with a regular
-picture in Figure~\ref{Sulz} where the path from the left to the right
+expression. In case of POSIX matching the problem is to calculate the
-involving $der/nullable$ is the first phase of the algorithm and
+unique value that satisfies the (informal) POSIX constraints from the
-$mkeps/inj$, the path from right to left, the second phase. This picture
+Introduction. Graphically the regular expression matching algorithm by
-shows the steps required when a regular expression, say $r_1$, matches the
+Sulzmann and Lu can be illustrated by the picture in Figure~\ref{Sulz}
-string $abc$. We first build the three derivatives (according to $a$, $b$
+where the path from the left to the right involving @{const der}/@{const
-and $c$). We then use $nullable$ to find out whether the resulting regular
+nullable} is the first phase of the algorithm (calculating successive
-expression can match the empty string. If yes, we call the function
+\Brz's derivatives) and @{const mkeps}/@{text inj}, the path from right to
-$mkeps$.
+left, the second phase. This picture shows the steps required when a
+regular expression, say @{text "r\<^sub>1"}, matches the string @{term
+"[a,b,c]"}. We first build the three derivatives (according to @{term a},
+@{term b} and @{term c}). We then use @{const nullable} to find out
+whether the resulting derivative regular expression @{term "r\<^sub>4"}
+can match the empty string. If yes, we call the function @{const mkeps}
+that produces a value @{term "v\<^sub>4"} for how @{term "r\<^sub>4"} can
+match the empty string (taking into account the POSIX constraints in case
+there are several ways). This functions is defined by the clauses:
 \begin{figure}[t]
 \begin{center}
 \begin{tikzpicture}[scale=2,node distance=1.3cm,
 every node/.style={minimum size=7mm}]
 that the function @{term inj} `injects back' the characters of the string into
 the values (the arrows from right to left).
 \label{Sulz}}
 \end{figure}
-NOT DONE YET
+\begin{center}
-We begin with the case of a nullable regular expression: from the
-nullability we need to construct a value that witnesses the nullability.
-The @{const mkeps} function (from \cite{Sulzmann2014}) is a partial (but
-unambiguous) function from regular expressions to values, total on exactly
-the set of nullable regular expressions.
-\begin{center}
 \begin{tabular}{lcl}
 @{thm (lhs) mkeps.simps(1)} & $\dn$ & @{thm (rhs) mkeps.simps(1)}\\
 @{thm (lhs) mkeps.simps(2)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) mkeps.simps(2)[of "r\<^sub>1" "r\<^sub>2"]}\\
 @{thm (lhs) mkeps.simps(3)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) mkeps.simps(3)[of "r\<^sub>1" "r\<^sub>2"]}\\
 @{thm (lhs) mkeps.simps(4)} & $\dn$ & @{thm (rhs) mkeps.simps(4)}\\
 \end{tabular}
 \end{center}
-The well-known idea of POSIX lexing is informally defined in (for example)
+\noindent Note that this function needs only to be partially defined,
-\cite{posix}; as correctly argued in \cite{Sulzmann2014}, this needs formal
+namely only for regular expressions that are nullable. In case @{const
-specification. The rough idea is that, in contrast to the so-called GREEDY
+nullable} fails, the string @{term "[a,b,c]"} cannot be matched by @{term
-algorithm, POSIX lexing chooses to match more deeply and using left choices
+"r\<^sub>1"} and an error is raised. Note also how this function makes
-rather than a right choices. For example, note that to match the string
+some subtle choices leading to a POSIX value: for example if the
-@{term "[a, b]"} with the regular expression $(a + \mts)\circ (b+ab)$ the matching
+alternative, say @{term "ALT r\<^sub>1 r\<^sub>2"}, can match the empty
-will return $( Void, Right(ab))$ rather than $(Left\ a, Left\ b)$. [The
+string and furthermore @{term "r\<^sub>1"} can match the empty string,
-regular expression $ab$ is short for $(Lit\ a) \circ (Lit\ b)$.] Similarly,
+then we return a @{const Left}-value. The @{const Right}-value will only
-to match {\em ``a''} with $(a+a)$ the leftmost $a$ will be chosen.
+be returned if @{term "r\<^sub>1"} is not nullable.
-We use a simple inductive definition to specify this notion, incorporating
+The most interesting novelty from Sulzmann and Lu \cite{Sulzmann2014} is
-the POSIX-specific choices into the side-conditions for the rules $R tl
+the construction value for how @{term "r\<^sub>1"} can match the string
-+_2$, $R tl\circ$ and $R tl*$ (as they are now called). By contrast,
+@{term "[a,b,c]"} from the value how the last derivative, @{term
-\cite{Sulzmann2014} defines a relation between values and argues that there is a
+"r\<^sub>4"} in Fig~\ref{Sulz}, can match the empty string. Sulzmann and
-maximum value, as given by the derivative-based algorithm yet to be spelt
+Lu acchieve this by stepwise ``injecting back'' the characters into the
-out. The relation we define is ternary, relating strings, values and regular
+values thus inverting the operation of building derivatives on the level
-expressions.
+of values. The corresponding function, called @{term inj}, takes three
+arguments, a regular expression, a character and a value. For example in
-Our Posix relation @{term "s \<in> r \<rightarrow> v"}
+the first @{term inj}-step in Fig~\ref{Sulz} the regular expression @{term
+"r\<^sub>3"}, the character @{term c} from the last derivative step and
-\begin{center}
+@{term "v\<^sub>4"}, which is the value corresponding to the derivative
-\begin{tabular}{c}
+regular expression @{term "r\<^sub>4"}. The result is the new value @{term
-@{thm[mode=Axiom] PMatch.intros(1)} \qquad
+"v\<^sub>3"}. The final result of the algorithm is the value @{term
-@{thm[mode=Axiom] PMatch.intros(2)}\medskip\\
+"v\<^sub>1"} corresponding to the input regular expression. The @{term
-@{thm[mode=Rule] PMatch.intros(3)[of "s" "r\<^sub>1" "v" "r\<^sub>2"]}\qquad
+inj} function is by recursion on the regular expression and by analysing
-@{thm[mode=Rule] PMatch.intros(4)[of "s" "r\<^sub>2" "v" "r\<^sub>1"]}\medskip\\
+the shape of values.
-\multicolumn{1}{p{5cm}}{@{thm[mode=Rule] PMatch.intros(5)[of "s\<^sub>1" "r\<^sub>1" "v\<^sub>1" "s\<^sub>2" "r\<^sub>2" "v\<^sub>2"]}}\medskip\\
-@{thm[mode=Rule] PMatch.intros(6)[of "s\<^sub>1" "r" "v" "s\<^sub>2" "vs"]}\medskip\\
-@{thm[mode=Axiom] PMatch.intros(7)}\medskip\\
-\end{tabular}
-\end{center}
-*}
-section {* The Argument by Sulzmmann and Lu *}
-section {* Conclusion *}
-text {*
-Nipkow lexer from 2000
-*}
-text {*
-\noindent
-Values
-\noindent
-The @{const mkeps} function
-\noindent
-The @{text inj} function
 \begin{center}
 \begin{tabular}{lcl}
 @{thm (lhs) injval.simps(1)} & $\dn$ & @{thm (rhs) injval.simps(1)}\\
 @{thm (lhs) injval.simps(2)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>1"]} & $\dn$ &
 & @{thm (rhs) injval.simps(6)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>2"]}\\
 @{thm (lhs) injval.simps(7)[of "r" "c" "v" "vs"]} & $\dn$
 & @{thm (rhs) injval.simps(7)[of "r" "c" "v" "vs"]}\\
 \end{tabular}
 \end{center}
+\noindent To better understand what is going on in this definition it
+might be instructive to look first at the three sequence cases (clauses
+4--6). Recall the corresponding clause of the derivative
+\begin{center}
+@{thm (lhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]} $\dn$ @{thm (rhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]}
+\end{center}
+\noindent
+NOT DONE YET
+Therefore there are, for example, three
+cases for sequence regular expressions (for all possible shapes of the
+value).
+Again the virtues of this algorithm is that it can be
+implemented with ease in a functional programming language and
+also in Isabelle/HOL.
+The well-known idea of POSIX lexing is informally defined in (for example)
+\cite{posix}; as correctly argued in \cite{Sulzmann2014}, this needs formal
+specification. The rough idea is that, in contrast to the so-called GREEDY
+algorithm, POSIX lexing chooses to match more deeply and using left choices
+rather than a right choices. For example, note that to match the string
+@{term "[a, b]"} with the regular expression $(a + \mts)\circ (b+ab)$ the matching
+will return $( Void, Right(ab))$ rather than $(Left\ a, Left\ b)$. [The
+regular expression $ab$ is short for $(Lit\ a) \circ (Lit\ b)$.] Similarly,
+to match {\em ``a''} with $(a+a)$ the leftmost $a$ will be chosen.
+We use a simple inductive definition to specify this notion, incorporating
+the POSIX-specific choices into the side-conditions for the rules $R tl
++_2$, $R tl\circ$ and $R tl*$ (as they are now called). By contrast,
+\cite{Sulzmann2014} defines a relation between values and argues that there is a
+maximum value, as given by the derivative-based algorithm yet to be spelt
+out. The relation we define is ternary, relating strings, values and regular
+expressions.
+Our Posix relation @{term "s \<in> r \<rightarrow> v"}
+\begin{center}
+\begin{tabular}{c}
+@{thm[mode=Axiom] PMatch.intros(1)} \qquad
+@{thm[mode=Axiom] PMatch.intros(2)}\medskip\\
+@{thm[mode=Rule] PMatch.intros(3)[of "s" "r\<^sub>1" "v" "r\<^sub>2"]}\qquad
+@{thm[mode=Rule] PMatch.intros(4)[of "s" "r\<^sub>2" "v" "r\<^sub>1"]}\medskip\\
+\multicolumn{1}{p{5cm}}{@{thm[mode=Rule] PMatch.intros(5)[of "s\<^sub>1" "r\<^sub>1" "v\<^sub>1" "s\<^sub>2" "r\<^sub>2" "v\<^sub>2"]}}\medskip\\
+@{thm[mode=Rule] PMatch.intros(6)[of "s\<^sub>1" "r" "v" "s\<^sub>2" "vs"]}\medskip\\
+@{thm[mode=Axiom] PMatch.intros(7)}\medskip\\
+\end{tabular}
+\end{center}
+*}
+section {* The Argument by Sulzmmann and Lu *}
+section {* Conclusion *}
+text {*
+Nipkow lexer from 2000
+*}
+text {*
+\noindent
+Values
+\noindent
+The @{const mkeps} function
+\noindent
+The @{text inj} function
 \noindent
 The inhabitation relation:
 \begin{center}

changeset 116	022503caa187
parent 115	15ef2af1a6f2
child 117	2c4ffcc95399