thys/Journal/Paper.thy
changeset 273 e85099ac4c6c
parent 272 f16019b11179
child 274 692b62426677
    17 
    17 
    18 syntax (latex output)
    18 syntax (latex output)
    19   "_Collect" :: "pttrn => bool => 'a set"              ("(1{_ \<^raw:\mbox{\boldmath$\mid$}> _})")
    19   "_Collect" :: "pttrn => bool => 'a set"              ("(1{_ \<^raw:\mbox{\boldmath$\mid$}> _})")
    20   "_CollectIn" :: "pttrn => 'a set => bool => 'a set"   ("(1{_ \<in> _ |e _})")
    20   "_CollectIn" :: "pttrn => 'a set => bool => 'a set"   ("(1{_ \<in> _ |e _})")
    21 
    21 
       
    22 syntax
       
    23   "_Not_Ex" :: "idts \<Rightarrow> bool \<Rightarrow> bool"  ("(3\<nexists>_.a./ _)" [0, 10] 10)
       
    24   "_Not_Ex1" :: "pttrn \<Rightarrow> bool \<Rightarrow> bool"  ("(3\<nexists>!_.a./ _)" [0, 10] 10)
       
    25 
    22 
    26 
    23 abbreviation 
    27 abbreviation 
    24   "der_syn r c \<equiv> der c r"
    28   "der_syn r c \<equiv> der c r"
    25 
    29 
    26 abbreviation 
    30 abbreviation 
    35 
    39 
    36 notation (latex output)
    40 notation (latex output)
    37   If  ("(\<^raw:\textrm{>if\<^raw:}> (_)/ \<^raw:\textrm{>then\<^raw:}> (_)/ \<^raw:\textrm{>else\<^raw:}> (_))" 10) and
    41   If  ("(\<^raw:\textrm{>if\<^raw:}> (_)/ \<^raw:\textrm{>then\<^raw:}> (_)/ \<^raw:\textrm{>else\<^raw:}> (_))" 10) and
    38   Cons ("_\<^raw:\mbox{$\,$}>::\<^raw:\mbox{$\,$}>_" [75,73] 73) and  
    42   Cons ("_\<^raw:\mbox{$\,$}>::\<^raw:\mbox{$\,$}>_" [75,73] 73) and  
    39 
    43 
    40   ZERO ("\<^bold>0" 78) and 
    44   ZERO ("\<^bold>0" 81) and 
    41   ONE ("\<^bold>1" 1000) and 
    45   ONE ("\<^bold>1" 81) and 
    42   CHAR ("_" [1000] 80) and
    46   CHAR ("_" [1000] 80) and
    43   ALT ("_ + _" [77,77] 78) and
    47   ALT ("_ + _" [77,77] 78) and
    44   SEQ ("_ \<cdot> _" [77,77] 78) and
    48   SEQ ("_ \<cdot> _" [77,77] 78) and
    45   STAR ("_\<^sup>\<star>" [1000] 78) and
    49   STAR ("_\<^sup>\<star>" [78] 78) and
    46   
    50   
    47   val.Void ("Empty" 78) and
    51   val.Void ("Empty" 78) and
    48   val.Char ("Char _" [1000] 78) and
    52   val.Char ("Char _" [1000] 78) and
    49   val.Left ("Left _" [79] 78) and
    53   val.Left ("Left _" [79] 78) and
    50   val.Right ("Right _" [1000] 78) and
    54   val.Right ("Right _" [1000] 78) and
    54   L ("L'(_')" [10] 78) and
    58   L ("L'(_')" [10] 78) and
    55   LV ("LV _ _" [80,73] 78) and
    59   LV ("LV _ _" [80,73] 78) and
    56   der_syn ("_\\_" [79, 1000] 76) and  
    60   der_syn ("_\\_" [79, 1000] 76) and  
    57   ders_syn ("_\\_" [79, 1000] 76) and
    61   ders_syn ("_\\_" [79, 1000] 76) and
    58   flat ("|_|" [75] 74) and
    62   flat ("|_|" [75] 74) and
       
    63   flats ("|_|" [72] 74) and
    59   Sequ ("_ @ _" [78,77] 63) and
    64   Sequ ("_ @ _" [78,77] 63) and
    60   injval ("inj _ _ _" [79,77,79] 76) and 
    65   injval ("inj _ _ _" [79,77,79] 76) and 
    61   mkeps ("mkeps _" [79] 76) and 
    66   mkeps ("mkeps _" [79] 76) and 
    62   length ("len _" [73] 73) and
    67   length ("len _" [73] 73) and
    63   intlen ("len _" [73] 73) and
    68   intlen ("len _" [73] 73) and
   104 p9. The condition "not exists s3 s4..." appears often enough (in particular in
   109 p9. The condition "not exists s3 s4..." appears often enough (in particular in
   105 the proof of Lemma 3) to warrant a definition.
   110 the proof of Lemma 3) to warrant a definition.
   106 
   111 
   107 *)
   112 *)
   108 
   113 
       
   114 
   109 (*>*)
   115 (*>*)
   110 
   116 
   111 
   117 
   112 
   118 
   113 section {* Introduction *}
   119 section {* Introduction *}
   185 identifier. For @{text "if"} we obtain by the Priority Rule a keyword
   191 identifier. For @{text "if"} we obtain by the Priority Rule a keyword
   186 token, not an identifier token---even if @{text "r\<^bsub>id\<^esub>"}
   192 token, not an identifier token---even if @{text "r\<^bsub>id\<^esub>"}
    187 also matches. By the Star Rule we know @{text "(r\<^bsub>key\<^esub> +
    193 also matches. By the Star Rule we know @{text "(r\<^bsub>key\<^esub> +
   188 r\<^bsub>id\<^esub>)\<^sup>\<star>"} matches @{text "iffoo"},
   194 r\<^bsub>id\<^esub>)\<^sup>\<star>"} matches @{text "iffoo"},
   189 respectively @{text "if"}, in exactly one `iteration' of the star. The
   195 respectively @{text "if"}, in exactly one `iteration' of the star. The
   190 Empty String Rule is for cases where, for example, @{text
   196 Empty String Rule is for cases where, for example, the regular expression 
   191 "(a\<^sup>\<star>)\<^sup>\<star>"} matches against the
   197 @{text "(a\<^sup>\<star>)\<^sup>\<star>"} matches against the
   192 string @{text "bc"}. Then the longest initial matched substring is the
   198 string @{text "bc"}. Then the longest initial matched substring is the
   193 empty string, which is matched by both the whole regular expression
   199 empty string, which is matched by both the whole regular expression
   194 and the parenthesised subexpression.
   200 and the parenthesised subexpression.
   195 
   201 
   196 
   202 
   200 to allow generation not just of a YES/NO answer but of an actual
   206 to allow generation not just of a YES/NO answer but of an actual
   201 matching, called a [lexical] {\em value}. Assuming a regular
   207 matching, called a [lexical] {\em value}. Assuming a regular
   202 expression matches a string, values encode the information of
   208 expression matches a string, values encode the information of
   203 \emph{how} the string is matched by the regular expression---that is,
   209 \emph{how} the string is matched by the regular expression---that is,
   204 which part of the string is matched by which part of the regular
   210 which part of the string is matched by which part of the regular
   205 expression. For this consider again the the string @{text "xy"} and
   211 expression. For this consider again the string @{text "xy"} and
   206 the regular expression \mbox{@{text "(x + (y +
   212 the regular expression \mbox{@{text "(x + (y + xy))\<^sup>\<star>"}}
   207 xy))\<^sup>\<star>"}}. The POSIX value, which corresponds to using the
   213 (this time fully parenthesised). We can view this regular expression
   208 star in only one repetition,
    214 as a tree and if the string @{text xy} is matched by two Star
   209 
   215 `iterations', then the @{text x} is matched by the left-most
   210 
    216 alternative in this tree and the @{text y} by the left alternative within the right branch. This
   211 \marginpar{explain values; who introduced them} 
    217 suggests recording this matching as
       
   218 
       
   219 \begin{center}
       
   220 @{term "Stars [Left(Char x), Right(Left(Char y))]"}
       
   221 \end{center}
       
   222 
       
   223 \noindent where @{const Stars}, @{text Left}, @{text Right} and @{text
       
   224 Char} are constructors for values. @{text Stars} records how many
       
   225 iterations were used; @{text Left}, respectively @{text Right}, which
       
   226 alternative is used. This `tree view' leads naturally to the
       
   227 idea that regular expressions act as types and values as inhabiting
       
   228 those types. This view was first put forward by ???. The value for the
       
   229 single `iteration', i.e.~the POSIX value, would look as follows
       
   230 
       
   231 \begin{center}
       
   232 @{term "Stars [Seq (Char x) (Char y)]"}
       
   233 \end{center}
       
   234 
       
   235 \noindent where @{const Stars} has only a single-element list for the
       
   236 single iteration and @{const Seq} indicates that @{term xy} is matched 
       
   237 by a sequence regular expression, which we will in what follows 
       
   238 write more formally as @{term "SEQ x y"}.
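To make the discussion of values more concrete, the following Haskell sketch (an
illustration only, not part of the Isabelle formalisation; the names Val, Chr,
Sq, Lft, Rgt and Stars are chosen for this sketch) shows one possible datatype
for values together with the two values just mentioned:

  -- values record how a regular expression matches a string
  data Val = Empty            -- for the ONE regular expression
           | Chr Char         -- for a single character
           | Sq Val Val       -- for sequence regular expressions
           | Lft Val          -- the left alternative was taken
           | Rgt Val          -- the right alternative was taken
           | Stars [Val]      -- one value per `iteration' of a star
           deriving Show

  -- matching xy with (x + (y + xy))* in two, respectively one, iteration(s)
  twoIterations, oneIteration :: Val
  twoIterations = Stars [Lft (Chr 'x'), Rgt (Lft (Chr 'y'))]
  oneIteration  = Stars [Sq (Chr 'x') (Chr 'y')]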
   212 
   239 
   213 
   240 
   214 Sulzmann and Lu give a simple algorithm to calculate a value that
   241 Sulzmann and Lu give a simple algorithm to calculate a value that
   215 appears to be the value associated with POSIX matching.  The challenge
   242 appears to be the value associated with POSIX matching.  The challenge
   216 then is to specify that value, in an algorithm-independent fashion,
   243 then is to specify that value, in an algorithm-independent fashion,
   278 
   305 
   279 *}
   306 *}
   280 
   307 
   281 section {* Preliminaries *}
   308 section {* Preliminaries *}
   282 
   309 
   283 text {* \noindent Strings in Isabelle/HOL are lists of characters with the
   310 text {* \noindent Strings in Isabelle/HOL are lists of characters with
   284 empty string being represented by the empty list, written @{term "[]"}, and
   311 the empty string being represented by the empty list, written @{term
   285 list-cons being written as @{term "DUMMY # DUMMY"}. Often we use the usual
   312 "[]"}, and list-cons being written as @{term "DUMMY # DUMMY"}. Often
   286 bracket notation for lists also for strings; for example a string consisting
   313 we use the usual bracket notation for lists also for strings; for
   287 of just a single character @{term c} is written @{term "[c]"}. By using the
   314 example a string consisting of just a single character @{term c} is
       
   315 written @{term "[c]"}. We use the usual definitions for 
       
   316 \emph{prefixes} and \emph{strict prefixes} of strings.  By using the
   288 type @{type char} for characters we have a supply of finitely many
   317 type @{type char} for characters we have a supply of finitely many
   289 characters roughly corresponding to the ASCII character set. Regular
   318 characters roughly corresponding to the ASCII character set. Regular
   290 expressions are defined as usual as the elements of the following inductive
   319 expressions are defined as usual as the elements of the following
   291 datatype:
   320 inductive datatype:
   292 
   321 
   293   \begin{center}
   322   \begin{center}
   294   @{text "r :="}
   323   @{text "r :="}
   295   @{const "ZERO"} $\mid$
   324   @{const "ZERO"} $\mid$
   296   @{const "ONE"} $\mid$
   325   @{const "ONE"} $\mid$
   306   language of a regular expression is also defined as usual by the
   335   language of a regular expression is also defined as usual by the
   307   recursive function @{term L} with the six clauses:
   336   recursive function @{term L} with the six clauses:
   308 
   337 
   309   \begin{center}
   338   \begin{center}
   310   \begin{tabular}{l@ {\hspace{4mm}}rcl}
   339   \begin{tabular}{l@ {\hspace{4mm}}rcl}
   311   (1) & @{thm (lhs) L.simps(1)} & $\dn$ & @{thm (rhs) L.simps(1)}\\
   340   \textit{(1)} & @{thm (lhs) L.simps(1)} & $\dn$ & @{thm (rhs) L.simps(1)}\\
   312   (2) & @{thm (lhs) L.simps(2)} & $\dn$ & @{thm (rhs) L.simps(2)}\\
   341   \textit{(2)} & @{thm (lhs) L.simps(2)} & $\dn$ & @{thm (rhs) L.simps(2)}\\
   313   (3) & @{thm (lhs) L.simps(3)} & $\dn$ & @{thm (rhs) L.simps(3)}\\
   342   \textit{(3)} & @{thm (lhs) L.simps(3)} & $\dn$ & @{thm (rhs) L.simps(3)}\\
   314   (4) & @{thm (lhs) L.simps(4)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) L.simps(4)[of "r\<^sub>1" "r\<^sub>2"]}\\
   343   \textit{(4)} & @{thm (lhs) L.simps(4)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & 
   315   (5) & @{thm (lhs) L.simps(5)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) L.simps(5)[of "r\<^sub>1" "r\<^sub>2"]}\\
   344         @{thm (rhs) L.simps(4)[of "r\<^sub>1" "r\<^sub>2"]}\\
   316   (6) & @{thm (lhs) L.simps(6)} & $\dn$ & @{thm (rhs) L.simps(6)}\\
   345   \textit{(5)} & @{thm (lhs) L.simps(5)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & 
       
   346         @{thm (rhs) L.simps(5)[of "r\<^sub>1" "r\<^sub>2"]}\\
       
   347   \textit{(6)} & @{thm (lhs) L.simps(6)} & $\dn$ & @{thm (rhs) L.simps(6)}\\
   317   \end{tabular}
   348   \end{tabular}
   318   \end{center}
   349   \end{center}
   319   
   350   
   320   \noindent In clause (4) we use the operation @{term "DUMMY ;;
   351   \noindent In clause \textit{(4)} we use the operation @{term "DUMMY ;;
   321   DUMMY"} for the concatenation of two languages (it is also list-append for
   352   DUMMY"} for the concatenation of two languages (it is also list-append for
   322   strings). We use the star-notation for regular expressions and for
   353   strings). We use the star-notation for regular expressions and for
   323   languages (in the last clause above). The star for languages is defined
   354   languages (in the last clause above). The star for languages is defined
   324   inductively by two clauses: @{text "(i)"} the empty string being in
   355   inductively by two clauses: @{text "(i)"} the empty string being in
   325   the star of a language and @{text "(ii)"} if @{term "s\<^sub>1"} is in a
   356   the star of a language and @{text "(ii)"} if @{term "s\<^sub>1"} is in a
   360   @{thm (lhs) nullable.simps(1)} & $\dn$ & @{thm (rhs) nullable.simps(1)}\\
   391   @{thm (lhs) nullable.simps(1)} & $\dn$ & @{thm (rhs) nullable.simps(1)}\\
   361   @{thm (lhs) nullable.simps(2)} & $\dn$ & @{thm (rhs) nullable.simps(2)}\\
   392   @{thm (lhs) nullable.simps(2)} & $\dn$ & @{thm (rhs) nullable.simps(2)}\\
   362   @{thm (lhs) nullable.simps(3)} & $\dn$ & @{thm (rhs) nullable.simps(3)}\\
   393   @{thm (lhs) nullable.simps(3)} & $\dn$ & @{thm (rhs) nullable.simps(3)}\\
   363   @{thm (lhs) nullable.simps(4)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) nullable.simps(4)[of "r\<^sub>1" "r\<^sub>2"]}\\
   394   @{thm (lhs) nullable.simps(4)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) nullable.simps(4)[of "r\<^sub>1" "r\<^sub>2"]}\\
   364   @{thm (lhs) nullable.simps(5)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) nullable.simps(5)[of "r\<^sub>1" "r\<^sub>2"]}\\
   395   @{thm (lhs) nullable.simps(5)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) nullable.simps(5)[of "r\<^sub>1" "r\<^sub>2"]}\\
   365   @{thm (lhs) nullable.simps(6)} & $\dn$ & @{thm (rhs) nullable.simps(6)}%\medskip\\
   396   @{thm (lhs) nullable.simps(6)} & $\dn$ & @{thm (rhs) nullable.simps(6)}\medskip\\
   366   \end{tabular}
   397 
   367   \end{center}
   398 %  \end{tabular}
   368 
   399 %  \end{center}
   369   \begin{center}
   400 
   370   \begin{tabular}{lcl}
   401 %  \begin{center}
       
   402 %  \begin{tabular}{lcl}
       
   403 
   371   @{thm (lhs) der.simps(1)} & $\dn$ & @{thm (rhs) der.simps(1)}\\
   404   @{thm (lhs) der.simps(1)} & $\dn$ & @{thm (rhs) der.simps(1)}\\
   372   @{thm (lhs) der.simps(2)} & $\dn$ & @{thm (rhs) der.simps(2)}\\
   405   @{thm (lhs) der.simps(2)} & $\dn$ & @{thm (rhs) der.simps(2)}\\
   373   @{thm (lhs) der.simps(3)} & $\dn$ & @{thm (rhs) der.simps(3)}\\
   406   @{thm (lhs) der.simps(3)} & $\dn$ & @{thm (rhs) der.simps(3)}\\
   374   @{thm (lhs) der.simps(4)[of c "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) der.simps(4)[of c "r\<^sub>1" "r\<^sub>2"]}\\
   407   @{thm (lhs) der.simps(4)[of c "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) der.simps(4)[of c "r\<^sub>1" "r\<^sub>2"]}\\
   375   @{thm (lhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]}\\
   408   @{thm (lhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]}\\
   390   \noindent Given the equations in \eqref{SemDer}, it is a relatively easy
   423   \noindent Given the equations in \eqref{SemDer}, it is a relatively easy
   391   exercise in mechanical reasoning to establish that
   424   exercise in mechanical reasoning to establish that
   392 
   425 
   393   \begin{proposition}\label{derprop}\mbox{}\\ 
   426   \begin{proposition}\label{derprop}\mbox{}\\ 
   394   \begin{tabular}{ll}
   427   \begin{tabular}{ll}
   395   @{text "(1)"} & @{thm (lhs) nullable_correctness} if and only if
   428   \textit{(1)} & @{thm (lhs) nullable_correctness} if and only if
   396   @{thm (rhs) nullable_correctness}, and \\ 
   429   @{thm (rhs) nullable_correctness}, and \\ 
   397   @{text "(2)"} & @{thm[mode=IfThen] der_correctness}.
   430   \textit{(2)} & @{thm[mode=IfThen] der_correctness}.
   398   \end{tabular}
   431   \end{tabular}
   399   \end{proposition}
   432   \end{proposition}
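Both functions, and the matcher introduced next, are easily implemented in a
functional language. The following Haskell sketch is only an illustration under
assumed names (Rexp, CHR, matches, and so on); the clause of der for STAR is the
standard one, which is also quoted later in the text.

  data Rexp = ZERO | ONE | CHR Char
            | ALT Rexp Rexp | SEQ Rexp Rexp | STAR Rexp

  -- can the regular expression match the empty string?
  nullable :: Rexp -> Bool
  nullable ZERO        = False
  nullable ONE         = True
  nullable (CHR _)     = False
  nullable (ALT r1 r2) = nullable r1 || nullable r2
  nullable (SEQ r1 r2) = nullable r1 && nullable r2
  nullable (STAR _)    = True

  -- Brzozowski derivative of a regular expression w.r.t. the character c
  der :: Char -> Rexp -> Rexp
  der _ ZERO        = ZERO
  der _ ONE         = ZERO
  der c (CHR d)     = if c == d then ONE else ZERO
  der c (ALT r1 r2) = ALT (der c r1) (der c r2)
  der c (SEQ r1 r2) = if nullable r1
                      then ALT (SEQ (der c r1) r2) (der c r2)
                      else SEQ (der c r1) r2
  der c (STAR r)    = SEQ (der c r) (STAR r)

  -- derivative w.r.t. a string, and the resulting matcher
  ders :: String -> Rexp -> Rexp
  ders s r = foldl (flip der) r s

  matches :: Rexp -> String -> Bool
  matches r s = nullable (ders s r)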
   400 
   433 
   401   \noindent With this in place it is also very routine to prove that the
   434   \noindent With this in place it is also very routine to prove that the
   402   regular expression matcher defined as
   435   regular expression matcher defined as
   455   @{thm (lhs) flat.simps(6)} & $\dn$ & @{thm (rhs) flat.simps(6)}\\
   488   @{thm (lhs) flat.simps(6)} & $\dn$ & @{thm (rhs) flat.simps(6)}\\
   456   @{thm (lhs) flat.simps(7)} & $\dn$ & @{thm (rhs) flat.simps(7)}\\
   489   @{thm (lhs) flat.simps(7)} & $\dn$ & @{thm (rhs) flat.simps(7)}\\
   457   \end{tabular}
   490   \end{tabular}
   458   \end{center}
   491   \end{center}
   459 
   492 
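Continuing the Haskell sketch (an illustration only, using the assumed Val
datatype from above), the flattening function, and the flats variant for lists
of values mentioned just below, could be written as:

  -- the underlying string of a value ("flattening")
  flat :: Val -> String
  flat Empty      = []
  flat (Chr c)    = [c]
  flat (Lft v)    = flat v
  flat (Rgt v)    = flat v
  flat (Sq v1 v2) = flat v1 ++ flat v2
  flat (Stars vs) = flats vs

  -- flattening a list of values and concatenating the results
  flats :: [Val] -> String
  flats = concatMap flat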
   460   \noindent Sulzmann and Lu also define inductively an inhabitation relation
   493   \noindent We will sometimes refer to the underlying string of a
   461   that associates values to regular expressions. We define this relation as 
    494   value as the \emph{flattened value}.  We will also overload our notation and 
   462   follows:\footnote{Note that the rule for @{term Stars} differs from our 
   495   use @{term "flats vs"} for flattening a list of values and concatenating
   463   earlier paper \cite{AusafDyckhoffUrban2016}. There we used the original
   496   the resulting strings.
   464   definition by Sulzmann and Lu which does not require that the values @{term "v \<in> set vs"}
   497   
   465   flatten to a non-empty string. The reason for introducing the 
   498   Sulzmann and Lu define
   466   more restricted version of lexical values is convenience later 
   499   inductively an \emph{inhabitation relation} that associates values to
   467   on when reasoning about 
   500   regular expressions. We define this relation as
   468   an ordering relation for values.} 
   501   follows:\footnote{Note that the rule for @{term Stars} differs from
       
   502   our earlier paper \cite{AusafDyckhoffUrban2016}. There we used the
       
   503   original definition by Sulzmann and Lu which does not require that
       
   504   the values @{term "v \<in> set vs"} flatten to a non-empty
       
   505   string. The reason for introducing the more restricted version of
       
   506   lexical values is convenience later on when reasoning about an
       
   507   ordering relation for values.}
   469 
   508 
   470   \begin{center}
   509   \begin{center}
   471   \begin{tabular}{c@ {\hspace{12mm}}c}
   510   \begin{tabular}{c@ {\hspace{12mm}}c}
   472   \\[-8mm]
   511   \\[-8mm]
   473   @{thm[mode=Axiom] Prf.intros(4)} & 
   512   @{thm[mode=Axiom] Prf.intros(4)} & 
   499   \noindent
   538   \noindent
   500   Given a regular expression @{text r} and a string @{text s}, we define the 
   539   Given a regular expression @{text r} and a string @{text s}, we define the 
   501   set of all \emph{Lexical Values} inhabited by @{text r} with the underlying string 
   540   set of all \emph{Lexical Values} inhabited by @{text r} with the underlying string 
   502   being @{text s}:\footnote{Okui and Suzuki refer to our lexical values 
   541   being @{text s}:\footnote{Okui and Suzuki refer to our lexical values 
   503   as \emph{canonical values} in \cite{OkuiSuzuki2010}. The notion of \emph{non-problematic
   542   as \emph{canonical values} in \cite{OkuiSuzuki2010}. The notion of \emph{non-problematic
   504   values} by Cardelli and Frisch \cite{Frisch2004} is similar, but not identical
   543   values} by Cardelli and Frisch \cite{Frisch2004} is related, but not identical
   505   to our lexical values.}
   544   to our lexical values.}
   506   
   545   
   507   \begin{center}
   546   \begin{center}
   508   @{thm LV_def}
   547   @{thm LV_def}
   509   \end{center}
   548   \end{center}
   517   \noindent This finiteness property does not hold in general if we
   556   \noindent This finiteness property does not hold in general if we
   518   remove the side-condition about @{term "flat v \<noteq> []"} in the
   557   remove the side-condition about @{term "flat v \<noteq> []"} in the
   519   @{term Stars}-rule above. For example using Sulzmann and Lu's
   558   @{term Stars}-rule above. For example using Sulzmann and Lu's
   520   less restrictive definition, @{term "LV (STAR ONE) []"} would contain
   559   less restrictive definition, @{term "LV (STAR ONE) []"} would contain
   521   infinitely many values, but according to our more restricted
   560   infinitely many values, but according to our more restricted
   522   definition @{thm LV_STAR_ONE_empty}.
   561   definition only a single value, namely @{thm LV_STAR_ONE_empty}.
   523 
   562 
   524   If a regular expression @{text r} matches a string @{text s}, then
   563   If a regular expression @{text r} matches a string @{text s}, then
   525   generally the set @{term "LV r s"} is not just a singleton set.  In
   564   generally the set @{term "LV r s"} is not just a singleton set.  In
   526   case of POSIX matching the problem is to calculate the unique lexical value
   565   case of POSIX matching the problem is to calculate the unique lexical value
   527   that satisfies the (informal) POSIX rules from the Introduction.
   566   that satisfies the (informal) POSIX rules from the Introduction.
   616   expressions and by analysing the shape of values (corresponding to 
   655   expressions and by analysing the shape of values (corresponding to 
   617   the derivative regular expressions).
   656   the derivative regular expressions).
   618   %
   657   %
   619   \begin{center}
   658   \begin{center}
   620   \begin{tabular}{l@ {\hspace{5mm}}lcl}
   659   \begin{tabular}{l@ {\hspace{5mm}}lcl}
   621   (1) & @{thm (lhs) injval.simps(1)} & $\dn$ & @{thm (rhs) injval.simps(1)}\\
   660   \textit{(1)} & @{thm (lhs) injval.simps(1)} & $\dn$ & @{thm (rhs) injval.simps(1)}\\
   622   (2) & @{thm (lhs) injval.simps(2)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>1"]} & $\dn$ & 
   661   \textit{(2)} & @{thm (lhs) injval.simps(2)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>1"]} & $\dn$ & 
   623       @{thm (rhs) injval.simps(2)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>1"]}\\
   662       @{thm (rhs) injval.simps(2)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>1"]}\\
   624   (3) & @{thm (lhs) injval.simps(3)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>2"]} & $\dn$ & 
   663   \textit{(3)} & @{thm (lhs) injval.simps(3)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>2"]} & $\dn$ & 
   625       @{thm (rhs) injval.simps(3)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>2"]}\\
   664       @{thm (rhs) injval.simps(3)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>2"]}\\
   626   (4) & @{thm (lhs) injval.simps(4)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>1" "v\<^sub>2"]} & $\dn$ 
   665   \textit{(4)} & @{thm (lhs) injval.simps(4)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>1" "v\<^sub>2"]} & $\dn$ 
   627       & @{thm (rhs) injval.simps(4)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>1" "v\<^sub>2"]}\\
   666       & @{thm (rhs) injval.simps(4)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>1" "v\<^sub>2"]}\\
   628   (5) & @{thm (lhs) injval.simps(5)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>1" "v\<^sub>2"]} & $\dn$ 
   667   \textit{(5)} & @{thm (lhs) injval.simps(5)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>1" "v\<^sub>2"]} & $\dn$ 
   629       & @{thm (rhs) injval.simps(5)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>1" "v\<^sub>2"]}\\
   668       & @{thm (rhs) injval.simps(5)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>1" "v\<^sub>2"]}\\
   630   (6) & @{thm (lhs) injval.simps(6)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>2"]} & $\dn$ 
   669   \textit{(6)} & @{thm (lhs) injval.simps(6)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>2"]} & $\dn$ 
   631       & @{thm (rhs) injval.simps(6)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>2"]}\\
   670       & @{thm (rhs) injval.simps(6)[of "r\<^sub>1" "r\<^sub>2" "c" "v\<^sub>2"]}\\
   632   (7) & @{thm (lhs) injval.simps(7)[of "r" "c" "v" "vs"]} & $\dn$ 
   671   \textit{(7)} & @{thm (lhs) injval.simps(7)[of "r" "c" "v" "vs"]} & $\dn$ 
   633       & @{thm (rhs) injval.simps(7)[of "r" "c" "v" "vs"]}\\
   672       & @{thm (rhs) injval.simps(7)[of "r" "c" "v" "vs"]}\\
   634   \end{tabular}
   673   \end{tabular}
   635   \end{center}
   674   \end{center}
   636 
   675 
   637   \noindent To better understand what is going on in this definition it
   676   \noindent To better understand what is going on in this definition it
   638   might be instructive to look first at the three sequence cases (clauses
   677   might be instructive to look first at the three sequence cases (clauses
   639   (4)--(6)). In each case we need to construct an ``injected value'' for
   678   \textit{(4)} -- \textit{(6)}). In each case we need to construct an ``injected value'' for
   640   @{term "SEQ r\<^sub>1 r\<^sub>2"}. This must be a value of the form @{term
   679   @{term "SEQ r\<^sub>1 r\<^sub>2"}. This must be a value of the form @{term
   641   "Seq DUMMY DUMMY"}\,. Recall the clause of the @{text derivative}-function
   680   "Seq DUMMY DUMMY"}\,. Recall the clause of the @{text derivative}-function
   642   for sequence regular expressions:
   681   for sequence regular expressions:
   643 
   682 
   644   \begin{center}
   683   \begin{center}
   646   \end{center}
   685   \end{center}
   647 
   686 
   648   \noindent Consider first the @{text "else"}-branch where the derivative is @{term
   687   \noindent Consider first the @{text "else"}-branch where the derivative is @{term
   649   "SEQ (der c r\<^sub>1) r\<^sub>2"}. The corresponding value must therefore
   688   "SEQ (der c r\<^sub>1) r\<^sub>2"}. The corresponding value must therefore
   650   be of the form @{term "Seq v\<^sub>1 v\<^sub>2"}, which matches the left-hand
   689   be of the form @{term "Seq v\<^sub>1 v\<^sub>2"}, which matches the left-hand
   651   side in clause~(4) of @{term inj}. In the @{text "if"}-branch the derivative is an
   690   side in clause~\textit{(4)} of @{term inj}. In the @{text "if"}-branch the derivative is an
   652   alternative, namely @{term "ALT (SEQ (der c r\<^sub>1) r\<^sub>2) (der c
   691   alternative, namely @{term "ALT (SEQ (der c r\<^sub>1) r\<^sub>2) (der c
   653   r\<^sub>2)"}. This means we either have to consider a @{text Left}- or
   692   r\<^sub>2)"}. This means we either have to consider a @{text Left}- or
   654   @{text Right}-value. In case of the @{text Left}-value we know further it
   693   @{text Right}-value. In case of the @{text Left}-value we know further it
   655   must be a value for a sequence regular expression. Therefore the pattern
   694   must be a value for a sequence regular expression. Therefore the pattern
   656   we match in the clause (5) is @{term "Left (Seq v\<^sub>1 v\<^sub>2)"},
   695   we match in the clause \textit{(5)} is @{term "Left (Seq v\<^sub>1 v\<^sub>2)"},
   657   while in (6) it is just @{term "Right v\<^sub>2"}. One more interesting
   696   while in \textit{(6)} it is just @{term "Right v\<^sub>2"}. One more interesting
   658   point is in the right-hand side of clause (6): since in this case the
   697   point is in the right-hand side of clause \textit{(6)}: since in this case the
   659   regular expression @{text "r\<^sub>1"} does not ``contribute'' to
   698   regular expression @{text "r\<^sub>1"} does not ``contribute'' to
    660   matching the string (it can only match the empty string), we need to
    699   matching the string (it can only match the empty string), we need to
   661   call @{const mkeps} in order to construct a value for how @{term "r\<^sub>1"}
   700   call @{const mkeps} in order to construct a value for how @{term "r\<^sub>1"}
   662   can match this empty string. A similar argument applies for why we can
   701   can match this empty string. A similar argument applies for why we can
   663   expect in the left-hand side of clause (7) that the value is of the form
   702   expect in the left-hand side of clause \textit{(7)} that the value is of the form
   664   @{term "Seq v (Stars vs)"}---the derivative of a star is @{term "SEQ (der c r)
   703   @{term "Seq v (Stars vs)"}---the derivative of a star is @{term "SEQ (der c r)
    665   (STAR r)"}. Finally, the reason why we can ignore the second argument
    704   (STAR r)"}. Finally, the reason why we can ignore the second argument
   666   in clause (1) of @{term inj} is that it will only ever be called in cases
   705   in clause \textit{(1)} of @{term inj} is that it will only ever be called in cases
   667   where @{term "c=d"}, but the usual linearity restrictions in patterns do
   706   where @{term "c=d"}, but the usual linearity restrictions in patterns do
   668   not allow us to build this constraint explicitly into our function
   707   not allow us to build this constraint explicitly into our function
   669   definition.\footnote{Sulzmann and Lu state this clause as @{thm (lhs)
   708   definition.\footnote{Sulzmann and Lu state this clause as @{thm (lhs)
   670   injval.simps(1)[of "c" "c"]} $\dn$ @{thm (rhs) injval.simps(1)[of "c"]},
   709   injval.simps(1)[of "c" "c"]} $\dn$ @{thm (rhs) injval.simps(1)[of "c"]},
   671   but our deviation is harmless.}
   710   but our deviation is harmless.}
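As a further illustration, clauses (1)--(7) of the injection function, together
with the mkeps function used in clause (6), can be rendered in the Haskell
sketch from above as follows (the clauses of mkeps are the standard ones by
Sulzmann and Lu; all names remain assumptions of this sketch):

  -- mkeps constructs a value showing how a nullable r matches the empty string
  mkeps :: Rexp -> Val
  mkeps ONE         = Empty
  mkeps (ALT r1 r2) = if nullable r1 then Lft (mkeps r1) else Rgt (mkeps r2)
  mkeps (SEQ r1 r2) = Sq (mkeps r1) (mkeps r2)
  mkeps (STAR _)    = Stars []

  -- inj r c v injects the character c back into the value v
  inj :: Rexp -> Char -> Val -> Val
  inj (CHR d)     _ Empty             = Chr d
  inj (ALT r1 _)  c (Lft v1)          = Lft (inj r1 c v1)
  inj (ALT _ r2)  c (Rgt v2)          = Rgt (inj r2 c v2)
  inj (SEQ r1 _)  c (Sq v1 v2)        = Sq (inj r1 c v1) v2
  inj (SEQ r1 _)  c (Lft (Sq v1 v2))  = Sq (inj r1 c v1) v2
  inj (SEQ r1 r2) c (Rgt v2)          = Sq (mkeps r1) (inj r2 c v2)
  inj (STAR r)    c (Sq v (Stars vs)) = Stars (inj r c v : vs)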
   710   virtue of this algorithm is that it can be implemented with ease in any
   749   virtue of this algorithm is that it can be implemented with ease in any
   711   functional programming language and also in Isabelle/HOL. In the remaining
   750   functional programming language and also in Isabelle/HOL. In the remaining
   712   part of this section we prove that this algorithm is correct.
   751   part of this section we prove that this algorithm is correct.
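Since the functions above are already in functional style, the top-level lexer
can be sketched in Haskell directly (again only under the assumed names from the
earlier sketches):

  -- build derivatives while consuming the string, then inject the characters back
  lexer :: Rexp -> String -> Maybe Val
  lexer r []     = if nullable r then Just (mkeps r) else Nothing
  lexer r (c:cs) = case lexer (der c r) cs of
                     Nothing -> Nothing
                     Just v  -> Just (inj r c v)

For instance, applying lexer to the regular expression (x + (y + xy))* and the
string xy is expected to return the single-iteration value oneIteration from the
sketch in the Introduction.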
   713 
   752 
   714   The well-known idea of POSIX matching is informally defined by some
   753   The well-known idea of POSIX matching is informally defined by some
   715   rules such as the longest match and priority rule (see
   754   rules such as the Longest Match and Priority Rules (see
   716   Introduction); as correctly argued in \cite{Sulzmann2014}, this
   755   Introduction); as correctly argued in \cite{Sulzmann2014}, this
   717   needs formal specification. Sulzmann and Lu define an ``ordering
   756   needs formal specification. Sulzmann and Lu define an ``ordering
   718   relation'' between values and argue that there is a maximum value,
   757   relation'' between values and argue that there is a maximum value,
   719   as given by the derivative-based algorithm.  In contrast, we shall
   758   as given by the derivative-based algorithm.  In contrast, we shall
   720   introduce a simple inductive definition that specifies directly what
   759   introduce a simple inductive definition that specifies directly what
   721   a \emph{POSIX value} is, incorporating the POSIX-specific choices
   760   a \emph{POSIX value} is, incorporating the POSIX-specific choices
   722   into the side-conditions of our rules. Our definition is inspired by
   761   into the side-conditions of our rules. Our definition is inspired by
   723   the matching relation given by Vansummeren
   762   the matching relation given by Vansummeren~\cite{Vansummeren2006}. 
   724   \cite{Vansummeren2006}. The relation we define is ternary and
   763   The relation we define is ternary and
   725   written as \mbox{@{term "s \<in> r \<rightarrow> v"}}, relating
   764   written as \mbox{@{term "s \<in> r \<rightarrow> v"}}, relating
   726   strings, regular expressions and values; the inductive rules are given in 
   765   strings, regular expressions and values; the inductive rules are given in 
   727   Figure~\ref{POSIXrules}.
   766   Figure~\ref{POSIXrules}.
   728   We can prove that given a string @{term s} and regular expression @{term
   767   We can prove that given a string @{term s} and regular expression @{term
   729    r}, the POSIX value @{term v} is uniquely determined by @{term "s \<in> r \<rightarrow> v"}.
   768    r}, the POSIX value @{term v} is uniquely determined by @{term "s \<in> r \<rightarrow> v"}.
   800   "P\<star>"}-rule. Also there we want that @{term "s\<^sub>1"} is the longest initial
   839   "P\<star>"}-rule. Also there we want that @{term "s\<^sub>1"} is the longest initial
   801   split of @{term "s\<^sub>1 @ s\<^sub>2"} and furthermore the corresponding value
   840   split of @{term "s\<^sub>1 @ s\<^sub>2"} and furthermore the corresponding value
   802   @{term v} cannot be flattened to the empty string. In effect, we require
   841   @{term v} cannot be flattened to the empty string. In effect, we require
   803   that in each ``iteration'' of the star, some non-empty substring needs to
   842   that in each ``iteration'' of the star, some non-empty substring needs to
   804   be ``chipped'' away; only in case of the empty string we accept @{term
   843   be ``chipped'' away; only in case of the empty string we accept @{term
   805   "Stars []"} as the POSIX value. Indeed we can show that our POSIX value
   844   "Stars []"} as the POSIX value. Indeed we can show that our POSIX values
   806   is a lexical value which excludes those @{text Stars} containing subvalues 
   845   are lexical values which exclude those @{text Stars} that contain subvalues 
   807   that flatten to the empty string.
   846   that flatten to the empty string.
   808 
   847 
   809   \begin{lemma}\label{LVposix}
   848   \begin{lemma}\label{LVposix}
   810   @{thm [mode=IfThen] Posix_LV}
   849   @{thm [mode=IfThen] Posix_LV}
   811   \end{lemma}
   850   \end{lemma}
   907 
   946 
   908   \begin{proof}
   947   \begin{proof}
   909   By induction on @{term s} using Lemma~\ref{lemmkeps} and \ref{Posix2}.\qed  
   948   By induction on @{term s} using Lemma~\ref{lemmkeps} and \ref{Posix2}.\qed  
   910   \end{proof}
   949   \end{proof}
   911 
   950 
   912   \noindent In (2) we further know by Theorem~\ref{posixdeterm} that the
   951   \noindent In \textit{(2)} we further know by Theorem~\ref{posixdeterm} that the
   913   value returned by the lexer must be unique.   A simple corollary 
   952   value returned by the lexer must be unique.   A simple corollary 
   914   of our two theorems is:
   953   of our two theorems is:
   915 
   954 
   916   \begin{corollary}\mbox{}\smallskip\\\label{lexercorrectcor}
   955   \begin{corollary}\mbox{}\smallskip\\\label{lexercorrectcor}
   917   \begin{tabular}{ll}
   956   \begin{tabular}{ll}
   940   as the maximal elements.  An extended version of \cite{Sulzmann2014}
   979   as the maximal elements.  An extended version of \cite{Sulzmann2014}
   941   is available at the website of its first author; this includes more
   980   is available at the website of its first author; this includes more
    942   details of their proofs, which are evidently not yet in final
    981   details of their proofs, which are evidently not yet in final
    943   form. Unfortunately, we were not able to verify claims that their
    982   form. Unfortunately, we were not able to verify claims that their
   944   ordering has properties such as being transitive or having maximal
   983   ordering has properties such as being transitive or having maximal
   945   elements.
   984   elements. 
   946  
   985  
   947   Okui and Suzuki \cite{OkuiSuzuki2010,OkuiSuzukiTech} described
   986   Okui and Suzuki \cite{OkuiSuzuki2010,OkuiSuzukiTech} described
   948   another ordering of values, which they use to establish the
   987   another ordering of values, which they use to establish the
   949   correctness of their automata-based algorithm for POSIX matching.
   988   correctness of their automata-based algorithm for POSIX matching.
   950   Their ordering resembles some aspects of the one given by Sulzmann
   989   Their ordering resembles some aspects of the one given by Sulzmann
   951   and Lu, but is quite different. To begin with, Okui and Suzuki
   990   and Lu, but overall is quite different. To begin with, Okui and
   952   identify POSIX values as minimal, rather than maximal, elements in
   991   Suzuki identify POSIX values as minimal, rather than maximal,
   953   their ordering. A more substantial difference is that the ordering
   992   elements in their ordering. A more substantial difference is that
   954   by Okui and Suzuki uses \emph{positions} in order to identify and
   993   the ordering by Okui and Suzuki uses \emph{positions} in order to
   955   compare subvalues. Positions are lists of natural numbers. This
   994   identify and compare subvalues. Positions are lists of natural
   956   allows them to quite naturally formalise the Longest Match and
   995   numbers. This allows them to quite naturally formalise the Longest
   957   Priority rules of the informal POSIX standard.  Consider for example
   996   Match and Priority rules of the informal POSIX standard.  Consider
   958   the value @{term v}
   997   for example the value @{term v}
   959 
   998 
   960   \begin{center}
   999   \begin{center}
   961   @{term "v == Stars [Seq (Char x) (Char y), Char z]"}
  1000   @{term "v == Stars [Seq (Char x) (Char y), Char z]"}
   962   \end{center}
  1001   \end{center}
   963 
  1002 
   964   \noindent
  1003   \noindent
   965   At position @{text "[0,1]"} of this value is the
  1004   At position @{text "[0,1]"} of this value is the
   966   subvalue @{text "Char y"} and at position @{text "[1]"} the
  1005   subvalue @{text "Char y"} and at position @{text "[1]"} the
   967   subvalue @{term "Char z"}.  At the `root' position, or empty list
  1006   subvalue @{term "Char z"}.  At the `root' position, or empty list
   968   @{term "[]"}, is the whole value @{term v}. The positions @{text
  1007   @{term "[]"}, is the whole value @{term v}. Positions such as @{text
   969   "[0,1,0]"} and @{text "[2]"}, for example, are outside of @{text
  1008   "[0,1,0]"} or @{text "[2]"} are outside of @{text
   970   v}. If it exists, the subvalue of @{term v} at a position @{text
  1009   v}. If it exists, the subvalue of @{term v} at a position @{text
   971   p}, written @{term "at v p"}, can be recursively defined by
  1010   p}, written @{term "at v p"}, can be recursively defined by
   972   
  1011   
   973   \begin{center}
  1012   \begin{center}
   974   \begin{tabular}{r@ {\hspace{0mm}}lcl}
  1013   \begin{tabular}{r@ {\hspace{0mm}}lcl}
   985   \end{tabular} 
  1024   \end{tabular} 
   986   \end{center}
  1025   \end{center}
   987 
  1026 
   988   \noindent In the last clause we use Isabelle's notation @{term "vs ! n"} for the
  1027   \noindent In the last clause we use Isabelle's notation @{term "vs ! n"} for the
   989   @{text n}th element in a list.  The set of positions inside a value @{text v},
  1028   @{text n}th element in a list.  The set of positions inside a value @{text v},
   990   written @{term "Pos v"}, is given by the clauses
  1029   written @{term "Pos v"}, is given by 
   991 
  1030 
   992   \begin{center}
  1031   \begin{center}
   993   \begin{tabular}{lcl}
  1032   \begin{tabular}{lcl}
   994   @{thm (lhs) Pos.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) Pos.simps(1)}\\
  1033   @{thm (lhs) Pos.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) Pos.simps(1)}\\
   995   @{thm (lhs) Pos.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) Pos.simps(2)}\\
  1034   @{thm (lhs) Pos.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) Pos.simps(2)}\\
  1001   @{thm (lhs) Pos_stars} & @{text "\<equiv>"} & @{thm (rhs) Pos_stars}\\
  1040   @{thm (lhs) Pos_stars} & @{text "\<equiv>"} & @{thm (rhs) Pos_stars}\\
  1002   \end{tabular}
  1041   \end{tabular}
  1003   \end{center}
  1042   \end{center}
  1004 
  1043 
  1005   \noindent 
  1044   \noindent 
  1006   whereby @{text len} stands for the length of a list. Clearly
  1045   whereby @{text len} in the last clause stands for the length of a list. Clearly
  1007   for every position inside a value there exists a subvalue at that position.
  1046   for every position inside a value there exists a subvalue at that position.
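In the Haskell sketch from above (still only an illustration), the subvalue
function and the set of positions can be approximated by the two functions
below, where positions are lists of Ints and the set of positions is
represented as a list:

  -- the subvalue at position p (partial: only defined for positions inside v)
  at :: Val -> [Int] -> Val
  at v          []     = v
  at (Lft v)    (0:ps) = at v ps
  at (Rgt v)    (1:ps) = at v ps
  at (Sq v1 _)  (0:ps) = at v1 ps
  at (Sq _ v2)  (1:ps) = at v2 ps
  at (Stars vs) (n:ps) = at (vs !! n) ps

  -- all positions inside a value
  pos :: Val -> [[Int]]
  pos Empty      = [[]]
  pos (Chr _)    = [[]]
  pos (Lft v)    = [] : [0:p | p <- pos v]
  pos (Rgt v)    = [] : [1:p | p <- pos v]
  pos (Sq v1 v2) = [] : [0:p | p <- pos v1] ++ [1:p | p <- pos v2]
  pos (Stars vs) = [] : [n:p | (n, v) <- zip [0..] vs, p <- pos v]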
  1008  
  1047  
  1009 
  1048 
  1010   To help understanding the ordering of Okui and Suzuki, consider again 
  1049   To help understanding the ordering of Okui and Suzuki, consider again 
  1011   the earlier value
  1050   the earlier value
  1017   @{term "w == Stars [Char x, Char y, Char z]"}  
  1056   @{term "w == Stars [Char x, Char y, Char z]"}  
  1018   \end{tabular}
  1057   \end{tabular}
  1019   \end{center}
  1058   \end{center}
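Under the Haskell sketch (with its assumed constructor names), these two values
and the comparison discussed next can be checked directly:

  vEx, wEx :: Val
  vEx = Stars [Sq (Chr 'x') (Chr 'y'), Chr 'z']
  wEx = Stars [Chr 'x', Chr 'y', Chr 'z']

  -- flat vEx == "xyz" and flat wEx == "xyz", but at position [0]
  -- flat (at vEx [0]) == "xy"  whereas  flat (at wEx [0]) == "x"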
  1020 
  1059 
  1021   \noindent Both values match the string @{text "xyz"}, that means if
  1060   \noindent Both values match the string @{text "xyz"}, that means if
  1022   we flatten the values at their respective root position, we obtain
  1061   we flatten these values at their respective root position, we obtain
  1023   @{text "xyz"}. However, at position @{text "[0]"}, @{text v} matches
  1062   @{text "xyz"}. However, at position @{text "[0]"}, @{text v} matches
  1024   @{text xy} whereas @{text w} matches only the shorter @{text x}. So
  1063   @{text xy} whereas @{text w} matches only the shorter @{text x}. So
  1025   according to the Longest Match Rule, we should prefer @{text v},
  1064   according to the Longest Match Rule, we should prefer @{text v},
  1026   rather than @{text w} as POSIX value for string @{text xyz} (and
  1065   rather than @{text w} as POSIX value for string @{text xyz} (and
  1027   corresponding regular expression). In order to
  1066   corresponding regular expression). In order to
  1044 
  1083 
  1045   \begin{center}
  1084   \begin{center}
  1046   @{term "v == Left (Char x)"} \qquad and \qquad @{term "w == Right (Char x)"}
  1085   @{term "v == Left (Char x)"} \qquad and \qquad @{term "w == Right (Char x)"}
  1047   \end{center}
  1086   \end{center}
  1048 
  1087 
  1049   \noindent Both values match @{text x}, but at position @{text "[0]"}
  1088   \noindent Both values match @{text x}. At position @{text "[0]"}
  1050   the norm of @{term v} is @{text 1} (the subvalue matches @{text x}),
  1089   the norm of @{term v} is @{text 1} (the subvalue matches @{text x}),
  1051   but the norm of @{text w} is @{text "-1"} (the position is outside
  1090   but the norm of @{text w} is @{text "-1"} (the position is outside
  1052   @{text w} according to how we defined the `inside' positions of
  1091   @{text w} according to how we defined the `inside' positions of
  1053   @{text Left}- and @{text Right}-values).  Of course at position
  1092   @{text Left}- and @{text Right}-values).  Of course at position
  1054   @{text "[1]"}, the norms @{term "pflat_len v [1]"} and @{term
  1093   @{text "[1]"}, the norms @{term "pflat_len v [1]"} and @{term
  1055   "pflat_len w [1]"} are reversed, but the point is that subvalues
  1094   "pflat_len w [1]"} are reversed, but the point is that subvalues
  1056   will be analysed according to lexicographically ordered
  1095   will be analysed according to lexicographically ordered
  1057   positions. According to this ordering, the position @{text "[0]"}
  1096   positions. According to this ordering, the position @{text "[0]"}
  1058   takes precedence.  The lexicographic ordering of positions, written
  1097   takes precedence over @{text "[1]"} and thus also @{text v} will be 
       
  1098   preferred over @{text w}.  The lexicographic ordering of positions, written
  1059   @{term "DUMMY \<sqsubset>lex DUMMY"}, can be conveniently formalised
  1099   @{term "DUMMY \<sqsubset>lex DUMMY"}, can be conveniently formalised
  1060   by three inference rules
  1100   by three inference rules
  1061 
  1101 
  1062   \begin{center}
  1102   \begin{center}
  1063   \begin{tabular}{ccc}
  1103   \begin{tabular}{ccc}
  1068   \end{tabular}
  1108   \end{tabular}
  1069   \end{center}
  1109   \end{center}
  1070 
  1110 
  1071   With the norm and lexicographic order in place,
  1111   With the norm and lexicographic order in place,
  1072   we can state the key definition of Okui and Suzuki
  1112   we can state the key definition of Okui and Suzuki
  1073   \cite{OkuiSuzuki2010}: a value @{term "v\<^sub>1"} is \emph{smaller} than
  1113   \cite{OkuiSuzuki2010}: a value @{term "v\<^sub>1"} is \emph{smaller at position @{text p}} than
  1074   @{term "v\<^sub>2"} if and only if  $(i)$ the norm at position @{text p} is
  1114   @{term "v\<^sub>2"}, written @{term "v\<^sub>1 \<sqsubset>val p v\<^sub>2"}, 
       
  1115   if and only if  $(i)$ the norm at position @{text p} is
  1075   greater in @{term "v\<^sub>1"} (that is the string @{term "flat (at v\<^sub>1 p)"} is longer 
  1116   greater in @{term "v\<^sub>1"} (that is the string @{term "flat (at v\<^sub>1 p)"} is longer 
   1076   than @{term "flat (at v\<^sub>2 p)"}) and $(ii)$ for all subvalues at 
   1117   than @{term "flat (at v\<^sub>2 p)"}) and $(ii)$ for all subvalues at 
  1077   positions that are inside @{term "v\<^sub>1"} or @{term "v\<^sub>2"} and that are
  1118   positions that are inside @{term "v\<^sub>1"} or @{term "v\<^sub>2"} and that are
  1078   lexicographically smaller than @{text p}, we have the same norm, namely
  1119   lexicographically smaller than @{text p}, we have the same norm, namely
  1079 
  1120 
  1104   \end{center}
  1145   \end{center}
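The ordering just described admits a rough executable reading in the Haskell
sketch; the following is only a sketch based on the informal description above
(the formal Isabelle definition, elided here, is the authoritative one):

  -- lexicographic ordering on positions
  lexLess :: [Int] -> [Int] -> Bool
  lexLess _      []     = False
  lexLess []     (_:_)  = True
  lexLess (m:ps) (n:qs) = m < n || (m == n && lexLess ps qs)

  -- the `norm' of v at position p: length of the flattened subvalue, or -1
  pflatLen :: Val -> [Int] -> Int
  pflatLen v p = if p `elem` pos v then length (flat (at v p)) else -1

  -- v1 is smaller than v2 at position p
  smallerAt :: Val -> Val -> [Int] -> Bool
  smallerAt v1 v2 p =
    pflatLen v1 p > pflatLen v2 p &&
    and [ pflatLen v1 q == pflatLen v2 q
        | q <- pos v1 ++ pos v2, q `lexLess` p ]

  -- v1 is strictly smaller than v2 if it is smaller at some position
  smaller :: Val -> Val -> Bool
  smaller v1 v2 = any (smallerAt v1 v2) (pos v1 ++ pos v2)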
  1105 
  1146 
  1106  While we encountered a number of obstacles for establishing properties like
  1147  While we encountered a number of obstacles for establishing properties like
   1107  transitivity for the ordering of Sulzmann and Lu (which we failed
   1148  transitivity for the ordering of Sulzmann and Lu (which we failed
  1108  to overcome), it is relatively straightforward to establish this
  1149  to overcome), it is relatively straightforward to establish this
  1109  property for the ordering by Okui and Suzuki.
  1150  property for the orderings
       
  1151  @{term "DUMMY :\<sqsubset>val DUMMY"} and @{term "DUMMY :\<sqsubseteq>val DUMMY"}  
       
  1152  by Okui and Suzuki.
  1110 
  1153 
  1111  \begin{lemma}[Transitivity]\label{transitivity}
  1154  \begin{lemma}[Transitivity]\label{transitivity}
  1112  @{thm [mode=IfThen] PosOrd_trans[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2" and ?v3.0="v\<^sub>3"]} 
  1155  @{thm [mode=IfThen] PosOrd_trans[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2" and ?v3.0="v\<^sub>3"]} 
  1113  \end{lemma}
  1156  \end{lemma}
  1114 
  1157 
  1116  and @{text q}, where the values @{text "v\<^sub>1"} and @{text
  1159  and @{text q}, where the values @{text "v\<^sub>1"} and @{text
  1117  "v\<^sub>2"} (respectively @{text "v\<^sub>2"} and @{text
  1160  "v\<^sub>2"} (respectively @{text "v\<^sub>2"} and @{text
  1118  "v\<^sub>3"}) are `distinct'.  Since @{text
  1161  "v\<^sub>3"}) are `distinct'.  Since @{text
  1119  "\<prec>\<^bsub>lex\<^esub>"} is trichotomous, we need to consider
  1162  "\<prec>\<^bsub>lex\<^esub>"} is trichotomous, we need to consider
  1120  three cases, namely @{term "p = q"}, @{term "p \<sqsubset>lex q"} and
  1163  three cases, namely @{term "p = q"}, @{term "p \<sqsubset>lex q"} and
  1121  @{term "q \<sqsubset>lex p"}. Let us look at the first case.
  1164  @{term "q \<sqsubset>lex p"}. Let us look at the first case.  Clearly
  1122  Clearly @{term "pflat_len v\<^sub>2 p < pflat_len v\<^sub>1 p"}
  1165  @{term "pflat_len v\<^sub>2 p < pflat_len v\<^sub>1 p"} and @{term
  1123  and @{term "pflat_len v\<^sub>3 p < pflat_len v\<^sub>2 p"}
  1166  "pflat_len v\<^sub>3 p < pflat_len v\<^sub>2 p"} imply @{term
  1124  imply @{term "pflat_len v\<^sub>3 p < pflat_len v\<^sub>1 p"}.
  1167  "pflat_len v\<^sub>3 p < pflat_len v\<^sub>1 p"}.  It remains to show
  1125  It remains to show for a @{term "p' \<in> Pos v\<^sub>1 \<union> Pos v\<^sub>3"}
  1168  that for a @{term "p' \<in> Pos v\<^sub>1 \<union> Pos v\<^sub>3"}
  1126  with @{term "p' \<sqsubset>lex p"} that  
   1169  with @{term "p' \<sqsubset>lex p"}, @{term "pflat_len v\<^sub>1
  1127  @{term "pflat_len v\<^sub>1 p' = pflat_len v\<^sub>3 p'"} holds.
  1170  p' = pflat_len v\<^sub>3 p'"} holds.  Suppose @{term "p' \<in> Pos
  1128  Suppose @{term "p' \<in> Pos v\<^sub>1"}, then we can infer from the 
  1171  v\<^sub>1"}, then we can infer from the first assumption that @{term
  1129  first assumption that @{term "pflat_len v\<^sub>1 p' = pflat_len v\<^sub>2 p'"}.
  1172  "pflat_len v\<^sub>1 p' = pflat_len v\<^sub>2 p'"}.  But this means
  1130  But this means that @{term "p'"} must be in  @{term "Pos v\<^sub>2"} too.
  1173  that @{term "p'"} must be in @{term "Pos v\<^sub>2"} too (the norm
  1131  Hence we can use the second assumption and infer  @{term "pflat_len v\<^sub>2 p' = pflat_len v\<^sub>3 p'"}, which concludes
  1174  cannot be @{text "-1"} given @{term "p' \<in> Pos v\<^sub>1"}).  
  1132  this case with @{term "v\<^sub>1 :\<sqsubset>val v\<^sub>3"}. 
  1175  Hence we can use the second assumption and
  1133  The reasoning in the other cases is similar.\qed
  1176  infer @{term "pflat_len v\<^sub>2 p' = pflat_len v\<^sub>3 p'"},
       
  1177  which concludes this case with @{term "v\<^sub>1 :\<sqsubset>val
       
  1178  v\<^sub>3"}.  The reasoning in the other cases is similar.\qed
  1134  \end{proof}
  1179  \end{proof}
  1135 
  1180 
  1136  \noindent It is straightforward to show that @{text "\<prec>"} and
  1181  \noindent 
  1137  $\preccurlyeq$ are partial orders.  Okui and Suzuki also show that it
  1182  The proof for $\preccurlyeq$ is similar and omitted.
  1138  is a linear order for lexical values \cite{OkuiSuzuki2010} of a given
  1183  It is also straightforward to show that @{text "\<prec>"} and
  1139  regular expression and given string, but we have not done this. It is
  1184  $\preccurlyeq$ are partial orders.  Okui and Suzuki furthermore show that they
       
  1185  are linear orderings for lexical values \cite{OkuiSuzuki2010} of a given
       
  1186  regular expression and given string, but we have not formalised this in Isabelle. It is
  1140  not essential for our results. What we are going to show below is
  1187  not essential for our results. What we are going to show below is
  1141  that for a given @{text r} and @{text s}, the ordering has a unique
  1188  that for a given @{text r} and @{text s}, the orderings have a unique
  1142  minimal element on the set @{term "LV r s"}, which is the POSIX value
  1189  minimal element on the set @{term "LV r s"}, which is the POSIX value
  1143  we defined in the previous section.
  1190  we defined in the previous section. We start with two properties that
  1144 
  1191  show how the length of a flattened value relates to the @{text "\<prec>"}-ordering.
  1145 
  1192 
  1146  Lemma 1
  1193  \begin{proposition}\mbox{}\smallskip\\\label{ordlen}
  1147 
  1194  \begin{tabular}{@ {}ll}
  1148  @{thm [mode=IfThen] PosOrd_shorterE[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2"]}
  1195  (1) &
  1149 
  1196  @{thm [mode=IfThen] PosOrd_shorterE[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2"]}\\
  1150  but in the other direction only
  1197  (2) &
  1151 
       
  1152  @{thm [mode=IfThen] PosOrd_shorterI[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2"]} 
  1198  @{thm [mode=IfThen] PosOrd_shorterI[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2"]} 
  1153 
  1199  \end{tabular} 
       
  1200  \end{proposition}
  1154  
  1201  
  1155 
  1202  \noindent Both properties follow from the definition of the ordering. Note that
  1156   Next we establish how Okui and Suzuki's ordering relates to our
   1203  \textit{(2)} entails that if the underlying string of a value, say @{term "v\<^sub>2"}, is
       
   1204  a strict prefix of the flattened string of another value, say @{term "v\<^sub>1"}, then
       
  1205  @{term "v\<^sub>1"} must be smaller than @{term "v\<^sub>2"}. For our proofs it
       
  1206  will be useful to have the following properties---in each case the underlying strings 
       
  1207  of the compared values are the same: 
       
  1208 
       
  1209   \begin{proposition}\mbox{}\smallskip\\\label{ordintros}
       
  1210   \begin{tabular}{ll}
       
  1211   \textit{(1)} & 
       
  1212   @{thm [mode=IfThen] PosOrd_Left_Right[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2"]}\\
       
  1213   \textit{(2)} & If
       
  1214   @{thm (prem 1) PosOrd_Left_eq[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2"]} \;then\;
       
  1215   @{thm (lhs) PosOrd_Left_eq[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2"]} \;iff\;
       
  1216   @{thm (rhs) PosOrd_Left_eq[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2"]}\\
       
  1217   \textit{(3)} & If
       
  1218   @{thm (prem 1) PosOrd_Right_eq[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2"]} \;then\;
       
  1219   @{thm (lhs) PosOrd_Right_eq[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2"]} \;iff\;
       
  1220   @{thm (rhs) PosOrd_Right_eq[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2"]}\\
       
  1221   \textit{(4)} & If
       
  1222   @{thm (prem 1) PosOrd_Seq_eq[where ?v2.0="v\<^sub>2" and ?w2.0="w\<^sub>2"]} \;then\;
       
  1223   @{thm (lhs) PosOrd_Seq_eq[where ?v2.0="v\<^sub>2" and ?w2.0="w\<^sub>2"]} \;iff\;
       
  1224   @{thm (rhs) PosOrd_Seq_eq[where ?v2.0="v\<^sub>2" and ?w2.0="w\<^sub>2"]}\\
       
  1225   \textit{(5)} & If
       
  1226   @{thm (prem 2) PosOrd_SeqI1[simplified, where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2" and
       
  1227                                     ?w1.0="w\<^sub>1" and ?w2.0="w\<^sub>2"]} \;and\;
       
  1228   @{thm (prem 1) PosOrd_SeqI1[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2" and
       
  1229                                     ?w1.0="w\<^sub>1" and ?w2.0="w\<^sub>2"]} \;then\;
       
  1230   @{thm (concl) PosOrd_SeqI1[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2" and
       
  1231                                    ?w1.0="w\<^sub>1" and ?w2.0="w\<^sub>2"]}\\
       
  1232   \textit{(6)} & If
       
  1233   @{thm (prem 1) PosOrd_Stars_append_eq[where ?vs1.0="vs\<^sub>1" and ?vs2.0="vs\<^sub>2"]} \;then\;
       
  1234   @{thm (lhs) PosOrd_Stars_append_eq[where ?vs1.0="vs\<^sub>1" and ?vs2.0="vs\<^sub>2"]} \;iff\;
       
  1235   @{thm (rhs) PosOrd_Stars_append_eq[where ?vs1.0="vs\<^sub>1" and ?vs2.0="vs\<^sub>2"]}\\  
       
  1236   
       
  1237   \textit{(7)} & If
       
  1238   @{thm (prem 2) PosOrd_StarsI[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2" and
       
  1239                             ?vs1.0="vs\<^sub>1" and ?vs2.0="vs\<^sub>2"]} \;and\;
       
  1240   @{thm (prem 1) PosOrd_StarsI[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2" and
       
  1241                             ?vs1.0="vs\<^sub>1" and ?vs2.0="vs\<^sub>2"]} \;then\;
       
  1242    @{thm (concl) PosOrd_StarsI[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2" and
       
  1243                             ?vs1.0="vs\<^sub>1" and ?vs2.0="vs\<^sub>2"]}\\
       
  1244   \end{tabular} 
       
  1245   \end{proposition}
       
  1246 
       
  1247   \noindent One might prefer that statements \textit{(4)} and \textit{(5)} 
       
  1248   (respectively \textit{(6)} and \textit{(7)})
       
  1249   are combined into a single \textit{iff}-statement (like the ones for @{text
       
  1250   Left} and @{text Right}). Unfortunately this cannot be done easily: such
       
  1251   a single statement would require an additional assumption about the
       
  1252   two values @{term "Seq v\<^sub>1 v\<^sub>2"} and @{term "Seq w\<^sub>1 w\<^sub>2"}
       
  1253   being inhabited by the same regular expression. The
       
  1254   complexity of the proofs involved seems to not justify such a
       
  1255   `cleaner' single statement. The statements given are just the properties that
       
  1256   allow us to establish our theorems. The proofs for Proposition~\ref{ordintros}
       
  1257   are routine.
       
  1258  
       
  1259 
       
  1260   Next we establish how Okui and Suzuki's orderings relate to our
  1157   definition of POSIX values.  Given a @{text POSIX} value @{text "v\<^sub>1"}
  1261   definition of POSIX values.  Given a @{text POSIX} value @{text "v\<^sub>1"}
   1158   for @{text r} and @{text s}, any other lexical value @{text
   1262   for @{text r} and @{text s}, any other lexical value @{text
   1159   "v\<^sub>2"} in @{term "LV r s"} is greater or equal to @{text
   1263   "v\<^sub>2"} in @{term "LV r s"} is greater or equal to @{text
  1160   "v\<^sub>1"}, namely:
  1264   "v\<^sub>1"}, namely:
  1161 
  1265 
  1177 
  1281 
  1178 
  1282 
  1179   \begin{itemize} 
  1283   \begin{itemize} 
  1180 
  1284 
  1181   \item[$\bullet$] Case @{text "P+L"} with @{term "s \<in> (ALT r\<^sub>1 r\<^sub>2)
  1285   \item[$\bullet$] Case @{text "P+L"} with @{term "s \<in> (ALT r\<^sub>1 r\<^sub>2)
  1182   \<rightarrow> (Left w\<^sub>1)"}: In this case @{term "v\<^sub>1 =
  1286   \<rightarrow> (Left w\<^sub>1)"}: In this case the value 
  1183   Left w\<^sub>1"} and the value @{term "v\<^sub>2"} is either of the
  1287   @{term "v\<^sub>2"} is either of the
  1184   form @{term "Left w\<^sub>2"} or @{term "Right w\<^sub>2"}. In the
  1288   form @{term "Left w\<^sub>2"} or @{term "Right w\<^sub>2"}. In the
  1185   latter case we can immediately conclude with @{term "v\<^sub>1
  1289   latter case we can immediately conclude with \mbox{@{term "v\<^sub>1
  1186   :\<sqsubseteq>val v\<^sub>2"} since a @{text Left}-value with the
  1290   :\<sqsubseteq>val v\<^sub>2"}} since a @{text Left}-value with the
  1187   same underlying string @{text s} is always smaller than a
  1291   same underlying string @{text s} is always smaller than a
  1188   @{text Right}-value.  In the former case we have @{term "w\<^sub>2
  1292   @{text Right}-value by Proposition~\ref{ordintros}\textit{(1)}.  
       
  1293   In the former case we have @{term "w\<^sub>2
  1189   \<in> LV r\<^sub>1 s"} and can use the induction hypothesis to infer
  1294   \<in> LV r\<^sub>1 s"} and can use the induction hypothesis to infer
  1190   @{term "w\<^sub>1 :\<sqsubseteq>val w\<^sub>2"}. Because @{term
  1295   @{term "w\<^sub>1 :\<sqsubseteq>val w\<^sub>2"}. Because @{term
  1191   "w\<^sub>1"} and @{term "w\<^sub>2"} have the same underlying string
  1296   "w\<^sub>1"} and @{term "w\<^sub>2"} have the same underlying string
  1192   @{text s}, we can conclude with @{term "Left w\<^sub>1
  1297   @{text s}, we can conclude with @{term "Left w\<^sub>1
  1193   :\<sqsubseteq>val Left w\<^sub>2"} by Prop ???.\smallskip
  1298   :\<sqsubseteq>val Left w\<^sub>2"} using
       
  1299   Proposition~\ref{ordintros}\textit{(2)}.\smallskip
  1194 
  1300 
  1195   \item[$\bullet$] Case @{text "P+R"} with @{term "s \<in> (ALT r\<^sub>1 r\<^sub>2)
  1301   \item[$\bullet$] Case @{text "P+R"} with @{term "s \<in> (ALT r\<^sub>1 r\<^sub>2)
  1196   \<rightarrow> (Right w\<^sub>1)"}: This case is similar to the previous
  1302   \<rightarrow> (Right w\<^sub>1)"}: This case is similar to the previous
  1197   case, except that we additionally know @{term "s \<notin> L
  1303   case, except that we additionally know @{term "s \<notin> L
  1198   r\<^sub>1"}. This is needed when @{term "v\<^sub>2"} is of the form
  1304   r\<^sub>1"}. This is needed when @{term "v\<^sub>2"} is of the form
  1199   @{term "Left w\<^sub>2"}. Since \mbox{@{term "flat v\<^sub>2 = flat
  1305   \mbox{@{term "Left w\<^sub>2"}}. Since \mbox{@{term "flat v\<^sub>2 = flat
  1200   w\<^sub>2"} @{text "= s"}} and @{term "\<Turnstile> w\<^sub>2 :
  1306   w\<^sub>2"} @{text "= s"}} and @{term "\<Turnstile> w\<^sub>2 :
  1201   r\<^sub>1"}, we can derive a contradiction with \mbox{@{term "s \<notin> L
  1307   r\<^sub>1"}, we can derive a contradiction with \mbox{@{term "s \<notin> L
  1202   r\<^sub>1"}} using
  1308   r\<^sub>1"}} using
  1203   Proposition~\ref{inhabs}. So also in this case \mbox{@{term "v\<^sub>1
  1309   Proposition~\ref{inhabs}. So also in this case \mbox{@{term "v\<^sub>1
  1204   :\<sqsubseteq>val v\<^sub>2"}}.\smallskip
  1310   :\<sqsubseteq>val v\<^sub>2"}}.\smallskip
  1205 
  1311 
  1206   \item[$\bullet$] Case @{text "PS"} with @{term "(s\<^sub>1 @ s\<^sub>2) \<in> (SEQ
  1312   \item[$\bullet$] Case @{text "PS"} with @{term "(s\<^sub>1 @
  1207   r\<^sub>1 r\<^sub>2) \<rightarrow> (Seq w\<^sub>1 w\<^sub>2)"}: We
  1313   s\<^sub>2) \<in> (SEQ r\<^sub>1 r\<^sub>2) \<rightarrow> (Seq
  1208   can assume @{term "v\<^sub>2 = Seq (u\<^sub>1) (u\<^sub>2)"} with
  1314   w\<^sub>1 w\<^sub>2)"}: We can assume @{term "v\<^sub>2 = Seq
  1209   @{term "\<Turnstile> u\<^sub>1 : r\<^sub>1"} and \mbox{@{term
  1315   (u\<^sub>1) (u\<^sub>2)"} with @{term "\<Turnstile> u\<^sub>1 :
  1210   "\<Turnstile> u\<^sub>2 : r\<^sub>2"}}. We have @{term "s\<^sub>1 @
  1316   r\<^sub>1"} and \mbox{@{term "\<Turnstile> u\<^sub>2 :
  1211   s\<^sub>2 = (flat u\<^sub>1) @ (flat u\<^sub>2)"}.  By the
  1317   r\<^sub>2"}}. We have @{term "s\<^sub>1 @ s\<^sub>2 = (flat
  1212   side-condition of the @{text PS}-rule we know that either @{term
  1318   u\<^sub>1) @ (flat u\<^sub>2)"}.  By the side-condition of the
  1213   "s\<^sub>1 = flat u\<^sub>1"} or that @{term "flat u\<^sub>1"} is a
  1319   @{text PS}-rule we know that either @{term "s\<^sub>1 = flat
  1214   strict prefix ??? of @{term "s\<^sub>1"}. In the latter case we can
  1320   u\<^sub>1"} or that @{term "flat u\<^sub>1"} is a strict prefix of
  1215   infer @{term "w\<^sub>1 :\<sqsubset>val u\<^sub>1"} by ???  and from
  1321   @{term "s\<^sub>1"}. In the latter case we can infer @{term
  1216   this @{term "v\<^sub>1 :\<sqsubseteq>val v\<^sub>2"} by ???.  In the
  1322   "w\<^sub>1 :\<sqsubset>val u\<^sub>1"} by
  1217   former case we know @{term "u\<^sub>1 \<in> LV r\<^sub>1 s\<^sub>1"}
  1323   Proposition~\ref{ordlen}\textit{(2)} and from this @{term "v\<^sub>1
  1218   and @{term "u\<^sub>2 \<in> LV r\<^sub>2 s\<^sub>2"}. With this we
  1324   :\<sqsubseteq>val v\<^sub>2"} by Proposition~\ref{ordintros}\textit{(5)}
  1219   can use the induction hypotheses to infer @{term "w\<^sub>1
  1325   (as noted above @{term "v\<^sub>1"} and @{term "v\<^sub>2"} must have the
  1220   :\<sqsubseteq>val u\<^sub>1"} and @{term "w\<^sub>2
  1326   same underlying string).
  1221   :\<sqsubseteq>val u\<^sub>2"}. By ??? we can again infer @{term
  1327   In the former case we know
  1222   "v\<^sub>1 :\<sqsubseteq>val v\<^sub>2"}.
  1328   @{term "u\<^sub>1 \<in> LV r\<^sub>1 s\<^sub>1"} and @{term
       
  1329   "u\<^sub>2 \<in> LV r\<^sub>2 s\<^sub>2"}. With this we can use the
       
  1330   induction hypotheses to infer @{term "w\<^sub>1 :\<sqsubseteq>val
       
  1331   u\<^sub>1"} and @{term "w\<^sub>2 :\<sqsubseteq>val u\<^sub>2"}. By
       
  1332   Proposition~\ref{ordintros}\textit{(4,5)} we can again infer 
       
  1333   @{term "v\<^sub>1 :\<sqsubseteq>val
       
  1334   v\<^sub>2"}.
  1223 
  1335 
  1224   \end{itemize}
  1336   \end{itemize}
  1225 
  1337 
  1226   \noindent The case for @{text "P\<star>"} is similar to the @{text PS}-case and omitted.\qed
  1338   \noindent The case for @{text "P\<star>"} is similar to the @{text PS}-case and omitted.\qed
  1227   \end{proof}
  1339   \end{proof}
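
  \noindent To see the theorem at work on a small example, consider the
  regular expression $a + a\cdot\mathbf{1}$ and the string $a$. The only
  lexical values in this case are $\textit{Left}\,(\textit{Char}\,a)$ and
  $\textit{Right}\,(\textit{Seq}\,(\textit{Char}\,a)\,\textit{Empty})$;
  the rule @{text "P+L"} makes the former the @{text POSIX} value, and it
  is indeed smaller than the latter, since a @{text Left}-value is always
  preferred over a @{text Right}-value with the same underlying string
  (Proposition~\ref{ordintros}\textit{(1)}).
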
  1228 
  1340 
  1229   \noindent This theorem shows that our @{text POSIX} value for a
  1341   \noindent This theorem shows that our @{text POSIX} value for a
  1230   regular expression @{text r} and string @{term s} is in fact a
  1342   regular expression @{text r} and string @{term s} is in fact a
  1231   minimal element of the values in @{text "LV r s"}. By ??? we also
  1343   minimal element of the values in @{text "LV r s"}. By
  1232   know that any value in @{text "LV s' r"}, with @{term "s'"} being a
  1344   Proposition~\ref{ordlen}\textit{(2)} we also know that any value in
  1233   prefix, cannot be smaller than @{text "v\<^sub>1"}. The next theorem
  1345   @{text "LV r s'"}, with @{term "s'"} being a strict prefix of @{term s}, cannot be
  1234   shows the opposite---namely any minimal element in @{term "LV r s"}
  1346   smaller than @{text "v\<^sub>1"}. The next theorem shows the
  1235   must be a @{text POSIX} value. For this it helps that we proved in
  1347   opposite---namely any minimal element in @{term "LV r s"} must be a
  1236   the previous section that whenever a string @{term "s \<in> L r"} then 
  1348   @{text POSIX} value. This can be established by induction on @{text
  1237   a corresponding @{text POSIX} value exists and is a lexical value, 
  1349   r}, but the proof can be drastically simplified by using the fact
  1238   see Theorem ??? and Lemma ???. 
  1350   from the previous section about the existence of a @{text POSIX} value
       
  1351   whenever a string @{term "s \<in> L r"}.
       
  1352 
  1239 
  1353 
  1240   \begin{theorem}
  1354   \begin{theorem}
  1241   @{thm [mode=IfThen] PosOrd_Posix[where ?v1.0="v\<^sub>1"]} 
  1355   @{thm [mode=IfThen] PosOrd_Posix[where ?v1.0="v\<^sub>1"]} 
  1242   \end{theorem}
  1356   \end{theorem}
  1243 
  1357 
  1244   \begin{proof} 
  1358   \begin{proof} 
  1245   If @{thm (prem 1) PosOrd_Posix[where ?v1.0="v\<^sub>1"]} then 
  1359   If @{thm (prem 1) PosOrd_Posix[where ?v1.0="v\<^sub>1"]} then 
  1246   @{term "s \<in> L r"} by Proposition~\ref{inhabs}. Hence by Theorem~\ref{lexercorrect}(2) 
  1360   @{term "s \<in> L r"} by Proposition~\ref{inhabs}. Hence by Theorem~\ref{lexercorrect}(2) 
  1247   there exists a
  1361   there exists a
  1248   @{text POSIX} value @{term "v\<^sub>P"} with @{term "s \<in> r \<rightarrow> v\<^sub>P"}
  1362   @{text POSIX} value @{term "v\<^sub>P"} with @{term "s \<in> r \<rightarrow> v\<^sub>P"}
  1249   and by Lemma~\ref{LVposix} we also have @{term "v\<^sub>P \<in> LV r s"}.
  1363   and by Lemma~\ref{LVposix} we also have \mbox{@{term "v\<^sub>P \<in> LV r s"}}.
  1250   By Theorem~\ref{orderone} we therefore have 
  1364   By Theorem~\ref{orderone} we therefore have 
  1251   @{term "v\<^sub>P :\<sqsubseteq>val v\<^sub>1"}. If @{term "v\<^sub>P = v\<^sub>1"} then
  1365   @{term "v\<^sub>P :\<sqsubseteq>val v\<^sub>1"}. If @{term "v\<^sub>P = v\<^sub>1"} then
  1252   we are done. Otherwise we have @{term "v\<^sub>P :\<sqsubset>val v\<^sub>1"} which 
  1366   we are done. Otherwise we have @{term "v\<^sub>P :\<sqsubset>val v\<^sub>1"}, which 
  1253   however contradicts the second assumption and we are done too.\qed
  1367   however contradicts the second assumption about @{term "v\<^sub>1"} being the smallest
       
  1368   element in @{term "LV r s"}. So we are done in this case too.\qed
  1254   \end{proof}
  1369   \end{proof}
       
  1370 
       
  1371   \noindent
       
  1372   From this we can also show 
       
  1373   that if @{term "LV r s"} is non-empty (or equivalently @{term "s \<in> L r"}) then 
       
  1374   it has a unique minimal element:
  1255 
  1375 
  1256   \begin{corollary}
  1376   \begin{corollary}
  1257   @{thm [mode=IfThen] Least_existence1}
  1377   @{thm [mode=IfThen] Least_existence1}
  1258   \end{corollary}
  1378   \end{corollary}
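
  \noindent
  For small regular expressions and strings the set @{term "LV r s"} can
  also be enumerated directly, which is convenient for experimenting with
  concrete instances of the results above. The following self-contained
  Haskell sketch (again with our own names) generates exactly the values
  that inhabit a regular expression and flatten to a given string, with
  the restriction to non-empty iterations in the star case:

    data Rexp = Zero | One | Ch Char | Alt Rexp Rexp | Sq Rexp Rexp | Star Rexp
      deriving Show

    data Val = Empty | Chr Char | Lft Val | Rgt Val | Sqv Val Val | Stars [Val]
      deriving (Eq, Show)

    flat :: Val -> String
    flat Empty      = ""
    flat (Chr c)    = [c]
    flat (Lft v)    = flat v
    flat (Rgt v)    = flat v
    flat (Sqv v w)  = flat v ++ flat w
    flat (Stars vs) = concatMap flat vs

    -- all ways of splitting a string into a prefix and a suffix
    splits :: String -> [(String, String)]
    splits s = [splitAt n s | n <- [0 .. length s]]

    -- all lexical values for r whose flattening is exactly s
    lvs :: Rexp -> String -> [Val]
    lvs Zero        _ = []
    lvs One         s = [Empty | null s]
    lvs (Ch c)      s = [Chr c | s == [c]]
    lvs (Alt r1 r2) s = map Lft (lvs r1 s) ++ map Rgt (lvs r2 s)
    lvs (Sq r1 r2)  s = [Sqv v1 v2 | (s1, s2) <- splits s,
                                     v1 <- lvs r1 s1, v2 <- lvs r2 s2]
    lvs (Star r) s
      | null s    = [Stars []]
      | otherwise = [Stars (v : vs) | (s1, s2) <- splits s, not (null s1),
                                      v <- lvs r s1, Stars vs <- lvs (Star r) s2]

    -- example:  lvs (Alt (Ch 'a') (Sq (Ch 'a') One)) "a"
    -- evaluates to  [Lft (Chr 'a'), Rgt (Sqv (Chr 'a') Empty)]

  Termination of the star case is ensured by requiring a non-empty prefix
  for every iteration, mirroring the restriction on star values in the
  definition of lexical values.
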
  1259 
  1379 
  1260   \noindent To sum up, we have shown that minimal elements according
  1380 
  1261   to the ordering by Okui and Suzuki are exactly the @{text POSIX}
  1381 
  1262   values we defined inductively in Section~\ref{posixsec} 
  1382   \noindent To sum up, we have shown that the (unique) minimal elements 
  1263 
  1383   of the ordering by Okui and Suzuki are exactly the @{text POSIX}
  1264 
  1384   values we defined inductively in Section~\ref{posixsec}. This provides
  1265    IS THE minimal element unique? We have not shown totality.
  1385   an independent confirmation that our ternary relation formalises the
       
  1386   informal POSIX rules. 
       
  1387 
  1266 *}
  1388 *}
  1267 
  1389 
  1268 section {* Optimisations *}
  1390 section {* Optimisations *}
  1269 
  1391 
  1270 text {*
  1392 text {*
  1434 
  1556 
  1435 text {*
  1557 text {*
  1436 
  1558 
  1437   A strong point in favour of
  1559   A strong point in favour of
  1438   Sulzmann and Lu's algorithm is that it can be extended in various
  1560   Sulzmann and Lu's algorithm is that it can be extended in various
  1439   ways.
  1561   ways.  If we are interested in tokenising a string, then we need to not just
  1440 
       
  1441   If we are interested in tokenising a string, then we need to not just
       
  1442   split up the string into tokens, but also ``classify'' the tokens (for
  1562   split up the string into tokens, but also ``classify'' the tokens (for
  1443   example whether a token is a keyword or an identifier). This can be done with
  1563   example whether a token is a keyword or an identifier). This can be done with
  1444   only minor modifications to the algorithm by introducing \emph{record
  1564   only minor modifications to the algorithm by introducing \emph{record
  1445   regular expressions} and \emph{record values} (for example
  1565   regular expressions} and \emph{record values} (for example
  1446   \cite{Sulzmann2014b}):
  1566   \cite{Sulzmann2014b}):
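
  As a rough illustration of the idea---in Haskell rather than Isabelle,
  with our own constructor names, and without any claim that the clauses
  coincide with the formal definitions---labels can be threaded through
  the derivative-based lexer as follows:

    data Rexp = Zero | One | Ch Char
              | Alt Rexp Rexp | Sq Rexp Rexp | Star Rexp
              | Recd String Rexp                -- a labelled sub-expression
      deriving Show

    data Val = Empty | Chr Char
             | Lft Val | Rgt Val | Sqv Val Val | Stars [Val]
             | Rec String Val                   -- a labelled sub-value
      deriving Show

    nullable :: Rexp -> Bool
    nullable Zero        = False
    nullable One         = True
    nullable (Ch _)      = False
    nullable (Alt r1 r2) = nullable r1 || nullable r2
    nullable (Sq r1 r2)  = nullable r1 && nullable r2
    nullable (Star _)    = True
    nullable (Recd _ r)  = nullable r

    der :: Char -> Rexp -> Rexp
    der _ Zero        = Zero
    der _ One         = Zero
    der c (Ch d)      = if c == d then One else Zero
    der c (Alt r1 r2) = Alt (der c r1) (der c r2)
    der c (Sq r1 r2)
      | nullable r1   = Alt (Sq (der c r1) r2) (der c r2)
      | otherwise     = Sq (der c r1) r2
    der c (Star r)    = Sq (der c r) (Star r)
    der c (Recd l r)  = Recd l (der c r)        -- labels are carried along

    mkeps :: Rexp -> Val
    mkeps One         = Empty
    mkeps (Alt r1 r2) = if nullable r1 then Lft (mkeps r1) else Rgt (mkeps r2)
    mkeps (Sq r1 r2)  = Sqv (mkeps r1) (mkeps r2)
    mkeps (Star _)    = Stars []
    mkeps (Recd l r)  = Rec l (mkeps r)
    mkeps _           = error "mkeps: not nullable"

    inj :: Rexp -> Char -> Val -> Val
    inj (Ch d)      _ Empty              = Chr d
    inj (Alt r1 _)  c (Lft v)            = Lft (inj r1 c v)
    inj (Alt _ r2)  c (Rgt v)            = Rgt (inj r2 c v)
    inj (Sq r1 _)   c (Sqv v1 v2)        = Sqv (inj r1 c v1) v2
    inj (Sq r1 _)   c (Lft (Sqv v1 v2))  = Sqv (inj r1 c v1) v2
    inj (Sq r1 r2)  c (Rgt v)            = Sqv (mkeps r1) (inj r2 c v)
    inj (Star r)    c (Sqv v (Stars vs)) = Stars (inj r c v : vs)
    inj (Recd l r)  c (Rec _ v)          = Rec l (inj r c v)
    inj _           _ _                  = error "inj: ill-formed value"

    lexer :: Rexp -> String -> Maybe Val
    lexer r []       = if nullable r then Just (mkeps r) else Nothing
    lexer r (c : cs) = fmap (inj r c) (lexer (der c r) cs)

    -- A tiny tokeniser: the keyword "if" or the one-letter identifier "a",
    -- iterated.  Running
    --   lexer (Star (Alt (Recd "key" (Sq (Ch 'i') (Ch 'f')))
    --                    (Recd "id"  (Ch 'a'))))  "ifa"
    -- yields a Stars-value whose first iteration carries the label "key"
    -- (matching "if") and whose second the label "id" (matching "a").

  The labels do not change the language of a regular expression; they are
  merely attached to the resulting values, so that the tokens and their
  classifications can be read off the final value.
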