lexing: comparison thys/Journal/Paper.thy

equal deleted inserted replaced

-:e2828c4a1e23
+:d36be1e356c0
 (*<*)
 theory Paper
 imports
 "../Lexer"
 "../Simplifying"
-"../Sulzmann"
+"../Positions"
 "~~/src/HOL/Library/LaTeXsugar"
 begin
+lemma Suc_0_fold:
+"Suc 0 = 1"
+by simp
 declare [[show_question_marks = false]]
 abbreviation
 "der_syn r c \<equiv> der c r"
 abbreviation
 "ders_syn r s \<equiv> ders s r"
+abbreviation
+"nprec v1 v2 \<equiv> \<not>(v1 :\<sqsubset>val v2)"
 notation (latex output)
 If  ("(\<^raw:\textrm{>if\<^raw:}> (_)/ \<^raw:\textrm{>then\<^raw:}> (_)/ \<^raw:\textrm{>else\<^raw:}> (_))" 10) and
 Cons ("_\<^raw:\mbox{$\,$}>::\<^raw:\mbox{$\,$}>_" [75,73] 73) and
 CHAR ("_" [1000] 80) and
 ALT ("_ + _" [77,77] 78) and
 SEQ ("_ \<cdot> _" [77,77] 78) and
 STAR ("_\<^sup>\<star>" [1000] 78) and
-val.Void ("'(')" 1000) and
+val.Void ("Empty" 78) and
 val.Char ("Char _" [1000] 78) and
 val.Left ("Left _" [79] 78) and
 val.Right ("Right _" [1000] 78) and
 val.Seq ("Seq _ _" [79,79] 78) and
 val.Stars ("Stars _" [79] 78) and
 F_SEQ ("F\<^bsub>Seq\<^esub> _ _") and
 simp_SEQ ("simp\<^bsub>Seq\<^esub> _ _" [1000, 1000] 1) and
 simp_ALT ("simp\<^bsub>Alt\<^esub> _ _" [1000, 1000] 1) and
 slexer ("lexer\<^sup>+" 1000) and
+at ("_\<^raw:\mbox{$\downharpoonleft$}>\<^bsub>_\<^esub>") and
+lex_list ("_ \<prec>\<^bsub>lex\<^esub> _") and
+PosOrd ("_ \<prec>\<^bsub>_\<^esub> _" [77,77,77] 77) and
+PosOrd_ex ("_ \<prec> _" [77,77] 77) and
+PosOrd_ex1 ("_ \<^raw:\mbox{$\preccurlyeq$}> _" [77,77] 77) and
+pflat_len ("\<parallel>_\<parallel>\<^bsub>_\<^esub>") and
+nprec ("_ \<^raw:\mbox{$\not\prec$}> _" [77,77] 77)
+(*
 ValOrd ("_ >\<^bsub>_\<^esub> _" [77,77,77] 77) and
 ValOrdEq ("_ \<ge>\<^bsub>_\<^esub> _" [77,77,77] 77)
+*)
 definition
 "match r s \<equiv> nullable (ders s r)"
 (*
 If a regular expression matches a string, then in general there is more than
 one way of how the string is matched. There are two commonly used
 disambiguation strategies to generate a unique answer: one is called GREEDY
 matching \cite{Frisch2004} and the other is POSIX
-matching~\cite{Kuklewicz,Sulzmann2014,Vansummeren2006}. For example consider
+matching~\cite{POSIX,Kuklewicz,OkuiSuzuki2013,Sulzmann2014,Vansummeren2006}. For example consider
 the string @{term xy} and the regular expression \mbox{@{term "STAR (ALT
 (ALT x y) xy)"}}. Either the string can be matched in two `iterations' by
 the single letter-regular expressions @{term x} and @{term y}, or directly
 in one iteration by @{term xy}. The first case corresponds to GREEDY
 matching, which first matches with the left-most symbol and only matches the
 In the context of lexing, where an input string needs to be split up into a
 sequence of tokens, POSIX is the more natural disambiguation strategy for
 what programmers consider basic syntactic building blocks in their programs.
 These building blocks are often specified by some regular expressions, say
 @{text "r\<^bsub>key\<^esub>"} and @{text "r\<^bsub>id\<^esub>"} for recognising keywords and
-identifiers, respectively. There are two underlying (informal) rules behind
+identifiers, respectively. There are a few underlying (informal) rules behind
-tokenising a string in a POSIX fashion according to a collection of regular
+tokenising a string in a POSIX \cite{POSIX} fashion according to a collection of regular
 expressions:
 \begin{itemize}
-\item[$\bullet$] \emph{The Longest Match Rule} (or \emph{``maximal munch rule''}):
+\item[$\bullet$] \emph{The Longest Match Rule} (or \emph{``{M}aximal {M}unch {R}ule''}):
 The longest initial substring matched by any regular expression is taken as
 next token.\smallskip
 \item[$\bullet$] \emph{Priority Rule:}
-For a particular longest initial substring, the first regular expression
+For a particular longest initial substring, the first (leftmost) regular expression
-that can match determines the token.
+that can match determines the token.\smallskip
+\item[$\bullet$] \emph{Star Rule:} A subexpression repeated by ${}^\star$ shall
+not match an empty string unless this is the only match for the repetition.\smallskip
+\item[$\bullet$] \emph{Empty String Rule:} An empty string shall be considered to
+be longer than no match at all.
 \end{itemize}
 \noindent Consider for example a regular expression @{text "r\<^bsub>key\<^esub>"} for recognising keywords
 such as @{text "if"}, @{text "then"} and so on; and @{text "r\<^bsub>id\<^esub>"}
 recognising identifiers (say, a single character followed by
 r}, and to show that (once a string to be matched is chosen) there is
 a maximum element and that it is computed by their derivative-based
 algorithm. This proof idea is inspired by work of Frisch and Cardelli
 \cite{Frisch2004} on a GREEDY regular expression matching
 algorithm. However, we were not able to establish transitivity and
-totality for the ``order relation'' by Sulzmann and Lu. In Section
+totality for the ``order relation'' by Sulzmann and Lu. ??In Section
 \ref{argu} we identify some inherent problems with their approach (of
 which some of the proofs are not published in \cite{Sulzmann2014});
 perhaps more importantly, we give a simple inductive (and
 algorithm-independent) definition of what we call being a {\em POSIX
 value} for a regular expression @{term r} and a string @{term s}; we
 conclude in this case too.\qed
 \end{proof}
 *}
+section {* Ordering of Values according to Okui and Suzuki*}
+text {*
+Positions are lists of natural numbers.
+The subvalue at position @{text p}, written @{term "at v p"}, is defined
+\begin{center}
+\begin{tabular}{r@ {\hspace{0mm}}lcl}
+@{term v} &  @{text "\<downharpoonleft>\<^bsub>[]\<^esub>"} & @{text "\<equiv>"}& @{thm (rhs) at.simps(1)}\\
+@{term "Left v"} & @{text "\<downharpoonleft>\<^bsub>0::ps\<^esub>"} & @{text "\<equiv>"}& @{thm (rhs) at.simps(2)}\\
+@{term "Right v"} & @{text "\<downharpoonleft>\<^bsub>1::ps\<^esub>"} & @{text "\<equiv>"} &
+@{thm (rhs) at.simps(3)[simplified Suc_0_fold]}\\
+@{term "Seq v\<^sub>1 v\<^sub>2"} & @{text "\<downharpoonleft>\<^bsub>0::ps\<^esub>"} & @{text "\<equiv>"} &
+@{thm (rhs) at.simps(4)[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2"]} \\
+@{term "Seq v\<^sub>1 v\<^sub>2"} & @{text "\<downharpoonleft>\<^bsub>1::ps\<^esub>"}
+& @{text "\<equiv>"} &
+@{thm (rhs) at.simps(5)[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2", simplified Suc_0_fold]} \\
+@{term "Stars vs"} & @{text "\<downharpoonleft>\<^bsub>n::ps\<^esub>"} & @{text "\<equiv>"}& @{thm (rhs) at.simps(6)}\\
+\end{tabular}
+\end{center}
+\begin{center}
+\begin{tabular}{lcl}
+@{thm (lhs) Pos.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) Pos.simps(1)}\\
+@{thm (lhs) Pos.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) Pos.simps(2)}\\
+@{thm (lhs) Pos.simps(3)} & @{text "\<equiv>"} & @{thm (rhs) Pos.simps(3)}\\
+@{thm (lhs) Pos.simps(4)} & @{text "\<equiv>"} & @{thm (rhs) Pos.simps(4)}\\
+@{thm (lhs) Pos.simps(5)[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2"]}
+& @{text "\<equiv>"}
+& @{thm (rhs) Pos.simps(5)[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2"]}\\
+@{thm (lhs) Pos_stars} & @{text "\<equiv>"} & @{thm (rhs) Pos_stars}\\
+\end{tabular}
+\end{center}
+@{thm pflat_len_def}
+\begin{center}
+\begin{tabular}{ccc}
+@{thm [mode=Axiom] lex_list.intros(1)}\hspace{1cm} &
+@{thm [mode=Rule] lex_list.intros(2)[where ?ps1.0="ps\<^sub>1" and ?ps2.0="ps\<^sub>2"]}\hspace{1cm} &
+@{thm [mode=Rule] lex_list.intros(3)[where ?p1.0="p\<^sub>1" and ?p2.0="p\<^sub>2" and
+?ps1.0="ps\<^sub>1" and ?ps2.0="ps\<^sub>2"]}
+\end{tabular}
+\end{center}
+Main definition by Okui and Suzuki.
+\begin{center}
+\begin{tabular}{ccl}
+@{thm (lhs) PosOrd_def[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2"]} &
+@{text "\<equiv>"} &
+@{term "pflat_len v\<^sub>1 p > pflat_len v\<^sub>2 p"}   and\\
+& & @{term "(\<forall>q \<in> Pos v\<^sub>1 \<union> Pos v\<^sub>2. q \<sqsubset>lex p --> pflat_len v\<^sub>1 q = pflat_len v\<^sub>2 q)"}
+\end{tabular}
+\end{center}
+@{thm PosOrd_ex_def[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2"]}
+@{thm PosOrd_ex1_def[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2"]}
+Lemma 1
+@{thm [mode=IfThen] PosOrd_shorterE[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2"]}
+but in the other direction only
+@{thm [mode=IfThen] PosOrd_shorterI[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2"]}
+Lemma Transitivity:
+@{thm [mode=IfThen] PosOrd_trans[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2" and ?v3.0="v\<^sub>3"]}
+*}
+text {*
+Theorems:
+@{thm [mode=IfThen] Posix_CPT}
+@{thm [mode=IfThen] Posix_PosOrd[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2"]}
+Corrollary from the last one
+@{thm [mode=IfThen] Posix_PosOrd_stronger[where ?v1.0="v\<^sub>1" and ?v2.0="v\<^sub>2"]}
+Theorem
+@{thm [mode=IfThen] PosOrd_Posix[where ?v1.0="v\<^sub>1"]}
+*}
 section {* The Correctness Argument by Sulzmann and Lu\label{argu} *}
 text {*
 %  \newcommand{\greedy}{\succcurlyeq_{gr}}
 \newcommand{\posix}{>}
 final form, we make no comment thereon, preferring to give general reasons
 for our belief that the approach of \cite{Sulzmann2014} is problematic.
 Their central definition is an ``ordering relation'' defined by the
 rules (slightly adapted to fit our notation):
-\begin{center}
+??
-\begin{tabular}{@ {}c@ {\hspace{4mm}}c@ {}}
-@{thm[mode=Rule] C2[of "v\<^sub>1" "r\<^sub>1" "v\<^sub>1\<iota>" "v\<^sub>2" "r\<^sub>2" "v\<^sub>2\<iota>"]}\,(C2) &
-@{thm[mode=Rule] C1[of "v\<^sub>2" "r\<^sub>2" "v\<^sub>2\<iota>" "v\<^sub>1" "r\<^sub>1"]}\,(C1)\smallskip\\
-@{thm[mode=Rule] A1[of "v\<^sub>1" "v\<^sub>2" "r\<^sub>1" "r\<^sub>2"]}\,(A1) &
-@{thm[mode=Rule] A2[of "v\<^sub>2" "v\<^sub>1" "r\<^sub>1" "r\<^sub>2"]}\,(A2)\smallskip\\
-@{thm[mode=Rule] A3[of "v\<^sub>1" "r\<^sub>2" "v\<^sub>2" "r\<^sub>1"]}\,(A3) &
-@{thm[mode=Rule] A4[of "v\<^sub>1" "r\<^sub>1" "v\<^sub>2" "r\<^sub>2"]}\,(A4)\smallskip\\
-@{thm[mode=Rule] K1[of "v" "vs" "r"]}\,(K1) &
-@{thm[mode=Rule] K2[of "v" "vs" "r"]}\,(K2)\smallskip\\
-@{thm[mode=Rule] K3[of "v\<^sub>1" "r" "v\<^sub>2" "vs\<^sub>1" "vs\<^sub>2"]}\,(K3) &
-@{thm[mode=Rule] K4[of "vs\<^sub>1" "r" "vs\<^sub>2" "v"]}\,(K4)
-\end{tabular}
-\end{center}
 \noindent The idea behind the rules (A1) and (A2), for example, is that a
 @{text Left}-value is bigger than a @{text Right}-value, if the underlying
 string of the @{text Left}-value is longer or of equal length to the
 underlying string of the @{text Right}-value. The order is reversed,
 string is. This seems to be only a very minor difference, but it has
 drastic consequences in terms of what properties both orderings enjoy.
 What is interesting for our purposes is that the properties reflexivity,
 totality and transitivity for this GREEDY ordering can be proved
 relatively easily by induction.
-These properties of GREEDY, however, do not transfer to the POSIX
-``ordering'' by Sulzmann and Lu, which they define as @{text "v\<^sub>1 \<ge>\<^sub>r v\<^sub>2"}.
-To start with, @{text "v\<^sub>1 \<ge>\<^sub>r v\<^sub>2"} is
-not defined inductively, but as $($@{term "v\<^sub>1 = v\<^sub>2"}$)$ $\vee$ $($@{term "(v\<^sub>1 >r
-v\<^sub>2) \<and> (flat v\<^sub>1 = flat (v\<^sub>2::val))"}$)$. This means that @{term "v\<^sub>1
->(r::rexp) (v\<^sub>2::val)"} does not necessarily imply @{term "v\<^sub>1 \<ge>(r::rexp)
-(v\<^sub>2::val)"}. Moreover, transitivity does not hold in the ``usual''
-formulation, for example:
-\begin{falsehood}
-Suppose @{term "\<turnstile> v\<^sub>1 : r"}, @{term "\<turnstile> v\<^sub>2 : r"} and @{term "\<turnstile> v\<^sub>3 : r"}.
-If @{term "v\<^sub>1 >(r::rexp) (v\<^sub>2::val)"} and @{term "v\<^sub>2 >(r::rexp) (v\<^sub>3::val)"}
-then @{term "v\<^sub>1 >(r::rexp) (v\<^sub>3::val)"}.
-\end{falsehood}
-\noindent If formulated in this way, then there are various counter
-examples: For example let @{term r} be @{text "a + ((a + a)\<cdot>(a + \<zero>))"}
-then the @{term "v\<^sub>1"}, @{term "v\<^sub>2"} and @{term "v\<^sub>3"} below are values
-of @{term r}:
-\begin{center}
-\begin{tabular}{lcl}
-@{term "v\<^sub>1"} & $=$ & @{term "Left(Char a)"}\\
-@{term "v\<^sub>2"} & $=$ & @{term "Right(Seq (Left(Char a)) (Right Void))"}\\
-@{term "v\<^sub>3"} & $=$ & @{term "Right(Seq (Right(Char a)) (Left(Char a)))"}
-\end{tabular}
-\end{center}
-\noindent Moreover @{term "v\<^sub>1 >(r::rexp) v\<^sub>2"} and @{term "v\<^sub>2 >(r::rexp)
-v\<^sub>3"}, but \emph{not} @{term "v\<^sub>1 >(r::rexp) v\<^sub>3"}! The reason is that
-although @{term "v\<^sub>3"} is a @{text "Right"}-value, it can match a longer
-string, namely @{term "flat v\<^sub>3 = [a,a]"}, while @{term "flat v\<^sub>1"} (and
-@{term "flat v\<^sub>2"}) matches only @{term "[a]"}. So transitivity in this
-formulation does not hold---in this example actually @{term "v\<^sub>3
->(r::rexp) v\<^sub>1"}!
-Sulzmann and Lu ``fix'' this problem by weakening the transitivity
-property. They require in addition that the underlying strings are of the
-same length. This excludes the counter example above and any
-counter-example we were able to find (by hand and by machine). Thus the
-transitivity lemma should be formulated as:
-\begin{conject}
-Suppose @{term "\<turnstile> v\<^sub>1 : r"}, @{term "\<turnstile> v\<^sub>2 : r"} and @{term "\<turnstile> v\<^sub>3 : r"},
-and also @{text "|v\<^sub>1| = |v\<^sub>2| = |v\<^sub>3|"}.\\
-If @{term "v\<^sub>1 >(r::rexp) (v\<^sub>2::val)"} and @{term "v\<^sub>2 >(r::rexp) (v\<^sub>3::val)"}
-then @{term "v\<^sub>1 >(r::rexp) (v\<^sub>3::val)"}.
-\end{conject}
-\noindent While we agree with Sulzmann and Lu that this property
-probably(!) holds, proving it seems not so straightforward: although one
-begins with the assumption that the values have the same flattening, this
-cannot be maintained as one descends into the induction. This is a problem
-that occurs in a number of places in the proofs by Sulzmann and Lu.
-Although they do not give an explicit proof of the transitivity property,
-they give a closely related property about the existence of maximal
-elements. They state that this can be verified by an induction on @{term r}. We
-disagree with this as we shall show next in case of transitivity. The case
-where the reasoning breaks down is the sequence case, say @{term "SEQ r\<^sub>1 r\<^sub>2"}.
-The induction hypotheses in this case are
-\begin{center}
-\begin{tabular}{@ {}c@ {\hspace{10mm}}c@ {}}
-\begin{tabular}{@ {}l@ {\hspace{-7mm}}l@ {}}
-IH @{term "r\<^sub>1"}:\\
-@{text "\<forall> v\<^sub>1, v\<^sub>2, v\<^sub>3."} \\
-& @{term "\<turnstile> v\<^sub>1 : r\<^sub>1"}\;@{text "\<and>"}
-@{term "\<turnstile> v\<^sub>2 : r\<^sub>1"}\;@{text "\<and>"}
-@{term "\<turnstile> v\<^sub>3 : r\<^sub>1"}\\
-& @{text "\<and>"} @{text "|v\<^sub>1| = |v\<^sub>2| = |v\<^sub>3|"}\\
-& @{text "\<and>"} @{term "v\<^sub>1 >(r\<^sub>1::rexp) v\<^sub>2 \<and> v\<^sub>2 >(r\<^sub>1::rexp) v\<^sub>3"}\medskip\\
-& $\Rightarrow$ @{term "v\<^sub>1 >(r\<^sub>1::rexp) v\<^sub>3"}
-\end{tabular} &
-\begin{tabular}{@ {}l@ {\hspace{-7mm}}l@ {}}
-IH @{term "r\<^sub>2"}:\\
-@{text "\<forall> v\<^sub>1, v\<^sub>2, v\<^sub>3."}\\
-& @{term "\<turnstile> v\<^sub>1 : r\<^sub>2"}\;@{text "\<and>"}
-@{term "\<turnstile> v\<^sub>2 : r\<^sub>2"}\;@{text "\<and>"}
-@{term "\<turnstile> v\<^sub>3 : r\<^sub>2"}\\
-& @{text "\<and>"} @{text "|v\<^sub>1| = |v\<^sub>2| = |v\<^sub>3|"}\\
-& @{text "\<and>"} @{term "v\<^sub>1 >(r\<^sub>2::rexp) v\<^sub>2 \<and> v\<^sub>2 >(r\<^sub>2::rexp) v\<^sub>3"}\medskip\\
-& $\Rightarrow$ @{term "v\<^sub>1 >(r\<^sub>2::rexp) v\<^sub>3"}
-\end{tabular}
-\end{tabular}
-\end{center}
-\noindent We can assume that
-%
-\begin{equation}
-@{term "(Seq (v\<^sub>1\<^sub>l) (v\<^sub>1\<^sub>r)) >(SEQ r\<^sub>1 r\<^sub>2) (Seq (v\<^sub>2\<^sub>l) (v\<^sub>2\<^sub>r))"}
-\qquad\textrm{and}\qquad
-@{term "(Seq (v\<^sub>2\<^sub>l) (v\<^sub>2\<^sub>r)) >(SEQ r\<^sub>1 r\<^sub>2) (Seq (v\<^sub>3\<^sub>l) (v\<^sub>3\<^sub>r))"}
-\label{assms}
-\end{equation}
-\noindent hold, and furthermore that the values have equal length, namely:
-%
-\begin{equation}
-@{term "flat (Seq (v\<^sub>1\<^sub>l) (v\<^sub>1\<^sub>r)) = flat (Seq (v\<^sub>2\<^sub>l) (v\<^sub>2\<^sub>r))"}
-\qquad\textrm{and}\qquad
-@{term "flat (Seq (v\<^sub>2\<^sub>l) (v\<^sub>2\<^sub>r)) = flat (Seq (v\<^sub>3\<^sub>l) (v\<^sub>3\<^sub>r))"}
-\label{lens}
-\end{equation}
-\noindent We need to show that @{term "(Seq (v\<^sub>1\<^sub>l) (v\<^sub>1\<^sub>r)) >(SEQ r\<^sub>1 r\<^sub>2)
-(Seq (v\<^sub>3\<^sub>l) (v\<^sub>3\<^sub>r))"} holds. We can proceed by analysing how the
-assumptions in \eqref{assms} have arisen. There are four cases. Let us
-assume we are in the case where we know
-\[
-@{term "v\<^sub>1\<^sub>l >(r\<^sub>1::rexp) v\<^sub>2\<^sub>l"}
-\qquad\textrm{and}\qquad
-@{term "v\<^sub>2\<^sub>l >(r\<^sub>1::rexp) v\<^sub>3\<^sub>l"}
-\]
-\noindent and also know the corresponding inhabitation judgements. This is
-exactly a case where we would like to apply the induction hypothesis
-IH~$r_1$. But we cannot! We still need to show that @{term "flat (v\<^sub>1\<^sub>l) =
-flat(v\<^sub>2\<^sub>l)"} and @{term "flat(v\<^sub>2\<^sub>l) = flat(v\<^sub>3\<^sub>l)"}. We know from
-\eqref{lens} that the lengths of the sequence values are equal, but from
-this we cannot infer anything about the lengths of the component values.
-Indeed in general they will be unequal, that is
-\[
-@{term "flat(v\<^sub>1\<^sub>l) \<noteq> flat(v\<^sub>2\<^sub>l)"}
-\qquad\textrm{and}\qquad
-@{term "flat(v\<^sub>1\<^sub>r) \<noteq> flat(v\<^sub>2\<^sub>r)"}
-\]
-\noindent but still \eqref{lens} will hold. Now we are stuck, since the IH
-does not apply. As said, this problem where the induction hypothesis does
-not apply arises in several places in the proof of Sulzmann and Lu, not
-just for proving transitivity.
 *}
 section {* Conclusion *}
 text {*

changeset 265	d36be1e356c0
parent 218	16af5b8bd285
child 266	fff2e1b40dfc