lexing: comparison thys/Paper/Paper.thy

equal deleted inserted replaced

-:85766e408c66
+:42ffaca7c85e
 F_SEQ1 ("F\<^bsub>Seq1\<^esub> _ _") and
 F_SEQ2 ("F\<^bsub>Seq2\<^esub> _ _") and
 F_SEQ ("F\<^bsub>Seq\<^esub> _ _") and
 simp_SEQ ("simp\<^bsub>Seq\<^esub> _ _" [1000, 1000] 1) and
 simp_ALT ("simp\<^bsub>Alt\<^esub> _ _" [1000, 1000] 1) and
-slexer ("lexer\<^sup>+ _ _" [78,78] 77) and
+slexer ("lexer\<^sup>+" 1000) and
 ValOrd ("_ >\<^bsub>_\<^esub> _" [77,77,77] 77) and
 ValOrdEq ("_ \<ge>\<^bsub>_\<^esub> _" [77,77,77] 77)
 definition
 \noindent
 We can prove that given a string @{term s} and regular expression @{term
 r}, the POSIX value @{term v} is uniquely determined by @{term "s \<in> r \<rightarrow> v"}.
-\begin{theorem}
+\begin{theorem}\label{posixdeterm}
 @{thm[mode=IfThen] Posix_determ(1)[of _ _ "v\<^sub>1" "v\<^sub>2"]}
 \end{theorem}
 \begin{proof} By induction on the definition of @{term "s \<in> r \<rightarrow> v\<^sub>1"} and
 a case analysis of @{term "s \<in> r \<rightarrow> v\<^sub>2"}. This proof requires the
 With Lem.~\ref{Posix2} in place, it is completely routine to establish
 that the Sulzmann and Lu lexer satisfies our specification (returning
 the null value @{term "None"} iff the string is not in the language of the regular expression,
 and returning a unique POSIX value iff the string \emph{is} in the language):
-\begin{theorem}\mbox{}\smallskip\\
+\begin{theorem}\mbox{}\smallskip\\\label{lexercorrect}
 \begin{tabular}{ll}
 (1) & @{thm (lhs) lexer_correct_None} if and only if @{thm (rhs) lexer_correct_None}\\
 (2) & @{thm (lhs) lexer_correct_Some} if and only if @{thm (rhs) lexer_correct_Some}\\
 \end{tabular}
 \end{theorem}
 and then simplify the result. This gives us a simplified derivative
 @{text "r\<^sub>s"} and a rectification function @{text "f\<^sub>r"}. The lexer
 is then recursively called with the simplified derivative, but before
 we inject the character @{term c} into the value @{term v}, we need to rectify
 @{term v} (that is construct @{term "f\<^sub>r v"}). Before we can establish the correctness
-of @{const "slexer"}, we need to show that simplification preserves the language
+of @{term "slexer"}, we need to show that simplification preserves the language
-and the relation between simplification and our posix definition:
+and simplification preserves our POSIX relation once the value is rectified
+(recall @{const "simp"} generates a regular expression, rectification function pair):
-\begin{lemma}\mbox{}\smallskip\\
+\begin{lemma}\mbox{}\smallskip\\\label{slexeraux}
 \begin{tabular}{ll}
 (1) & @{thm L_fst_simp[symmetric]}\\
 (2) & @{thm[mode=IfThen] Posix_simp}
 \end{tabular}
 \end{lemma}
-\noindent
+\begin{proof}
-We can now prove that
+Both are by induction on @{text r}. There is no interesting case for the
+first statement. For the second statement of interest are the @{term "r = SEQ r\<^sub>1 r\<^sub>2"}
+and @{term "r = ALT r\<^sub>1 r\<^sub>2"} cases.
+\end{proof}
+\noindent We can now prove relatively straightforwardly that the
+optimised lexer produce the expected result:
 \begin{theorem}
 @{thm slexer_correctness}
 \end{theorem}
-\noindent
+\begin{proof} By induction on @{term s} generalising over @{term
-holds but for lack of space refer the reader to our mechanisation for details.
+r}. The case @{term "[]"} is trivial. For the cons-case suppose the
+string is of the form @{term "c # s"}. By induction hypothesis we
+know @{term "slexer r s = lexer r s"} holds for all @{term r} (in
+particular for @{term "r"} being the derivative @{term "der c
+r"}). Let @{term "r\<^sub>s"} be the simplified derivative regular expression, @{term
+"fst (simp (der c r))"}, and @{term "f\<^sub>r"} be the rectification
+function, @{term "snd (simp (der c r))"}.  We distinguish the cases
+whether (*) @{term "s \<in> L (der c r)"} or not. In the first case we
+have by Thm~\ref{lexercorrect}(2) a value @{term "v"} so that @{term
+"lexer (der c r) s = Some v"} and @{term "s \<in> der c r \<rightarrow> v"} hold.
+By Lem~\ref{slexeraux}(1) we can also infer from~(*) that @{term "s
+\<in> L r\<^sub>s"} holds.  Hence we know by Thm~\ref{lexercorrect}(2) that
+there exists a @{term "v'"} with @{term "lexer r\<^sub>s s = Some v'"} and
+@{term "s \<in> r\<^sub>s \<rightarrow> v'"}. From the latter we know by
+Lem~\ref{slexeraux}(2) that @{term "s \<in> der c r \<rightarrow> (f\<^sub>r v')"} holds.
+By the uniqueness of the POSIX relation (Thm~\ref{posixdeterm}) we
+can infer that @{term v} is equal to @{term "f\<^sub>r v'"}---that is the
+rectification function applied to @{term "v'"}
+produces the original @{term "v"}.  Now the case follows by the
+definitions of @{const lexer} and @{const slexer}.
+In the second case where @{term "s \<notin> L (der c r)"} we have that
+@{term "lexer (der c r) s = None"} by Thm~\ref{lexercorrect}(1).  We
+also know by Lem~\ref{slexeraux}(1) that @{term "s \<notin> L r\<^sub>s"}. Hence
+@{term "lexer r\<^sub>s s = None"} by Thm~\ref{lexercorrect}(1) and
+by IH then also @{term "slexer r\<^sub>s s = None"}. With this we can
+conclude in this case too.\qed
+\end{proof}
 *}
 section {* The Correctness Argument by Sulzmann and Lu\label{argu} *}
 text {*

changeset 180	42ffaca7c85e
parent 179	85766e408c66
child 181	162f112b814b