thys/Paper/Paper.thy
changeset 172 cdc0bdcfba3f
parent 171 91647a8d84a3
child 173 80fe81a28a52
equal deleted inserted replaced
171:91647a8d84a3 172:cdc0bdcfba3f
    60   ValOrdEq ("_ \<ge>\<^bsub>_\<^esub> _" [77,77,77] 77)
    60   ValOrdEq ("_ \<ge>\<^bsub>_\<^esub> _" [77,77,77] 77)
    61 
    61 
    62 definition 
    62 definition 
    63   "match r s \<equiv> nullable (ders s r)"
    63   "match r s \<equiv> nullable (ders s r)"
    64 
    64 
       
    65 (*
       
    66 comments not implemented
       
    67 
       
    68 p9. "None is returned, indicating an error is raised" : there is no error
       
    69 
       
    70 p9. The condtion "not exists s3 s4..." appears often enough (in particular in
       
    71 the proof of Lemma 3) to warrant a definition.
       
    72 
       
    73 p9. "The POSIX value v" : at this point the existence of a POSIX value is yet
       
    74 to be shown
       
    75 
       
    76 p10. (proof Lemma 3) : separating the cases with description/itemize would greatly
       
    77 improve readability
       
    78 
       
    79 p11. Theorem 2(2) : Stressing the uniqueness is strange, given that it follows
       
    80 trivially from the fact that "lexer" is a function. Maybe state the existence of
       
    81 a unique POSIX value as corollary.
       
    82 
       
    83 p14. Proposition 3 is evidently false and Proposition 4 is a conjecture.
       
    84 I find it confusing that Propositions 1 and 2, which are proven facts, are also called propositions.
       
    85 
       
    86 *)
       
    87 
    65 (*>*)
    88 (*>*)
    66 
    89 
    67 section {* Introduction *}
    90 section {* Introduction *}
    68 
    91 
    69 
    92 
   412   @{term b} and @{term c}). We then use @{const nullable} to find out
   435   @{term b} and @{term c}). We then use @{const nullable} to find out
   413   whether the resulting derivative regular expression @{term "r\<^sub>4"}
   436   whether the resulting derivative regular expression @{term "r\<^sub>4"}
   414   can match the empty string. If yes, we call the function @{const mkeps}
   437   can match the empty string. If yes, we call the function @{const mkeps}
   415   that produces a value @{term "v\<^sub>4"} for how @{term "r\<^sub>4"} can
   438   that produces a value @{term "v\<^sub>4"} for how @{term "r\<^sub>4"} can
   416   match the empty string (taking into account the POSIX constraints in case
   439   match the empty string (taking into account the POSIX constraints in case
   417   there are several ways). This functions is defined by the clauses:
   440   there are several ways). This function is defined by the clauses:
   418 
   441 
   419 \begin{figure}[t]
   442 \begin{figure}[t]
   420 \begin{center}
   443 \begin{center}
   421 \begin{tikzpicture}[scale=2,node distance=1.3cm,
   444 \begin{tikzpicture}[scale=2,node distance=1.3cm,
   422                     every node/.style={minimum size=7mm}]
   445                     every node/.style={minimum size=7mm}]
   517   @{thm (lhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]} $\dn$ @{thm (rhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]}
   540   @{thm (lhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]} $\dn$ @{thm (rhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]}
   518   \end{center}
   541   \end{center}
   519 
   542 
   520   \noindent Consider first the @{text "else"}-branch where the derivative is @{term
   543   \noindent Consider first the @{text "else"}-branch where the derivative is @{term
   521   "SEQ (der c r\<^sub>1) r\<^sub>2"}. The corresponding value must therefore
   544   "SEQ (der c r\<^sub>1) r\<^sub>2"}. The corresponding value must therefore
   522   be the form @{term "Seq v\<^sub>1 v\<^sub>2"}, which matches the left-hand
   545   be of the form @{term "Seq v\<^sub>1 v\<^sub>2"}, which matches the left-hand
   523   side in clause~(4) of @{term inj}. In the @{text "if"}-branch the derivative is an
   546   side in clause~(4) of @{term inj}. In the @{text "if"}-branch the derivative is an
   524   alternative, namely @{term "ALT (SEQ (der c r\<^sub>1) r\<^sub>2) (der c
   547   alternative, namely @{term "ALT (SEQ (der c r\<^sub>1) r\<^sub>2) (der c
   525   r\<^sub>2)"}. This means we either have to consider a @{text Left}- or
   548   r\<^sub>2)"}. This means we either have to consider a @{text Left}- or
   526   @{text Right}-value. In case of the @{text Left}-value we know further it
   549   @{text Right}-value. In case of the @{text Left}-value we know further it
   527   must be a value for a sequence regular expression. Therefore the pattern
   550   must be a value for a sequence regular expression. Therefore the pattern
   557   \end{tabular}
   580   \end{tabular}
   558   \end{lemma}
   581   \end{lemma}
   559 
   582 
   560   \begin{proof}
   583   \begin{proof}
   561   Both properties are by routine inductions: the first one can, for example,
   584   Both properties are by routine inductions: the first one can, for example,
   562   be proved by an induction over the definition of @{term derivatives}; the second by
   585   be proved by induction over the definition of @{term derivatives}; the second by
   563   an induction on @{term r}. There are no interesting cases.\qed
   586   an induction on @{term r}. There are no interesting cases.\qed
   564   \end{proof}
   587   \end{proof}
   565 
   588 
   566   Having defined the @{const mkeps} and @{text inj} function we can extend
   589   Having defined the @{const mkeps} and @{text inj} function we can extend
   567   \Brz's matcher so that a [lexical] value is constructed (assuming the
   590   \Brz's matcher so that a [lexical] value is constructed (assuming the
   682   \noindent
   705   \noindent
   683   The central lemma for our POSIX relation is that the @{text inj}-function
   706   The central lemma for our POSIX relation is that the @{text inj}-function
   684   preserves POSIX values.
   707   preserves POSIX values.
   685 
   708 
   686   \begin{lemma}\label{Posix2}
   709   \begin{lemma}\label{Posix2}
   687   @{thm[mode=IfThen] Posix2_roy_version}
   710   @{thm[mode=IfThen] Posix_injval}
   688   \end{lemma}
   711   \end{lemma}
   689 
   712 
   690   \begin{proof}
   713   \begin{proof}
   691 
   714 
   692   By induction on @{text r}. Suppose @{term "r = ALT r\<^sub>1 r\<^sub>2"}. There are
   715   By induction on @{text r}. Suppose @{term "r = ALT r\<^sub>1 r\<^sub>2"}. There are
   720   \[@{term "\<not> (\<exists>s\<^sub>3 s\<^sub>4. s\<^sub>3 \<noteq> [] \<and> s\<^sub>3 @ s\<^sub>4 = s\<^sub>2 \<and> (c # s\<^sub>1) @ s\<^sub>3 \<in> L r\<^sub>1 \<and> s\<^sub>4 \<in> L r\<^sub>2)"}\]
   743   \[@{term "\<not> (\<exists>s\<^sub>3 s\<^sub>4. s\<^sub>3 \<noteq> [] \<and> s\<^sub>3 @ s\<^sub>4 = s\<^sub>2 \<and> (c # s\<^sub>1) @ s\<^sub>3 \<in> L r\<^sub>1 \<and> s\<^sub>4 \<in> L r\<^sub>2)"}\]
   721 
   744 
   722   \noindent We can use the induction hypothesis for @{text "r\<^sub>1"} to obtain
   745   \noindent We can use the induction hypothesis for @{text "r\<^sub>1"} to obtain
   723   @{term "(c # s\<^sub>1) \<in> r\<^sub>1 \<rightarrow> injval r\<^sub>1 c v\<^sub>1"}. Putting this all together allows us to infer
   746   @{term "(c # s\<^sub>1) \<in> r\<^sub>1 \<rightarrow> injval r\<^sub>1 c v\<^sub>1"}. Putting this all together allows us to infer
   724   @{term "((c # s\<^sub>1) @ s\<^sub>2) \<in> SEQ r\<^sub>1 r\<^sub>2 \<rightarrow> Seq (injval r\<^sub>1 c v\<^sub>1) v\<^sub>2"}. The case @{text "(c)"}
   747   @{term "((c # s\<^sub>1) @ s\<^sub>2) \<in> SEQ r\<^sub>1 r\<^sub>2 \<rightarrow> Seq (injval r\<^sub>1 c v\<^sub>1) v\<^sub>2"}. The case @{text "(c)"}
   725   is similarly.
   748   is similar.
   726 
   749 
   727   For @{text "(b)"} we know @{term "s \<in> der c r\<^sub>2 \<rightarrow> v\<^sub>1"} and 
   750   For @{text "(b)"} we know @{term "s \<in> der c r\<^sub>2 \<rightarrow> v\<^sub>1"} and 
   728   @{term "s\<^sub>1 @ s\<^sub>2 \<notin> L (SEQ (der c r\<^sub>1) r\<^sub>2)"}. From the former
   751   @{term "s\<^sub>1 @ s\<^sub>2 \<notin> L (SEQ (der c r\<^sub>1) r\<^sub>2)"}. From the former
   729   we have @{term "(c # s) \<in> r\<^sub>2 \<rightarrow> (injval r\<^sub>2 c v\<^sub>1)"} by induction hypothesis
   752   we have @{term "(c # s) \<in> r\<^sub>2 \<rightarrow> (injval r\<^sub>2 c v\<^sub>1)"} by induction hypothesis
   730   for @{term "r\<^sub>2"}. From the latter we can infer
   753   for @{term "r\<^sub>2"}. From the latter we can infer
   758   \begin{proof}
   781   \begin{proof}
   759   By induction on @{term s} using Lem.~\ref{lemmkeps} and \ref{Posix2}.\qed  
   782   By induction on @{term s} using Lem.~\ref{lemmkeps} and \ref{Posix2}.\qed  
   760   \end{proof}
   783   \end{proof}
   761 
   784 
   762   \noindent This concludes our correctness proof. Note that we have not
   785   \noindent This concludes our correctness proof. Note that we have not
   763   changed the algorithm of Sulzmann and Lu,\footnote{Any deviation we
   786   changed the algorithm of Sulzmann and Lu,\footnote{All deviations we
   764   introduced is harmless.} but introduced our own specification for what a
   787   introduced are harmless.} but introduced our own specification for what a
   765   correct result---a POSIX value---should be. A strong point in favour of
   788   correct result---a POSIX value---should be. A strong point in favour of
   766   Sulzmann and Lu's algorithm is that it can be extended in various ways.
   789   Sulzmann and Lu's algorithm is that it can be extended in various ways.
   767 
   790 
   768 *}
   791 *}
   769 
   792 
   889                      & & $|$ @{term "Some v"} @{text "\<Rightarrow>"} @{text "Some (inj r c (f\<^sub>r v))"}                          
   912                      & & $|$ @{term "Some v"} @{text "\<Rightarrow>"} @{text "Some (inj r c (f\<^sub>r v))"}                          
   890   \end{tabular}
   913   \end{tabular}
   891   \end{center}
   914   \end{center}
   892 
   915 
   893   \noindent
   916   \noindent
   894   In the second clause we first calculate the derivative @{text "r \\ c"}
   917   In the second clause we first calculate the derivative @{term "der c r"}
   895   and then simplify the result. This gives us a simplified derivative
   918   and then simplify the result. This gives us a simplified derivative
   896   @{text "r\<^sub>s"} and a rectification function @{text "f\<^sub>r"}. The lexer
   919   @{text "r\<^sub>s"} and a rectification function @{text "f\<^sub>r"}. The lexer
   897   is then recursively called with the simplified derivative, but before
   920   is then recursively called with the simplified derivative, but before
   898   we inject the character @{term c} into value, we need to rectify
   921   we inject the character @{term c} into the value, we need to rectify
   899   it (that is construct @{term "f\<^sub>r v"}). We can prove that
   922   it (that is construct @{term "f\<^sub>r v"}). We can prove that
   900 
   923 
   901   \begin{lemma}
   924   \begin{theorem}
   902   @{thm slexer_correctness}
   925   @{thm slexer_correctness}
   903   \end{lemma}
   926   \end{theorem}
   904 
   927 
   905   \noindent
   928   \noindent
   906   holds but for lack of space refer the reader to our mechanisation for details.
   929   holds but for lack of space refer the reader to our mechanisation for details.
   907 
   930 
   908 *}
   931 *}
   909 
   932 
   910 section {* The Correctness Argument by Sulzmmann and Lu\label{argu} *}
   933 section {* The Correctness Argument by Sulzmann and Lu\label{argu} *}
   911 
   934 
   912 text {*
   935 text {*
   913 %  \newcommand{\greedy}{\succcurlyeq_{gr}}
   936 %  \newcommand{\greedy}{\succcurlyeq_{gr}}
   914  \newcommand{\posix}{>}
   937  \newcommand{\posix}{>}
   915 
   938