95 expression @{term r} and character @{term c}, one has @{term "cs \<in> L(r)"} if |
95 expression @{term r} and character @{term c}, one has @{term "cs \<in> L(r)"} if |
96 and only if \mbox{@{term "s \<in> L(der c r)"}}. The beauty of Brzozowski's |
96 and only if \mbox{@{term "s \<in> L(der c r)"}}. The beauty of Brzozowski's |
97 derivatives is that they are neatly expressible in any functional language, |
97 derivatives is that they are neatly expressible in any functional language, |
98 and easily definable and reasoned about in theorem provers---the definitions |
98 and easily definable and reasoned about in theorem provers---the definitions |
99 just consist of inductive datatypes and simple recursive functions. A |
99 just consist of inductive datatypes and simple recursive functions. A |
100 completely formalised correctness proof of this matcher in for example HOL4 |
100 mechanised correctness proof of Brzozowski's matcher in for example HOL4 |
101 has been mentioned by Owens and Slind~\cite{Owens2008}. Another one in Isabelle/HOL is part |
101 has been mentioned by Owens and Slind~\cite{Owens2008}. Another one in Isabelle/HOL is part |
102 of the work by Krauss and Nipkow \cite{Krauss2011}. And another one in Coq is given |
102 of the work by Krauss and Nipkow \cite{Krauss2011}. And another one in Coq is given |
103 by Coquand and Siles \cite{Coquand2012}. |
103 by Coquand and Siles \cite{Coquand2012}. |
104 |
104 |
105 If a regular expression matches a string, then in general there is more than |
105 If a regular expression matches a string, then in general there is more than |
133 \item[$\bullet$] \emph{Priority Rule:} |
133 \item[$\bullet$] \emph{Priority Rule:} |
134 For a particular longest initial substring, the first regular expression |
134 For a particular longest initial substring, the first regular expression |
135 that can match determines the token. |
135 that can match determines the token. |
136 \end{itemize} |
136 \end{itemize} |
137 |
137 |
138 \noindent Consider for example @{text "r\<^bsub>key\<^esub>"} recognising keywords |
138 \noindent Consider for example a regular expression @{text "r\<^bsub>key\<^esub>"} for recognising keywords |
139 such as @{text "if"}, @{text "then"} and so on; and @{text "r\<^bsub>id\<^esub>"} |
139 such as @{text "if"}, @{text "then"} and so on; and @{text "r\<^bsub>id\<^esub>"} |
140 recognising identifiers (say, a single character followed by |
140 recognising identifiers (say, a single character followed by |
141 characters or numbers). Then we can form the regular expression |
141 characters or numbers). Then we can form the regular expression |
142 @{text "(r\<^bsub>key\<^esub> + r\<^bsub>id\<^esub>)\<^sup>\<star>"} and use POSIX matching to tokenise strings, |
142 @{text "(r\<^bsub>key\<^esub> + r\<^bsub>id\<^esub>)\<^sup>\<star>"} and use POSIX matching to tokenise strings, |
143 say @{text "iffoo"} and @{text "if"}. For @{text "iffoo"} we obtain |
143 say @{text "iffoo"} and @{text "if"}. For @{text "iffoo"} we obtain |
267 the star of a language and @{text "(ii)"} if @{term "s\<^sub>1"} is in a |
267 the star of a language and @{text "(ii)"} if @{term "s\<^sub>1"} is in a |
268 language and @{term "s\<^sub>2"} in the star of this language, then also @{term |
268 language and @{term "s\<^sub>2"} in the star of this language, then also @{term |
269 "s\<^sub>1 @ s\<^sub>2"} is in the star of this language. It will also be convenient |
269 "s\<^sub>1 @ s\<^sub>2"} is in the star of this language. It will also be convenient |
270 to use the following notion of a \emph{semantic derivative} (or \emph{left |
270 to use the following notion of a \emph{semantic derivative} (or \emph{left |
271 quotient}) of a language defined as |
271 quotient}) of a language defined as |
272 @{thm (lhs) Der_def} $\dn$ @{thm (rhs) Der_def}. |
272 % |
|
273 \begin{center} |
|
274 @{thm Der_def}\;. |
|
275 \end{center} |
|
276 |
|
277 \noindent |
273 For semantic derivatives we have the following equations (for example |
278 For semantic derivatives we have the following equations (for example |
274 mechanically proved in \cite{Krauss2011}): |
279 mechanically proved in \cite{Krauss2011}): |
275 |
280 % |
276 \begin{equation}\label{SemDer} |
281 \begin{equation}\label{SemDer} |
277 \begin{array}{lcl} |
282 \begin{array}{lcl} |
278 @{thm (lhs) Der_null} & \dn & @{thm (rhs) Der_null}\\ |
283 @{thm (lhs) Der_null} & \dn & @{thm (rhs) Der_null}\\ |
279 @{thm (lhs) Der_empty} & \dn & @{thm (rhs) Der_empty}\\ |
284 @{thm (lhs) Der_empty} & \dn & @{thm (rhs) Der_empty}\\ |
280 @{thm (lhs) Der_char} & \dn & @{thm (rhs) Der_char}\\ |
285 @{thm (lhs) Der_char} & \dn & @{thm (rhs) Der_char}\\ |
403 |
405 |
404 \begin{center} |
406 \begin{center} |
405 \begin{tabular}{c} |
407 \begin{tabular}{c} |
406 \\[-8mm] |
408 \\[-8mm] |
407 @{thm[mode=Axiom] Prf.intros(4)} \qquad |
409 @{thm[mode=Axiom] Prf.intros(4)} \qquad |
408 @{thm[mode=Axiom] Prf.intros(5)[of "c"]}\\[3mm] |
410 @{thm[mode=Axiom] Prf.intros(5)[of "c"]}\\[4mm] |
409 @{thm[mode=Rule] Prf.intros(2)[of "v\<^sub>1" "r\<^sub>1" "r\<^sub>2"]} \qquad |
411 @{thm[mode=Rule] Prf.intros(2)[of "v\<^sub>1" "r\<^sub>1" "r\<^sub>2"]} \qquad |
410 @{thm[mode=Rule] Prf.intros(3)[of "v\<^sub>2" "r\<^sub>1" "r\<^sub>2"]}\\[3mm] |
412 @{thm[mode=Rule] Prf.intros(3)[of "v\<^sub>2" "r\<^sub>1" "r\<^sub>2"]}\\[4mm] |
411 @{thm[mode=Rule] Prf.intros(1)[of "v\<^sub>1" "r\<^sub>1" "v\<^sub>2" "r\<^sub>2"]}\\[3mm] |
413 @{thm[mode=Rule] Prf.intros(1)[of "v\<^sub>1" "r\<^sub>1" "v\<^sub>2" "r\<^sub>2"]}\\[4mm] |
412 @{thm[mode=Axiom] Prf.intros(6)[of "r"]} \qquad |
414 @{thm[mode=Axiom] Prf.intros(6)[of "r"]} \qquad |
413 @{thm[mode=Rule] Prf.intros(7)[of "v" "r" "vs"]} |
415 @{thm[mode=Rule] Prf.intros(7)[of "v" "r" "vs"]} |
414 \end{tabular} |
416 \end{tabular} |
415 \end{center} |
417 \end{center} |
416 |
418 |
470 \caption{The two phases of the algorithm by Sulzmann \& Lu \cite{Sulzmann2014}, |
472 \caption{The two phases of the algorithm by Sulzmann \& Lu \cite{Sulzmann2014}, |
471 matching the string @{term "[a,b,c]"}. The first phase (the arrows from |
473 matching the string @{term "[a,b,c]"}. The first phase (the arrows from |
472 left to right) is \Brz's matcher building successive derivatives. If the |
474 left to right) is \Brz's matcher building successive derivatives. If the |
473 last regular expression is @{term nullable}, then the functions of the |
475 last regular expression is @{term nullable}, then the functions of the |
474 second phase are called (the top-down and right-to-left arrows): first |
476 second phase are called (the top-down and right-to-left arrows): first |
475 @{term mkeps} calculates a value witnessing |
477 @{term mkeps} calculates a value @{term "v\<^sub>4"} witnessing |
476 how the empty string has been recognised by @{term "r\<^sub>4"}. After |
478 how the empty string has been recognised by @{term "r\<^sub>4"}. After |
477 that the function @{term inj} ``injects back'' the characters of the string into |
479 that the function @{term inj} ``injects back'' the characters of the string into |
478 the values. |
480 the values. |
479 \label{Sulz}} |
481 \label{Sulz}} |
480 \end{figure} |
482 \end{figure} |
500 string. |
502 string. |
501 |
503 |
502 The most interesting idea from Sulzmann and Lu \cite{Sulzmann2014} is |
504 The most interesting idea from Sulzmann and Lu \cite{Sulzmann2014} is |
503 the construction of a value for how @{term "r\<^sub>1"} can match the |
505 the construction of a value for how @{term "r\<^sub>1"} can match the |
504 string @{term "[a,b,c]"} from the value how the last derivative, @{term |
506 string @{term "[a,b,c]"} from the value how the last derivative, @{term |
505 "r\<^sub>4"} in Fig~\ref{Sulz}, can match the empty string. Sulzmann and |
507 "r\<^sub>4"} in Fig.~\ref{Sulz}, can match the empty string. Sulzmann and |
506 Lu achieve this by stepwise ``injecting back'' the characters into the |
508 Lu achieve this by stepwise ``injecting back'' the characters into the |
507 values thus inverting the operation of building derivatives, but on the level |
509 values thus inverting the operation of building derivatives, but on the level |
508 of values. The corresponding function, called @{term inj}, takes three |
510 of values. The corresponding function, called @{term inj}, takes three |
509 arguments, a regular expression, a character and a value. For example in |
511 arguments, a regular expression, a character and a value. For example in |
510 the first (or right-most) @{term inj}-step in Fig~\ref{Sulz} the regular |
512 the first (or right-most) @{term inj}-step in Fig.~\ref{Sulz} the regular |
511 expression @{term "r\<^sub>3"}, the character @{term c} from the last |
513 expression @{term "r\<^sub>3"}, the character @{term c} from the last |
512 derivative step and @{term "v\<^sub>4"}, which is the value corresponding |
514 derivative step and @{term "v\<^sub>4"}, which is the value corresponding |
513 to the derivative regular expression @{term "r\<^sub>4"}. The result is |
515 to the derivative regular expression @{term "r\<^sub>4"}. The result is |
514 the new value @{term "v\<^sub>3"}. The final result of the algorithm is |
516 the new value @{term "v\<^sub>3"}. The final result of the algorithm is |
515 the value @{term "v\<^sub>1"}. The @{term inj} function is defined by recursion on regular |
517 the value @{term "v\<^sub>1"}. The @{term inj} function is defined by recursion on regular |
536 |
538 |
537 \noindent To better understand what is going on in this definition it |
539 \noindent To better understand what is going on in this definition it |
538 might be instructive to look first at the three sequence cases (clauses |
540 might be instructive to look first at the three sequence cases (clauses |
539 (4)--(6)). In each case we need to construct an ``injected value'' for |
541 (4)--(6)). In each case we need to construct an ``injected value'' for |
540 @{term "SEQ r\<^sub>1 r\<^sub>2"}. This must be a value of the form @{term |
542 @{term "SEQ r\<^sub>1 r\<^sub>2"}. This must be a value of the form @{term |
541 "Seq DUMMY DUMMY"}. Recall the clause of the @{text derivative}-function |
543 "Seq DUMMY DUMMY"}\,. Recall the clause of the @{text derivative}-function |
542 for sequence regular expressions: |
544 for sequence regular expressions: |
543 |
545 |
544 \begin{center} |
546 \begin{center} |
545 @{thm (lhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]} $\dn$ @{thm (rhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]} |
547 @{thm (lhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]} $\dn$ @{thm (rhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]} |
546 \end{center} |
548 \end{center} |
672 is a @{text Right}-value (see the side-condition in @{text "P+R"}). |
674 is a @{text Right}-value (see the side-condition in @{text "P+R"}). |
673 Interesting is also the rule for sequence regular expressions (@{text |
675 Interesting is also the rule for sequence regular expressions (@{text |
674 "PS"}). The first two premises state that @{term "v\<^sub>1"} and @{term "v\<^sub>2"} |
676 "PS"}). The first two premises state that @{term "v\<^sub>1"} and @{term "v\<^sub>2"} |
675 are the POSIX values for @{term "(s\<^sub>1, r\<^sub>1)"} and @{term "(s\<^sub>2, r\<^sub>2)"} |
677 are the POSIX values for @{term "(s\<^sub>1, r\<^sub>1)"} and @{term "(s\<^sub>2, r\<^sub>2)"} |
676 respectively. Consider now the third premise and note that the POSIX value |
678 respectively. Consider now the third premise and note that the POSIX value |
677 of this rule should match the string @{term "s\<^sub>1 @ s\<^sub>2"}. According to the |
679 of this rule should match the string \mbox{@{term "s\<^sub>1 @ s\<^sub>2"}}. According to the |
678 longest match rule, we want that the @{term "s\<^sub>1"} is the longest initial |
680 longest match rule, we want that the @{term "s\<^sub>1"} is the longest initial |
679 split of \mbox{@{term "s\<^sub>1 @ s\<^sub>2"}} such that @{term "s\<^sub>2"} is still recognised |
681 split of \mbox{@{term "s\<^sub>1 @ s\<^sub>2"}} such that @{term "s\<^sub>2"} is still recognised |
680 by @{term "r\<^sub>2"}. Let us assume, contrary to the third premise, that there |
682 by @{term "r\<^sub>2"}. Let us assume, contrary to the third premise, that there |
681 \emph{exist} an @{term "s\<^sub>3"} and @{term "s\<^sub>4"} such that @{term "s\<^sub>2"} |
683 \emph{exist} an @{term "s\<^sub>3"} and @{term "s\<^sub>4"} such that @{term "s\<^sub>2"} |
682 can be split up into a non-empty string @{term "s\<^sub>3"} and a possibly empty |
684 can be split up into a non-empty string @{term "s\<^sub>3"} and a possibly empty |
683 string @{term "s\<^sub>4"}. Moreover the longer string @{term "s\<^sub>1 @ s\<^sub>3"} can be |
685 string @{term "s\<^sub>4"}. Moreover the longer string @{term "s\<^sub>1 @ s\<^sub>3"} can be |
684 matched by @{text "r\<^sub>1"} and the shorter @{term "s\<^sub>4"} can still be |
686 matched by @{text "r\<^sub>1"} and the shorter @{term "s\<^sub>4"} can still be |
685 matched by @{term "r\<^sub>2"}. In this case @{term "s\<^sub>1"} would \emph{not} be the |
687 matched by @{term "r\<^sub>2"}. In this case @{term "s\<^sub>1"} would \emph{not} be the |
686 longest initial split of \mbox{@{term "s\<^sub>1 @ s\<^sub>2"}} and therefore @{term "Seq v\<^sub>1 |
688 longest initial split of \mbox{@{term "s\<^sub>1 @ s\<^sub>2"}} and therefore @{term "Seq v\<^sub>1 |
687 v\<^sub>2"} cannot be a POSIX value for @{term "(s\<^sub>1 @ s\<^sub>2, SEQ r\<^sub>1 r\<^sub>2)"}. |
689 v\<^sub>2"} cannot be a POSIX value for @{term "(s\<^sub>1 @ s\<^sub>2, SEQ r\<^sub>1 r\<^sub>2)"}. |
688 The main point is that this side-condition ensures the longest |
690 The main point is that our side-condition ensures the longest |
689 match rule is satisfied. |
691 match rule is satisfied. |
690 |
692 |
691 A similar condition is imposed on the POSIX value in the @{text |
693 A similar condition is imposed on the POSIX value in the @{text |
692 "P\<star>"}-rule. Also there we want that @{term "s\<^sub>1"} is the longest initial |
694 "P\<star>"}-rule. Also there we want that @{term "s\<^sub>1"} is the longest initial |
693 split of @{term "s\<^sub>1 @ s\<^sub>2"} and furthermore the corresponding value |
695 split of @{term "s\<^sub>1 @ s\<^sub>2"} and furthermore the corresponding value |
714 \begin{lemma}\label{Posix2} |
716 \begin{lemma}\label{Posix2} |
715 @{thm[mode=IfThen] Posix_injval} |
717 @{thm[mode=IfThen] Posix_injval} |
716 \end{lemma} |
718 \end{lemma} |
717 |
719 |
718 \begin{proof} |
720 \begin{proof} |
719 |
721 By induction on @{text r}. We explain two cases. |
720 By induction on @{text r}. Suppose @{term "r = ALT r\<^sub>1 r\<^sub>2"}. There are |
722 |
|
723 \begin{itemize} |
|
724 \item[$\bullet$] Case @{term "r = ALT r\<^sub>1 r\<^sub>2"}. There are |
721 two subcases, namely @{text "(a)"} \mbox{@{term "v = Left v'"}} and @{term |
725 two subcases, namely @{text "(a)"} \mbox{@{term "v = Left v'"}} and @{term |
722 "s \<in> der c r\<^sub>1 \<rightarrow> v'"}; and @{text "(b)"} @{term "v = Right v'"}, @{term |
726 "s \<in> der c r\<^sub>1 \<rightarrow> v'"}; and @{text "(b)"} @{term "v = Right v'"}, @{term |
723 "s \<notin> L (der c r\<^sub>1)"} and @{term "s \<in> der c r\<^sub>2 \<rightarrow> v'"}. In @{text "(a)"} we |
727 "s \<notin> L (der c r\<^sub>1)"} and @{term "s \<in> der c r\<^sub>2 \<rightarrow> v'"}. In @{text "(a)"} we |
724 know @{term "s \<in> der c r\<^sub>1 \<rightarrow> v'"}, from which we can infer @{term "(c # s) |
728 know @{term "s \<in> der c r\<^sub>1 \<rightarrow> v'"}, from which we can infer @{term "(c # s) |
725 \<in> r\<^sub>1 \<rightarrow> injval r\<^sub>1 c v'"} by induction hypothesis and hence @{term "(c # |
729 \<in> r\<^sub>1 \<rightarrow> injval r\<^sub>1 c v'"} by induction hypothesis and hence @{term "(c # |
726 s) \<in> ALT r\<^sub>1 r\<^sub>2 \<rightarrow> injval (ALT r\<^sub>1 r\<^sub>2) c (Left v')"} as needed. Similarly |
730 s) \<in> ALT r\<^sub>1 r\<^sub>2 \<rightarrow> injval (ALT r\<^sub>1 r\<^sub>2) c (Left v')"} as needed. Similarly |
727 in subcase @{text "(b)"} where, however, in addition we have to use |
731 in subcase @{text "(b)"} where, however, in addition we have to use |
728 Prop.~\ref{derprop}(2) in order to infer @{term "c # s \<notin> L r\<^sub>1"} from @{term |
732 Prop.~\ref{derprop}(2) in order to infer @{term "c # s \<notin> L r\<^sub>1"} from @{term |
729 "s \<notin> L (der c r\<^sub>1)"}. |
733 "s \<notin> L (der c r\<^sub>1)"}. |
730 |
734 |
731 Suppose @{term "r = SEQ r\<^sub>1 r\<^sub>2"}. There are three subcases: |
735 \item[$\bullet$] Case @{term "r = SEQ r\<^sub>1 r\<^sub>2"}. There are three subcases: |
732 |
736 |
733 \begin{quote} |
737 \begin{quote} |
734 \begin{description} |
738 \begin{description} |
735 \item[@{text "(a)"}] @{term "v = Left (Seq v\<^sub>1 v\<^sub>2)"} and @{term "nullable r\<^sub>1"} |
739 \item[@{text "(a)"}] @{term "v = Left (Seq v\<^sub>1 v\<^sub>2)"} and @{term "nullable r\<^sub>1"} |
736 \item[@{text "(b)"}] @{term "v = Right v\<^sub>1"} and @{term "nullable r\<^sub>1"} |
740 \item[@{text "(b)"}] @{term "v = Right v\<^sub>1"} and @{term "nullable r\<^sub>1"} |
766 Finally suppose @{term "r = STAR r\<^sub>1"}. This case is very similar to the |
770 Finally suppose @{term "r = STAR r\<^sub>1"}. This case is very similar to the |
767 sequence case, except that we need to also ensure that @{term "flat (injval r\<^sub>1 |
771 sequence case, except that we need to also ensure that @{term "flat (injval r\<^sub>1 |
768 c v\<^sub>1) \<noteq> []"}. This follows from @{term "(c # s\<^sub>1) |
772 c v\<^sub>1) \<noteq> []"}. This follows from @{term "(c # s\<^sub>1) |
769 \<in> r\<^sub>1 \<rightarrow> injval r\<^sub>1 c v\<^sub>1"} (which in turn follows from @{term "s\<^sub>1 \<in> der c |
773 \<in> r\<^sub>1 \<rightarrow> injval r\<^sub>1 c v\<^sub>1"} (which in turn follows from @{term "s\<^sub>1 \<in> der c |
770 r\<^sub>1 \<rightarrow> v\<^sub>1"} and the induction hypothesis).\qed |
774 r\<^sub>1 \<rightarrow> v\<^sub>1"} and the induction hypothesis).\qed |
|
775 \end{itemize} |
771 \end{proof} |
776 \end{proof} |
772 |
777 |
773 \noindent |
778 \noindent |
774 With Lem.~\ref{Posix2} in place, it is completely routine to establish |
779 With Lem.~\ref{Posix2} in place, it is completely routine to establish |
775 that the Sulzmann and Lu lexer satisfies our specification (returning |
780 that the Sulzmann and Lu lexer satisfies our specification (returning |
785 |
790 |
786 \begin{proof} |
791 \begin{proof} |
787 By induction on @{term s} using Lem.~\ref{lemmkeps} and \ref{Posix2}.\qed |
792 By induction on @{term s} using Lem.~\ref{lemmkeps} and \ref{Posix2}.\qed |
788 \end{proof} |
793 \end{proof} |
789 |
794 |
790 \noindent This concludes our correctness proof. Note that we have not |
795 \noindent In (2) we further know by Thm.~\ref{posixdeterm} that the |
791 changed the algorithm of Sulzmann and Lu,\footnote{All deviations we |
796 value returned by the lexer must be unique. This concludes our |
792 introduced are harmless.} but introduced our own specification for what a |
797 correctness proof. Note that we have not changed the algorithm of |
793 correct result---a POSIX value---should be. A strong point in favour of |
798 Sulzmann and Lu,\footnote{All deviations we introduced are |
794 Sulzmann and Lu's algorithm is that it can be extended in various ways. |
799 harmless.} but introduced our own specification for what a correct |
|
800 result---a POSIX value---should be. A strong point in favour of |
|
801 Sulzmann and Lu's algorithm is that it can be extended in various |
|
802 ways. |
795 |
803 |
796 *} |
804 *} |
797 |
805 |
798 section {* Extensions and Optimisations*} |
806 section {* Extensions and Optimisations*} |
799 |
807 |
841 of @{term "ALT ZERO r"}, @{term "ALT r ZERO"}, @{term "SEQ ONE r"} and |
849 of @{term "ALT ZERO r"}, @{term "ALT r ZERO"}, @{term "SEQ ONE r"} and |
842 @{term "SEQ r ONE"} to @{term r}. These simplifications can speed up the |
850 @{term "SEQ r ONE"} to @{term r}. These simplifications can speed up the |
843 algorithms considerably, as noted in \cite{Sulzmann2014}. One of the |
851 algorithms considerably, as noted in \cite{Sulzmann2014}. One of the |
844 advantages of having a simple specification and correctness proof is that |
852 advantages of having a simple specification and correctness proof is that |
845 the latter can be refined to prove the correctness of such simplification |
853 the latter can be refined to prove the correctness of such simplification |
846 steps. |
854 steps. While the simplification of regular expressions according to |
847 |
|
848 While the simplification of regular expressions according to |
|
849 rules like |
855 rules like |
850 |
856 |
851 \begin{equation}\label{Simpl} |
857 \begin{equation}\label{Simpl} |
852 \begin{array}{lcllcllcllcl} |
858 \begin{array}{lcllcllcllcl} |
853 @{term "ALT ZERO r"} & @{text "\<Rightarrow>"} & @{term r} \hspace{8mm}%\\ |
859 @{term "ALT ZERO r"} & @{text "\<Rightarrow>"} & @{term r} \hspace{8mm}%\\ |
928 is then recursively called with the simplified derivative, but before |
934 is then recursively called with the simplified derivative, but before |
929 we inject the character @{term c} into the value @{term v}, we need to rectify |
935 we inject the character @{term c} into the value @{term v}, we need to rectify |
930 @{term v} (that is construct @{term "f\<^sub>r v"}). Before we can establish the correctness |
936 @{term v} (that is construct @{term "f\<^sub>r v"}). Before we can establish the correctness |
931 of @{term "slexer"}, we need to show that simplification preserves the language |
937 of @{term "slexer"}, we need to show that simplification preserves the language |
932 and simplification preserves our POSIX relation once the value is rectified |
938 and simplification preserves our POSIX relation once the value is rectified |
933 (recall @{const "simp"} generates a regular expression / rectification function pair): |
939 (recall @{const "simp"} generates a (regular expression, rectification function) pair): |
934 |
940 |
935 \begin{lemma}\mbox{}\smallskip\\\label{slexeraux} |
941 \begin{lemma}\mbox{}\smallskip\\\label{slexeraux} |
936 \begin{tabular}{ll} |
942 \begin{tabular}{ll} |
937 (1) & @{thm L_fst_simp[symmetric]}\\ |
943 (1) & @{thm L_fst_simp[symmetric]}\\ |
938 (2) & @{thm[mode=IfThen] Posix_simp} |
944 (2) & @{thm[mode=IfThen] Posix_simp} |
1106 cannot be maintained as one descends into the induction. This is a problem |
1112 cannot be maintained as one descends into the induction. This is a problem |
1107 that occurs in a number of places in the proofs by Sulzmann and Lu. |
1113 that occurs in a number of places in the proofs by Sulzmann and Lu. |
1108 |
1114 |
1109 Although they do not give an explicit proof of the transitivity property, |
1115 Although they do not give an explicit proof of the transitivity property, |
1110 they give a closely related property about the existence of maximal |
1116 they give a closely related property about the existence of maximal |
1111 elements. They state that this can be verified by an induction on $r$. We |
1117 elements. They state that this can be verified by an induction on @{term r}. We |
1112 disagree with this as we shall show next in case of transitivity. The case |
1118 disagree with this as we shall show next in case of transitivity. The case |
1113 where the reasoning breaks down is the sequence case, say @{term "SEQ r\<^sub>1 r\<^sub>2"}. |
1119 where the reasoning breaks down is the sequence case, say @{term "SEQ r\<^sub>1 r\<^sub>2"}. |
1114 The induction hypotheses in this case are |
1120 The induction hypotheses in this case are |
1115 |
1121 |
1116 \begin{center} |
1122 \begin{center} |