lexing: comparison thys2/Journal/Paper.thy

equal deleted inserted replaced

-:f65444d29e74
+:78cc255e286f
 @{thm [mode=IfThen] blexer_correctness}
 \end{lemma}
 However what is not certain is whether we can add simplification
 to the bit-coded algorithm, without breaking the correct lexing output.
-This might sound trivial in the case of producing a YES/NO answer,
-but once we require a lexing output to be produced (which is required
-in applications like compiler front-end, malicious attack domain extraction,
+The reason that we do need to add a simplification phase
-etc.), it is not straightforward if we still extract what is needed according
+after each derivative step of  $\textit{blexer}$ is
-to the POSIX standard.
+because it produces intermediate
+regular expressions that can grow exponentially.
-By simplification, we mean specifically the following rules:
+For example, the regular expression $(a+aa)^*$ after taking
+derivative against just 10 $a$s will have size 8192.
-\begin{center}
-\begin{tabular}{lcl}
+%TODO: add figure for this?
-@{thm[mode=Axiom] rrewrite.intros(1)[of "bs" "r\<^sub>2"]}\\
-@{thm[mode=Axiom] rrewrite.intros(2)[of "bs" "r\<^sub>1"]}\\
+\begin{lemma}
-@{thm[mode=Axiom] rrewrite.intros(3)[of  "bs" "bs\<^sub>1" "r\<^sub>1"]}\\
+@{thm blexer_simp_def}
-@{thm[mode=Rule] rrewrite.intros(4)[of  "r\<^sub>1" "r\<^sub>2" "bs" "r\<^sub>3"]}\\
+\end{lemma}
-@{thm[mode=Rule] rrewrite.intros(5)[of "bs" "r\<^sub>2"]}\\
-@{thm[mode=Rule] rrewrite.intros(6)[of "bs" "r\<^sub>1"]}\\
-@{thm[mode=Axiom] rrewrite.intros(7)[of "bs" "r\<^sub>2"]}\\
-@{thm[mode=Axiom] rrewrite.intros(8)[of "bs" "r\<^sub>1"]}\\
-@{thm[mode=Axiom] rrewrite.intros(9)[of "bs" "r\<^sub>2"]}\\
-@{thm[mode=Axiom] rrewrite.intros(10)[of "bs" "r\<^sub>1"]}\\
-@{thm[mode=Axiom] rrewrite.intros(11)[of "bs" ]}\\
-@{thm[mode=Axiom] rrewrite.intros(12)[of "bs" "r\<^sub>1"]}\\
-@{thm[mode=Rule] rrewrite.intros(13)[of "a\<^sub>1" "a\<^sub>2" "bs" "rs\<^sub>a" "rs\<^sub>b" "rs\<^sub>c"]}\\
-\end{tabular}
-\end{center}
-And these can be made compact by the following simplification function:
 \begin{center}
 \begin{tabular}{lcl}
 @{thm (lhs) bsimp.simps(1)[of  "bs" "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) bsimp.simps(1)[of  "bs" "r\<^sub>1" "r\<^sub>2"]}\\
 @{thm (lhs) bsimp.simps(2)} & $\dn$ & @{thm (rhs) bsimp.simps(2)}\\
 @{thm (lhs) bsimp.simps(3)} & $\dn$ & @{thm (rhs) bsimp.simps(3)}\\
 @{thm (lhs) intern.simps(4)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) intern.simps(4)[of "r\<^sub>1" "r\<^sub>2"]}\\
 \end{tabular}
 \end{center}
+This might sound trivial in the case of producing a YES/NO answer,
+but once we require a lexing output to be produced (which is required
+in applications like compiler front-end, malicious attack domain extraction,
+etc.), it is not straightforward if we still extract what is needed according
+to the POSIX standard.
+By simplification, we mean specifically the following rules:
+\begin{center}
+\begin{tabular}{lcl}
+@{thm[mode=Axiom] rrewrite.intros(1)[of "bs" "r\<^sub>2"]}\\
+@{thm[mode=Axiom] rrewrite.intros(2)[of "bs" "r\<^sub>1"]}\\
+@{thm[mode=Axiom] rrewrite.intros(3)[of  "bs" "bs\<^sub>1" "r\<^sub>1"]}\\
+@{thm[mode=Rule] rrewrite.intros(4)[of  "r\<^sub>1" "r\<^sub>2" "bs" "r\<^sub>3"]}\\
+@{thm[mode=Rule] rrewrite.intros(5)[of "r\<^sub>3" "r\<^sub>4" "bs" "r\<^sub>1"]}\\
+@{thm[mode=Rule] rrewrite.intros(6)[of "r" "r'" "bs" "rs\<^sub>1" "rs\<^sub>2"]}\\
+@{thm[mode=Axiom] rrewrite.intros(7)[of "bs" "rs\<^sub>a" "rs\<^sub>b"]}\\
+@{thm[mode=Axiom] rrewrite.intros(8)[of "bs" "rs\<^sub>a" "bs\<^sub>1" "rs\<^sub>1" "rs\<^sub>b"]}\\
+@{thm[mode=Axiom] rrewrite.intros(9)[of "bs" "r\<^sub>2"]}\\
+@{thm[mode=Axiom] rrewrite.intros(10)[of "bs" "r\<^sub>1"]}\\
+@{thm[mode=Axiom] rrewrite.intros(11)[of "bs" ]}\\
+@{thm[mode=Axiom] rrewrite.intros(12)[of "bs" "r\<^sub>1"]}\\
+@{thm[mode=Rule] rrewrite.intros(13)[of "a\<^sub>1" "a\<^sub>2" "bs" "rs\<^sub>a" "rs\<^sub>b" "rs\<^sub>c"]}\\
+\end{tabular}
+\end{center}
+And these can be made compact by the following simplification function:
 where the function $\mathit{bsimp_AALTs}$
 The core idea of the proof is that two regular expressions,
 if "isomorphic" up to a finite number of rewrite steps, will
 This isomorphic relation implies a property that leads to the
 correctness result:
 if two (nullable) regular expressions are "rewritable" in many steps
 from one another,
-then a call to function $textit{bmkeps}$ gives the same
+then a call to function $\textit{bmkeps}$ gives the same
 bit-sequence :
 \begin{lemma}
 @{thm [mode=IfThen] rewrites_bmkeps}
 \end{lemma}

changeset 372	78cc255e286f
parent 371	f65444d29e74
child 376	664322da08fe