i
authorChengsong
Sat, 08 Feb 2020 22:01:40 +0000
changeset 130 4d6f54c478b5
parent 129 576ddb23f596
child 131 b6984212cf87
i
etnms/etnms.tex
--- a/etnms/etnms.tex	Sat Feb 08 21:57:11 2020 +0000
+++ b/etnms/etnms.tex	Sat Feb 08 22:01:40 2020 +0000
@@ -183,11 +183,11 @@
 	$(r^*)\backslash c$           & $\dn$ & $(r\backslash c) \cdot r^*$\\
 \end{tabular}
 \end{center}
+\noindent
+And defines how a regular expression evolves into
+a new regular expression after all the string it contains
+is chopped off a certain head character $c$.
 
-%Assuming the classic notion of a
-%\emph{language} of a regular expression, written $L(\_)$, t
-
-\noindent
 The main property of the derivative operation is that
 
 \begin{center}
@@ -214,7 +214,7 @@
 \]
 
 \noindent
-Assuming the a string is givane as a sequence of characters, say $c_0c_1..c_n$, 
+Assuming the a string is given as a sequence of characters, say $c_0c_1..c_n$, 
 this algorithm presented graphically is as follows:
 
 \begin{equation}\label{graph:*}
@@ -380,6 +380,40 @@
 $a^*+a+\ONE$. Adding these more aggressive simplification rules help us
 to achieve the same size bound as that of the partial derivatives. 
 
+
+Suppose we apply simplification after each derivative step, and view
+these two operations as an atomic one: $a \backslash_{simp}\,c \dn
+\textit{simp}(a \backslash c)$. Then we can use the previous natural
+extension from derivative w.r.t.~character to derivative
+w.r.t.~string:%\comment{simp in  the [] case?}
+
+\begin{center}
+\begin{tabular}{lcl}
+$r \backslash_{simp} (c\!::\!s) $ & $\dn$ & $(r \backslash_{simp}\, c) \backslash_{simp}\, s$ \\
+$r \backslash_{simp} [\,] $ & $\dn$ & $r$
+\end{tabular}
+\end{center}
+
+\noindent
+we obtain an optimised version of the algorithm:
+
+ \begin{center}
+\begin{tabular}{lcl}
+  $\textit{blexer\_simp}\;r\,s$ & $\dn$ &
+      $\textit{let}\;a = (r^\uparrow)\backslash_{simp}\, s\;\textit{in}$\\                
+  & & $\;\;\textit{if}\; \textit{bnullable}(a)$\\
+  & & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\
+  & & $\;\;\textit{else}\;\textit{None}$
+\end{tabular}
+\end{center}
+
+\noindent
+This algorithm keeps the regular expression size small, for example,
+with this simplification our previous $(a + aa)^*$ example's 8000 nodes
+will be reduced to just 6 and stays constant, no matter how long the
+input string is.
+
+
 In order to implement the idea of ``spilling out alternatives'' and to
 make them compatible with the $\textit{inj}$-mechanism, we use
 \emph{bitcodes}. They were first introduced by Sulzmann and Lu.
@@ -387,14 +421,14 @@
 
 \begin{center}
 		$b ::=   1 \mid  0 \qquad
-bs ::= [] \mid b:bs    
+bs ::= [] \mid b::bs    
 $
 \end{center}
 
 \noindent
 The $1$ and $0$ are not in bold in order to avoid 
 confusion with the regular expressions $\ZERO$ and $\ONE$. Bitcodes (or
-bit-lists) can be used to encode values (or incomplete values) in a
+bit-lists) can be used to encode values (or potentially incomplete values) in a
 compact form. This can be straightforwardly seen in the following
 coding function from values to bitcodes: 
 
@@ -735,37 +769,6 @@
 function, except that it also removes $\ZERO$s. Or in terms of regular expressions, it
 removes parentheses, for example changing $a+(b+c)$ into $a+b+c$.
 
-Suppose we apply simplification after each derivative step, and view
-these two operations as an atomic one: $a \backslash_{simp}\,c \dn
-\textit{simp}(a \backslash c)$. Then we can use the previous natural
-extension from derivative w.r.t.~character to derivative
-w.r.t.~string:%\comment{simp in  the [] case?}
-
-\begin{center}
-\begin{tabular}{lcl}
-$r \backslash_{simp} (c\!::\!s) $ & $\dn$ & $(r \backslash_{simp}\, c) \backslash_{simp}\, s$ \\
-$r \backslash_{simp} [\,] $ & $\dn$ & $r$
-\end{tabular}
-\end{center}
-
-\noindent
-we obtain an optimised version of the algorithm:
-
- \begin{center}
-\begin{tabular}{lcl}
-  $\textit{blexer\_simp}\;r\,s$ & $\dn$ &
-      $\textit{let}\;a = (r^\uparrow)\backslash_{simp}\, s\;\textit{in}$\\                
-  & & $\;\;\textit{if}\; \textit{bnullable}(a)$\\
-  & & $\;\;\textit{then}\;\textit{decode}\,(\textit{bmkeps}\,a)\,r$\\
-  & & $\;\;\textit{else}\;\textit{None}$
-\end{tabular}
-\end{center}
-
-\noindent
-This algorithm keeps the regular expression size small, for example,
-with this simplification our previous $(a + aa)^*$ example's 8000 nodes
-will be reduced to just 6 and stays constant, no matter how long the
-input string is.