--- a/ninems/ninems.tex Mon Jul 22 23:01:00 2019 +0100
+++ b/ninems/ninems.tex Tue Jul 23 09:26:22 2019 +0100
@@ -586,7 +586,7 @@
for conciseness).
Before $\textit{inj}$ comes into play,
our lexer first builds derivative using string $abc$ (we simplified some regular expressions like
-$0 \cdot b$ to $0$ for conciseness; we also omit parentheses if
+$\ZERO \cdot b$ to $\ZERO$ for conciseness; we also omit parentheses if
they are clear from the context):
%Similarly, we allow
%$\textit{ALT}$ to take a list of regular expressions as an argument
@@ -746,9 +746,9 @@
\begin{center}
\begin{tabular}{lcl}
- $\textit{pder} \; c \; 0$ & $\dn$ & $\emptyset$\\
- $\textit{pder} \; c \; 1$ & $\dn$ & $\emptyset$ \\
- $\textit{pder} \; c \; d$ & $\dn$ & $\textit{if} \; c \,=\, d \; \{ 1 \} \; \textit{else} \; \emptyset$ \\
+ $\textit{pder} \; c \; \ZERO$ & $\dn$ & $\emptyset$\\
+ $\textit{pder} \; c \; \ONE$ & $\dn$ & $\emptyset$ \\
+ $\textit{pder} \; c \; d$ & $\dn$ & $\textit{if} \; c \,=\, d \; \{ \ONE \} \; \textit{else} \; \emptyset$ \\
$\textit{pder} \; c \; r_1+r_2$ & $\dn$ & $pder \; c \; r_1 \cup pder \; c \; r_2$ \\
$\textit{pder} \; c \; r_1 \cdot r_2$ & $\dn$ & $\textit{if} \; nullable \; r_1 $\\
& & $\textit{then} \; \{ r \cdot r_2 \mid r \in pder \; c \; r_1 \} \cup pder \; c \; r_2 \;$\\
@@ -929,7 +929,8 @@
annotated regular expressions. This derivative operation is the same as
what we previously have for the simple regular expressions, except that
we beed to take special care of the bitcodes:\comment{You need to be consitent with ALTS and ALT; ALT is just
-an abbreviation; derivations and so on are defined for ALTS}
+an abbreviation; derivations and so on are defined for ALTS}\comment{no this is not the case,
+ALT for 2 regexes, ALTS for a list}
%\begin{definition}{bder}
\begin{center}
@@ -1039,16 +1040,16 @@
\begin{tabular}{@{}lcl@{}}
$\textit{simp} \; (\textit{SEQ}\;bs\,a_1\,a_2)$ & $\dn$ & $ (\textit{simp} \; a_1, \textit{simp} \; a_2) \; \textit{match} $ \\
- &&$\quad\textit{case} \; (0, \_) \Rightarrow 0$ \\
- &&$\quad\textit{case} \; (\_, 0) \Rightarrow 0$ \\
- &&$\quad\textit{case} \; (1, a_2') \Rightarrow \textit{fuse} \; bs \; a_2'$ \\
- &&$\quad\textit{case} \; (a_1', 1) \Rightarrow \textit{fuse} \; bs \; a_1'$ \\
+ &&$\quad\textit{case} \; (\ZERO, \_) \Rightarrow \ZERO$ \\
+ &&$\quad\textit{case} \; (\_, \ZERO) \Rightarrow \ZERO$ \\
+ &&$\quad\textit{case} \; (\ONE, a_2') \Rightarrow \textit{fuse} \; bs \; a_2'$ \\
+ &&$\quad\textit{case} \; (a_1', \ONE) \Rightarrow \textit{fuse} \; bs \; a_1'$ \\
&&$\quad\textit{case} \; (a_1', a_2') \Rightarrow \textit{SEQ} \; bs \; a_1' \; a_2'$ \\
$\textit{simp} \; (\textit{ALTS}\;bs\,as)$ & $\dn$ & $\textit{distinct}( \textit{flatten} ( \textit{map simp as})) \; \textit{match} $ \\
- &&$\quad\textit{case} \; [] \Rightarrow 0$ \\
+ &&$\quad\textit{case} \; [] \Rightarrow \ZERO$ \\
&&$\quad\textit{case} \; a :: [] \Rightarrow \textit{fuse bs a}$ \\
- &&$\quad\textit{case} \; as' \Rightarrow \textit{ALT}\;bs\;as'$\\
+ &&$\quad\textit{case} \; as' \Rightarrow \textit{ALTS}\;bs\;as'$\\
$\textit{simp} \; a$ & $\dn$ & $\textit{a} \qquad \textit{otherwise}$
\end{tabular}
@@ -1064,12 +1065,12 @@
auxiliary functions flatten and distinct to open up nested
$\textit{ALTS}$ and reduce as many duplicates as possible. Function
distinct keeps the first occurring copy only and remove all later ones
-when detected duplicates. Function flatten opens up nested \textit{ALT}.
+when detected duplicates. Function flatten opens up nested \textit{ALTS}.
Its recursive definition is given below:
\begin{center}
\begin{tabular}{@{}lcl@{}}
- $\textit{flatten} \; (\textit{ALT}\;bs\,as) :: as'$ & $\dn$ & $(\textit{map} \;
+ $\textit{flatten} \; (\textit{ALTS}\;bs\,as) :: as'$ & $\dn$ & $(\textit{map} \;
(\textit{fuse}\;bs)\; \textit{as}) \; @ \; \textit{flatten} \; as' $ \\
$\textit{flatten} \; \textit{ZERO} :: as'$ & $\dn$ & $ \textit{flatten} \; as' $ \\
$\textit{flatten} \; a :: as'$ & $\dn$ & $a :: \textit{flatten} \; as'$ \quad(otherwise)
@@ -1090,7 +1091,7 @@
\begin{center}
\begin{tabular}{lcl}
$r \backslash_{simp} (c\!::\!s) $ & $\dn$ & $(r \backslash_{simp}\, c) \backslash_{simp}\, s$ \\
-$r \backslash [\,] $ & $\dn$ & $r$
+$r \backslash_{simp} [\,] $ & $\dn$ & $r$
\end{tabular}
\end{center}
@@ -1142,10 +1143,15 @@
straightforward to establish that after one simplification step, the part of a
nullable derivative that corresponds to a POSIX value remains intact and can
still be collected, in other words, we can show that\comment{Double-check....I
-think this is not the case}
+think this is not the case}\comment{If i remember correctly, you have proved this lemma.
+I feel this is indeed not true because you might place arbitrary
+bits on the regex r, however if this is the case, did i remember wrongly that
+you proved something like simplification does not affect $\textit{bmkeps}$ results?
+Anyway, i have amended this a little bit so it does not allow arbitrary bits attached
+to a regex. Maybe it works now.}
\begin{center}
-$\textit{bmkeps} \; r = \textit{bmkeps} \; \textit{simp} \; r\;( r\; \textit{nullable})$
+$\textit{bmkeps} \; a = \textit{bmkeps} \; \textit{bsimp} \; a\;( a\; \textit{bnullable} and \textit{decode}(r, \textit{bmkeps}(a)) is a \textit{POSIX} value)$
\end{center}
\noindent
@@ -1159,47 +1165,63 @@
\noindent That is, if we do derivative on regular expression $r$ and then
simplify it, and repeat this process until we exhaust the string, we get a
-regular expression $r''$ that provides the POSIX matching as the result $r'$ of
-the normal derivative algorithm which only does derivative operation repeatedly
-and has no simplification at all. This might seem at first glance very
-unintuitive, as the $r'$ is exponentially larger than $r''$. But this can be
-explained in the following way: we are pruning away the possible matches that
-are not POSIX. Since there are exponentially non-POSIX matchings and only 1
-POSIX matching, it is understandable that our $r''$ can be a lot smaller. we
-can still provide the same POSIX value if there is one. This is not as
-straightforward as the previous proposition, as the two regular expressions $r$
-and $\textit{simp}\; r$ might become very different regular expressions after
-repeated application of $\textit{simp}$ and derivative. The crucial point is
-to find the indispensable information of a regular expression and how it is
-kept intact during simplification so that it performs as good as a regular
-expression that has not been simplified in the subsequent derivative
-operations. To aid this, we use the helping function retrieve described by
-Sulzmann and Lu: \\definition of retrieve\\
-
-This function assembled the bitcode that corresponds to a parse tree for
-how the current derivative matches the suffix of the string(the
-characters that have not yet appeared, but is stored in the value).
-Sulzmann and Lu used this to connect the bitcoded algorithm to the older
-algorithm by the following equation:
+regular expression $r''$($\textit{LHS}$) that provides the POSIX matching
+information, which is exactly the same as the result $r'$($\textit{RHS}$ of the
+normal derivative algorithm that only does derivative repeatedly and has no
+simplification at all. This might seem at first glance very unintuitive, as
+the $r'$ is exponentially larger than $r''$, but can be explained in the
+following way: we are pruning away the possible matches that are not POSIX.
+Since there are exponentially non-POSIX matchings and only 1 POSIX matching, it
+is understandable that our $r''$ can be a lot smaller. we can still provide
+the same POSIX value if there is one. This is not as straightforward as the
+previous proposition, as the two regular expressions $r'$ and $r''$ might have
+become very different regular expressions. The crucial point is to find the
+$\textit{POSIX}$ information of a regular expression and how it is modified,
+augmented and propagated
+during simplification in parallel with the regularr expression that
+has not been simplified in the subsequent derivative operations. To aid this,
+we use the helping function retrieve described by Sulzmann and Lu: \\definition
+of retrieve TODO\\
+This function assembles the bitcode that corresponds to a parse tree for how
+the current derivative matches the suffix of the string(the characters that
+have not yet appeared, but will appear as the successive derivatives go on,
+how do we get this "future" information? By the value $v$, which is
+computed by a pass of the algorithm that uses
+$inj$ as described in the previous section). Sulzmann and Lu used this
+to connect the bitcoded algorithm to the older algorithm by the following
+equation:
\begin{center} $inj \;a\; c \; v = \textit{decode} \; (\textit{retrieve}\;
- ((\textit{internalise}\; r)\backslash_{simp} c) v)$ \end{center} A
- little fact that needs to be stated to help comprehension:
- \begin{center} $r^\uparrow = a$($a$ stands for $\textit{annotated}).$
- \end{center} Ausaf and Urban also used this fact to prove the
- correctness of bitcoded algorithm without simplification. Our purpose
- of using this, however, is try to establish \\ $ \textit{retrieve} \;
- a \; v \;=\; \textit{retrieve} \; \textit{simp}(a) \; v'.$\\ The idea
- is that using $v'$, a simplified version of $v$ that possibly had gone
- through the same simplification step as $\textit{simp}(a)$ we are
- still able to extract the bit-sequence that gives the same parsing
- information as the unsimplified one. After establishing this, we
- might be able to finally bridge the gap of proving\\
- $\textit{retrieve} \; r \backslash s \; v = \;\textit{retrieve} \;
- \textit{simp}(r) \backslash s \; v'$\\ and subsequently\\
- $\textit{retrieve} \; r \backslash s \; v\; = \; \textit{retrieve} \;
- r \backslash_{simp} s \; v'$.\\ This proves that our simplified
- version of regular expression still contains all the bitcodes needed.
+ ((\textit{internalise}\; r)\backslash_{simp} c) v)$
+ \end{center}
+ A little fact that needs to be stated to help comprehension:
+\begin{center}
+ $r^\uparrow = a$($a$ stands for $\textit{annotated}).$
+\end{center}
+Ausaf and Urban also used this fact to prove the
+correctness of bitcoded algorithm without simplification. Our purpose
+of using this, however, is to establish
+\begin{center}
+$ \textit{retrieve} \;
+a \; v \;=\; \textit{retrieve} \; \textit{simp}(a) \; v'.$
+\end{center}
+The idea
+is that using $v'$, a simplified version of $v$ that possibly had gone
+through the same simplification step as $\textit{simp}(a)$ we are
+able to extract the bit-sequence that gives the same parsing
+information as the unsimplified one. After establishing this, we
+might be able to finally bridge the gap of proving
+\begin{center}
+$\textit{retrieve} \; r \backslash s \; v = \;\textit{retrieve} \;
+\textit{simp}(r) \backslash s \; v'$
+\end{center}
+and subsequently
+\begin{center}
+$\textit{retrieve} \; r \backslash s \; v\; = \; \textit{retrieve} \;
+r \backslash_{simp} s \; v'$.
+\end{center}
+This proves that our simplified
+version of regular expression still contains all the bitcodes needed.
The second task is to speed up the more aggressive simplification.