--- a/etnms/etnms.tex Sat Jan 11 22:47:11 2020 +0000
+++ b/etnms/etnms.tex Sun Jan 12 22:49:23 2020 +0000
@@ -343,34 +343,111 @@
are very closely related, but not identical.
For example, let $r$ be the regular expression
$(a+b)(a+a*)$ and $s$ be the string $aa$, then
+both $\erase (\rup\backslash_{simp} \, s)$ and $\erase (\simp (\rup\backslash s))$
+are $\ONE + a^*$. However, without $\erase$
\begin{center}
$\rup\backslash_{simp} \, s$ is equal to $_0(_0\ONE +_{11}a^*)$
\end{center}
\noindent
whereas
\begin{center}
-$\rup\backslash \, s$ is equal to $(_{00}\ONE +_{011}a^*)$
+$\simp(\rup\backslash s)$ is equal to $(_{00}\ONE +_{011}a^*)$
\end{center}
\noindent
+For the sake of visual simplicity, we use numbers to denote the bits
+in bitcodes as we have previously defined for annotated
+regular expressions. $\S$ is replaced by
+subscript $_1$ and $\Z$ by $_0$.
+
Two "rules" might be inferred from the above example.
+
First, after erasing the bits the two regular expressions
are exactly the same: both become $1+a^*$. Here the
-function $\simp$ exhibits the "once equals many times"
+function $\simp$ exhibits the "one in the end equals many times
+at the front"
property: one simplification in the end causes the
same regular expression structure as
- successive simplifications done alongside derivatives.
+successive simplifications done alongside derivatives.
+$\rup\backslash_{simp} \, s$ unfolds to
+$\simp((\simp(r\backslash a))\backslash a)$
+and $\simp(\rup\backslash s)$ unfolds to
+$\simp((r\backslash a)\backslash a)$. The one simplification
+in the latter causes the resulting regular expression to
+become $1+a^*$, exactly the same as the former with
+two simplifications.
+
Second, the bit-codes are different, but they are essentially
the same: if we push the outmost bits ${\bf_0}(_0\ONE +_{11}a^*)$ of $\rup\backslash_{simp} \, s$
inside then we get $(_{00}\ONE +_{011}a^*)$, exactly the
same as that of $\rup\backslash \, s$. And this difference
does not matter when we try to apply $\bmkeps$ or $\retrieve$
-to it.\\
-If we look into the difference above, we could see why:
-during the first derivative operation,
-$\rup\backslash a=(_0\ONE + \ZERO)(_0a + _1a^*)$ gets into a sequence
-with the first part being nullable, but not the second part.
+to it. This seems a good news if we want to use $\retrieve$
+to prove things.
+
+If we look into the difference above, we could see that the
+difference is not fundamental: the bits are just being moved
+around in a way that does not hurt the correctness.
+During the first derivative operation,
+$\rup\backslash a=(_0\ONE + \ZERO)(_0a + _1a^*)$ is
+in the form of a sequence regular expression with the first
+part being nullable.
+Recall the simplification function definition:
+\begin{center}
+ \begin{tabular}{@{}lcl@{}}
+
+ $\textit{simp} \; (\textit{SEQ}\;bs\,a_1\,a_2)$ & $\dn$ & $ (\textit{simp} \; a_1, \textit{simp} \; a_2) \; \textit{match} $ \\
+ &&$\quad\textit{case} \; (\ZERO, \_) \Rightarrow \ZERO$ \\
+ &&$\quad\textit{case} \; (\_, \ZERO) \Rightarrow \ZERO$ \\
+ &&$\quad\textit{case} \; (\ONE, a_2') \Rightarrow \textit{fuse} \; bs \; a_2'$ \\
+ &&$\quad\textit{case} \; (a_1', \ONE) \Rightarrow \textit{fuse} \; bs \; a_1'$ \\
+ &&$\quad\textit{case} \; (a_1', a_2') \Rightarrow \textit{SEQ} \; bs \; a_1' \; a_2'$ \\
+
+ $\textit{simp} \; (\textit{ALTS}\;bs\,as)$ & $\dn$ & $\textit{distinct}( \textit{flatten} ( \textit{map simp as})) \; \textit{match} $ \\
+ &&$\quad\textit{case} \; [] \Rightarrow \ZERO$ \\
+ &&$\quad\textit{case} \; a :: [] \Rightarrow \textit{fuse bs a}$ \\
+ &&$\quad\textit{case} \; as' \Rightarrow \textit{ALTS}\;bs\;as'$\\
+
+ $\textit{simp} \; a$ & $\dn$ & $\textit{a} \qquad \textit{otherwise}$
+\end{tabular}
+\end{center}
+
+\noindent
+If we call $\simp$ on $\rup$, just as $\backslash_{simp}$
+requires, then we would go throught the third clause of
+the sequence case:$\quad\textit{case} \; (\ONE, a_2') \Rightarrow \textit{fuse} \; bs \; a_2'$.
+The $\ZERO$ of $(_0\ONE + \ZERO)$ is simplified away and
+$_0\ONE$ merged into $_0a + _1a^*$ by simply
+putting its bits($_0$) to the front of the second component:
+ ${\bf_0}(_0a + _1a^*)$.
+ After a second derivative operation,
+ namely, $(_0(_0a + _1a^*))\backslash a$, we get
+ $
+ _0(_0 \ONE + _1(_1\ONE \cdot a^*))
+ $, and this simplifies to $_0(_0 \ONE + _{11} a^*)$
+ by the third clause of the alternative case:
+ $\quad\textit{case} \; as' \Rightarrow \textit{ALTS}\;bs\;as'$.
+The outmost bit $_0$ remains unchanged and stays with
+the outmost regular expression. However, things are a bit
+different when it comes to $\simp(\rup\backslash \, s)$, because
+without simplification, first term of the sequence
+$\rup\backslash a=(_0\ONE + \ZERO)(_0a + _1a^*)$
+is not merged into the second component
+and is nullable.
+Therefore $((_0\ONE + \ZERO)(_0a + _1a^*))\backslash a$ splits into
+$([(\ZERO + \ZERO)\cdot(_0a + _1a^*)] + _0( _0\ONE + _1[_1\ONE \cdot a^*]))$.
+After these two successive derivatives without simplification,
+we apply $\simp$ to this regular expression, which goes through
+the alternative clause, and each component of $([(\ZERO + \ZERO)\cdot(_0a + _1a^*)] + _0( _0\ONE + _1[_1\ONE \cdot a^*]))$ will be simplified, giving us the list:$[\ZERO, _0(_0\ONE + _{11}a^*]$,this
+list is then flattened--for
+$([(\ZERO + \ZERO)\cdot(_0a + _1a^*)]$ it will be simplified into $\ZERO$
+and then thrown away by $\textit{flatten}$; $ _0( _0\ONE + _1[_1\ONE \cdot a^*]))$
+ becomes $ _{00}\ONE + _{011}a^*]))$ because flatten opens up the alternative
+ $\ONE + a^*$ and fuses the front bit(s) $_0$ to the front of $_0\ONE $ and $_{11}a^*$
+ and get $_{00}$ and $_{011}$.
+ %CONSTRUCTION SITE
and we can use this to construct a set of examples based
-on this type of behaviour of two operations.
+on this type of behaviour of two operations:
+$r_1$
that is to say, despite the bits are being moved around on the regular expression
(difference in bits), the structure of the (unannotated)regular expression
after one simplification is exactly the same after the
@@ -382,7 +459,7 @@
$(a+b)(a+a*)$,
if we do derivative with respect to string $aa$,
we get
-%TODO
+
sdddddr does not equal sdsdsdsr sometimes.\\
For example,