cst_tests: changeset 103:aeb0bc2d1812

--- a/Spiral.scala	Sat Jan 11 22:47:11 2020 +0000
+++ b/Spiral.scala	Sun Jan 12 22:49:23 2020 +0000
@@ -450,7 +450,7 @@
       }
     }
     case ASEQ(bs, r1, r2) => {
-      ((elem("[") left_align bstostick(bs)) beside  bits_print(r1) beside elem("~") beside bits_print(r2)) 
+      ((elem("[") left_align bstostick(bs)) beside  bits_print(r1) beside elem("~") beside bits_print(r2) beside (elem("]") above elem(" "))) 
     }
     case ASTAR(bs, r) => {
       r match {

--- a/etnms/etnms.tex	Sat Jan 11 22:47:11 2020 +0000
+++ b/etnms/etnms.tex	Sun Jan 12 22:49:23 2020 +0000
@@ -343,34 +343,111 @@
 are very closely related, but not identical.
 For example, let $r$ be the regular expression
 $(a+b)(a+a*)$ and $s$  be the string $aa$, then
+both $\erase (\rup\backslash_{simp} \, s)$ and $\erase (\simp (\rup\backslash s))$
+are $\ONE + a^*$. However, without $\erase$ 
 \begin{center}
 $\rup\backslash_{simp} \, s$ is equal to $_0(_0\ONE +_{11}a^*)$
 \end{center}
 \noindent
 whereas
 \begin{center}
-$\rup\backslash \, s$ is equal to $(_{00}\ONE +_{011}a^*)$
+$\simp(\rup\backslash  s)$ is equal to $(_{00}\ONE +_{011}a^*)$
 \end{center}
 \noindent
+For the sake of visual simplicity, we use numbers to denote the bits
+in bitcodes as we have previously defined for annotated 
+regular expressions. $\S$ is replaced by 
+subscript $_1$ and $\Z$ by $_0$.
+
 Two "rules" might be inferred from the above example.
+
 First, after erasing the bits the two regular expressions
 are exactly the same: both become $1+a^*$. Here the 
-function $\simp$ exhibits the "once equals many times"
+function $\simp$ exhibits the "one in the end equals many times
+at the front"
 property: one simplification in the end causes the 
 same regular expression structure as
- successive simplifications done alongside derivatives.
+successive simplifications done alongside derivatives.
+$\rup\backslash_{simp} \, s$ unfolds to 
+$\simp((\simp(r\backslash a))\backslash a)$
+and $\simp(\rup\backslash s)$ unfolds to 
+$\simp((r\backslash a)\backslash a)$. The one simplification
+in the latter causes the resulting regular expression to 
+become $1+a^*$, exactly the same as the former with
+two simplifications.
+
 Second, the bit-codes are different, but they are essentially
 the same: if we push the outmost bits ${\bf_0}(_0\ONE +_{11}a^*)$ of $\rup\backslash_{simp} \, s$
 inside then we get $(_{00}\ONE +_{011}a^*)$, exactly the 
 same as that of $\rup\backslash \, s$. And this difference 
 does not matter when we try to apply $\bmkeps$ or $\retrieve$
-to it.\\
-If we look into the difference above, we could see why:
-during the first derivative operation, 
-$\rup\backslash a=(_0\ONE  + \ZERO)(_0a  +  _1a^*)$ gets into a sequence
-with the first part being nullable, but not the second part.
+to it. This seems a good news if we want to use $\retrieve$
+to prove things.
+
+If we look into the difference above, we could see that the
+difference is not fundamental: the bits are just being moved
+around in a way that does not hurt the correctness.
+During the first derivative operation, 
+$\rup\backslash a=(_0\ONE  + \ZERO)(_0a  +  _1a^*)$  is
+in the form of a sequence regular expression with the first
+part being nullable. 
+Recall the simplification function definition:
+\begin{center}
+  \begin{tabular}{@{}lcl@{}}
+   
+  $\textit{simp} \; (\textit{SEQ}\;bs\,a_1\,a_2)$ & $\dn$ & $ (\textit{simp} \; a_1, \textit{simp}  \; a_2) \; \textit{match} $ \\
+   &&$\quad\textit{case} \; (\ZERO, \_) \Rightarrow  \ZERO$ \\
+   &&$\quad\textit{case} \; (\_, \ZERO) \Rightarrow  \ZERO$ \\
+   &&$\quad\textit{case} \;  (\ONE, a_2') \Rightarrow  \textit{fuse} \; bs \;  a_2'$ \\
+   &&$\quad\textit{case} \; (a_1', \ONE) \Rightarrow  \textit{fuse} \; bs \;  a_1'$ \\
+   &&$\quad\textit{case} \; (a_1', a_2') \Rightarrow  \textit{SEQ} \; bs \; a_1' \;  a_2'$ \\
+
+  $\textit{simp} \; (\textit{ALTS}\;bs\,as)$ & $\dn$ & $\textit{distinct}( \textit{flatten} ( \textit{map simp as})) \; \textit{match} $ \\
+  &&$\quad\textit{case} \; [] \Rightarrow  \ZERO$ \\
+   &&$\quad\textit{case} \; a :: [] \Rightarrow  \textit{fuse bs a}$ \\
+   &&$\quad\textit{case} \;  as' \Rightarrow  \textit{ALTS}\;bs\;as'$\\ 
+
+   $\textit{simp} \; a$ & $\dn$ & $\textit{a} \qquad \textit{otherwise}$   
+\end{tabular}    
+\end{center}    
+
+\noindent
+If we call $\simp$ on $\rup$, just as $\backslash_{simp}$
+requires, then we would go throught the third clause of 
+the sequence case:$\quad\textit{case} \;  (\ONE, a_2') \Rightarrow  \textit{fuse} \; bs \;  a_2'$.
+The $\ZERO$ of $(_0\ONE  + \ZERO)$ is simplified away and 
+$_0\ONE$ merged into $_0a  +  _1a^*$ by simply
+putting its bits($_0$) to the front of the second component:
+ ${\bf_0}(_0a  +  _1a^*)$. 
+ After a second derivative operation,
+ namely, $(_0(_0a  +  _1a^*))\backslash a$, we get 
+ $
+ _0(_0 \ONE  +  _1(_1\ONE \cdot a^*))
+ $, and this simplifies to $_0(_0 \ONE  +  _{11} a^*)$
+ by the third clause of the alternative case:
+ $\quad\textit{case} \;  as' \Rightarrow  \textit{ALTS}\;bs\;as'$.
+The outmost bit $_0$ remains unchanged and stays with 
+the outmost regular expression. However, things are a bit
+different when it comes to $\simp(\rup\backslash \, s)$, because
+without simplification,  first term of the sequence 
+$\rup\backslash a=(_0\ONE  + \ZERO)(_0a  +  _1a^*)$ 
+is not merged into the second component
+and is nullable.
+Therefore $((_0\ONE  + \ZERO)(_0a  +  _1a^*))\backslash a$ splits into
+$([(\ZERO + \ZERO)\cdot(_0a  +  _1a^*)] + _0( _0\ONE  + _1[_1\ONE \cdot a^*]))$.
+After these two successive derivatives without simplification,
+we apply $\simp$ to this regular expression, which goes through
+the alternative clause, and each component of $([(\ZERO + \ZERO)\cdot(_0a  +  _1a^*)] + _0( _0\ONE  + _1[_1\ONE \cdot a^*]))$ will be simplified, giving us the list:$[\ZERO, _0(_0\ONE  + _{11}a^*]$,this
+list is then flattened--for
+$([(\ZERO + \ZERO)\cdot(_0a  +  _1a^*)]$ it will be simplified into $\ZERO$
+and then thrown away by $\textit{flatten}$; $ _0( _0\ONE  + _1[_1\ONE \cdot a^*]))$
+ becomes  $ _{00}\ONE  + _{011}a^*]))$ because flatten opens up the alternative
+ $\ONE + a^*$ and fuses the front bit(s) $_0$ to the front of $_0\ONE $ and $_{11}a^*$
+ and get $_{00}$ and $_{011}$.
+ %CONSTRUCTION SITE
 and we can use this to construct a set of examples based 
-on this type of behaviour of two operations.
+on this type of behaviour of two operations:
+$r_1$
 that is to say, despite the bits are being moved around on the regular expression
 (difference in bits), the structure of the (unannotated)regular expression
 after one simplification is exactly the same after the 
@@ -382,7 +459,7 @@
 $(a+b)(a+a*)$,
 if we do derivative with respect to string $aa$,
 we get
-%TODO
+
 sdddddr does not equal sdsdsdsr sometimes.\\
 For example,

author	Chengsong
	Sun, 12 Jan 2020 22:49:23 +0000
changeset 103	aeb0bc2d1812
parent 102	9c3c118896bb
child 104	e7ba4da53930
child 105	317a7d54ebcc

Spiral.scala		file \| annotate \| diff \| comparison \| revisions
etnms/etnms.tex		file \| annotate \| diff \| comparison \| revisions