etnms/etnms.tex
changeset 103 aeb0bc2d1812
parent 102 9c3c118896bb
child 104 e7ba4da53930
child 105 317a7d54ebcc
equal deleted inserted replaced
102:9c3c118896bb 103:aeb0bc2d1812
   341 regular expression(just like "erasing" the bits). Its definition is omitted.
   341 regular expression(just like "erasing" the bits). Its definition is omitted.
   342 $\rup\backslash_{simp} \, s$ and $\simp(\rup\backslash s)$
   342 $\rup\backslash_{simp} \, s$ and $\simp(\rup\backslash s)$
   343 are very closely related, but not identical.
   343 are very closely related, but not identical.
   344 For example, let $r$ be the regular expression
   344 For example, let $r$ be the regular expression
   345 $(a+b)(a+a*)$ and $s$  be the string $aa$, then
   345 $(a+b)(a+a*)$ and $s$  be the string $aa$, then
       
   346 both $\erase (\rup\backslash_{simp} \, s)$ and $\erase (\simp (\rup\backslash s))$
       
   347 are $\ONE + a^*$. However, without $\erase$ 
   346 \begin{center}
   348 \begin{center}
   347 $\rup\backslash_{simp} \, s$ is equal to $_0(_0\ONE +_{11}a^*)$
   349 $\rup\backslash_{simp} \, s$ is equal to $_0(_0\ONE +_{11}a^*)$
   348 \end{center}
   350 \end{center}
   349 \noindent
   351 \noindent
   350 whereas
   352 whereas
   351 \begin{center}
   353 \begin{center}
   352 $\rup\backslash \, s$ is equal to $(_{00}\ONE +_{011}a^*)$
   354 $\simp(\rup\backslash  s)$ is equal to $(_{00}\ONE +_{011}a^*)$
   353 \end{center}
   355 \end{center}
   354 \noindent
   356 \noindent
       
   357 For the sake of visual simplicity, we use numbers to denote the bits
       
   358 in bitcodes as we have previously defined for annotated 
       
   359 regular expressions. $\S$ is replaced by 
       
   360 subscript $_1$ and $\Z$ by $_0$.
       
   361 
   355 Two "rules" might be inferred from the above example.
   362 Two "rules" might be inferred from the above example.
       
   363 
   356 First, after erasing the bits the two regular expressions
   364 First, after erasing the bits the two regular expressions
   357 are exactly the same: both become $1+a^*$. Here the 
   365 are exactly the same: both become $1+a^*$. Here the 
   358 function $\simp$ exhibits the "once equals many times"
   366 function $\simp$ exhibits the "one in the end equals many times
       
   367 at the front"
   359 property: one simplification in the end causes the 
   368 property: one simplification in the end causes the 
   360 same regular expression structure as
   369 same regular expression structure as
   361  successive simplifications done alongside derivatives.
   370 successive simplifications done alongside derivatives.
       
   371 $\rup\backslash_{simp} \, s$ unfolds to 
       
   372 $\simp((\simp(r\backslash a))\backslash a)$
       
   373 and $\simp(\rup\backslash s)$ unfolds to 
       
   374 $\simp((r\backslash a)\backslash a)$. The one simplification
       
   375 in the latter causes the resulting regular expression to 
       
   376 become $1+a^*$, exactly the same as the former with
       
   377 two simplifications.
       
   378 
   362 Second, the bit-codes are different, but they are essentially
   379 Second, the bit-codes are different, but they are essentially
   363 the same: if we push the outmost bits ${\bf_0}(_0\ONE +_{11}a^*)$ of $\rup\backslash_{simp} \, s$
   380 the same: if we push the outmost bits ${\bf_0}(_0\ONE +_{11}a^*)$ of $\rup\backslash_{simp} \, s$
   364 inside then we get $(_{00}\ONE +_{011}a^*)$, exactly the 
   381 inside then we get $(_{00}\ONE +_{011}a^*)$, exactly the 
   365 same as that of $\rup\backslash \, s$. And this difference 
   382 same as that of $\rup\backslash \, s$. And this difference 
   366 does not matter when we try to apply $\bmkeps$ or $\retrieve$
   383 does not matter when we try to apply $\bmkeps$ or $\retrieve$
   367 to it.\\
   384 to it. This seems a good news if we want to use $\retrieve$
   368 If we look into the difference above, we could see why:
   385 to prove things.
   369 during the first derivative operation, 
   386 
   370 $\rup\backslash a=(_0\ONE  + \ZERO)(_0a  +  _1a^*)$ gets into a sequence
   387 If we look into the difference above, we could see that the
   371 with the first part being nullable, but not the second part.
   388 difference is not fundamental: the bits are just being moved
       
   389 around in a way that does not hurt the correctness.
       
   390 During the first derivative operation, 
       
   391 $\rup\backslash a=(_0\ONE  + \ZERO)(_0a  +  _1a^*)$  is
       
   392 in the form of a sequence regular expression with the first
       
   393 part being nullable. 
       
   394 Recall the simplification function definition:
       
   395 \begin{center}
       
   396   \begin{tabular}{@{}lcl@{}}
       
   397    
       
   398   $\textit{simp} \; (\textit{SEQ}\;bs\,a_1\,a_2)$ & $\dn$ & $ (\textit{simp} \; a_1, \textit{simp}  \; a_2) \; \textit{match} $ \\
       
   399    &&$\quad\textit{case} \; (\ZERO, \_) \Rightarrow  \ZERO$ \\
       
   400    &&$\quad\textit{case} \; (\_, \ZERO) \Rightarrow  \ZERO$ \\
       
   401    &&$\quad\textit{case} \;  (\ONE, a_2') \Rightarrow  \textit{fuse} \; bs \;  a_2'$ \\
       
   402    &&$\quad\textit{case} \; (a_1', \ONE) \Rightarrow  \textit{fuse} \; bs \;  a_1'$ \\
       
   403    &&$\quad\textit{case} \; (a_1', a_2') \Rightarrow  \textit{SEQ} \; bs \; a_1' \;  a_2'$ \\
       
   404 
       
   405   $\textit{simp} \; (\textit{ALTS}\;bs\,as)$ & $\dn$ & $\textit{distinct}( \textit{flatten} ( \textit{map simp as})) \; \textit{match} $ \\
       
   406   &&$\quad\textit{case} \; [] \Rightarrow  \ZERO$ \\
       
   407    &&$\quad\textit{case} \; a :: [] \Rightarrow  \textit{fuse bs a}$ \\
       
   408    &&$\quad\textit{case} \;  as' \Rightarrow  \textit{ALTS}\;bs\;as'$\\ 
       
   409 
       
   410    $\textit{simp} \; a$ & $\dn$ & $\textit{a} \qquad \textit{otherwise}$   
       
   411 \end{tabular}    
       
   412 \end{center}    
       
   413 
       
   414 \noindent
       
   415 If we call $\simp$ on $\rup$, just as $\backslash_{simp}$
       
   416 requires, then we would go throught the third clause of 
       
   417 the sequence case:$\quad\textit{case} \;  (\ONE, a_2') \Rightarrow  \textit{fuse} \; bs \;  a_2'$.
       
   418 The $\ZERO$ of $(_0\ONE  + \ZERO)$ is simplified away and 
       
   419 $_0\ONE$ merged into $_0a  +  _1a^*$ by simply
       
   420 putting its bits($_0$) to the front of the second component:
       
   421  ${\bf_0}(_0a  +  _1a^*)$. 
       
   422  After a second derivative operation,
       
   423  namely, $(_0(_0a  +  _1a^*))\backslash a$, we get 
       
   424  $
       
   425  _0(_0 \ONE  +  _1(_1\ONE \cdot a^*))
       
   426  $, and this simplifies to $_0(_0 \ONE  +  _{11} a^*)$
       
   427  by the third clause of the alternative case:
       
   428  $\quad\textit{case} \;  as' \Rightarrow  \textit{ALTS}\;bs\;as'$.
       
   429 The outmost bit $_0$ remains unchanged and stays with 
       
   430 the outmost regular expression. However, things are a bit
       
   431 different when it comes to $\simp(\rup\backslash \, s)$, because
       
   432 without simplification,  first term of the sequence 
       
   433 $\rup\backslash a=(_0\ONE  + \ZERO)(_0a  +  _1a^*)$ 
       
   434 is not merged into the second component
       
   435 and is nullable.
       
   436 Therefore $((_0\ONE  + \ZERO)(_0a  +  _1a^*))\backslash a$ splits into
       
   437 $([(\ZERO + \ZERO)\cdot(_0a  +  _1a^*)] + _0( _0\ONE  + _1[_1\ONE \cdot a^*]))$.
       
   438 After these two successive derivatives without simplification,
       
   439 we apply $\simp$ to this regular expression, which goes through
       
   440 the alternative clause, and each component of $([(\ZERO + \ZERO)\cdot(_0a  +  _1a^*)] + _0( _0\ONE  + _1[_1\ONE \cdot a^*]))$ will be simplified, giving us the list:$[\ZERO, _0(_0\ONE  + _{11}a^*]$,this
       
   441 list is then flattened--for
       
   442 $([(\ZERO + \ZERO)\cdot(_0a  +  _1a^*)]$ it will be simplified into $\ZERO$
       
   443 and then thrown away by $\textit{flatten}$; $ _0( _0\ONE  + _1[_1\ONE \cdot a^*]))$
       
   444  becomes  $ _{00}\ONE  + _{011}a^*]))$ because flatten opens up the alternative
       
   445  $\ONE + a^*$ and fuses the front bit(s) $_0$ to the front of $_0\ONE $ and $_{11}a^*$
       
   446  and get $_{00}$ and $_{011}$.
       
   447  %CONSTRUCTION SITE
   372 and we can use this to construct a set of examples based 
   448 and we can use this to construct a set of examples based 
   373 on this type of behaviour of two operations.
   449 on this type of behaviour of two operations:
       
   450 $r_1$
   374 that is to say, despite the bits are being moved around on the regular expression
   451 that is to say, despite the bits are being moved around on the regular expression
   375 (difference in bits), the structure of the (unannotated)regular expression
   452 (difference in bits), the structure of the (unannotated)regular expression
   376 after one simplification is exactly the same after the 
   453 after one simplification is exactly the same after the 
   377 same sequence of derivative operations 
   454 same sequence of derivative operations 
   378 regardless of whether we did simplification
   455 regardless of whether we did simplification
   380 However, without erase the above equality does not hold:
   457 However, without erase the above equality does not hold:
   381 for the regular expression  
   458 for the regular expression  
   382 $(a+b)(a+a*)$,
   459 $(a+b)(a+a*)$,
   383 if we do derivative with respect to string $aa$,
   460 if we do derivative with respect to string $aa$,
   384 we get
   461 we get
   385 %TODO
   462 
   386 sdddddr does not equal sdsdsdsr sometimes.\\
   463 sdddddr does not equal sdsdsdsr sometimes.\\
   387 For example,
   464 For example,
   388 
   465 
   389 This equicalence class method might still have the potential of proving this,
   466 This equicalence class method might still have the potential of proving this,
   390 but not yet
   467 but not yet