etnms/etnms.tex
changeset 101 4a327e70d538
parent 100 397b31867ea6
child 102 9c3c118896bb
equal deleted inserted replaced
100:397b31867ea6 101:4a327e70d538
    31 \Copyright{Chengsong Tan}
    31 \Copyright{Chengsong Tan}
    32 
    32 
    33 \newcommand{\dn}{\stackrel{\mbox{\scriptsize def}}{=}}%
    33 \newcommand{\dn}{\stackrel{\mbox{\scriptsize def}}{=}}%
    34 \newcommand{\ZERO}{\mbox{\bf 0}}
    34 \newcommand{\ZERO}{\mbox{\bf 0}}
    35 \newcommand{\ONE}{\mbox{\bf 1}}
    35 \newcommand{\ONE}{\mbox{\bf 1}}
       
    36 \def\erase{\textit{erase}}
    36 \def\bders{\textit{bders}}
    37 \def\bders{\textit{bders}}
    37 \def\lexer{\mathit{lexer}}
    38 \def\lexer{\mathit{lexer}}
    38 \def\blexer{\textit{blexer}}
    39 \def\blexer{\textit{blexer}}
    39 \def\blexers{\mathit{blexer\_simp}}
    40 \def\blexers{\mathit{blexer\_simp}}
    40 \def\simp{\mathit{simp}}
    41 \def\simp{\mathit{simp}}
   295 we face the problem that in the above 
   296 we face the problem that in the above 
   296 equalities,
   297 equalities,
   297 $\retrieve \; a \; v$ is not always defined.
   298 $\retrieve \; a \; v$ is not always defined.
   298 for example,
   299 for example,
   299 $\retrieve \; _0(_1a+_0a) \; \Left(\Empty)$
   300 $\retrieve \; _0(_1a+_0a) \; \Left(\Empty)$
   300 is defined, but not $\retrieve \; _{01}a \;\Left(\Empty)$,
   301 is defined, but not $\retrieve \; (_{01}a) \;\Left(\Empty)$,
   301 though we can extract the same POSIX
   302 though we can extract the same POSIX
   302 bits from the two annotated regular expressions.
   303 bits from the two annotated regular expressions.
   303 The latter might occur when we try to retrieve from 
   304 The latter might occur when we try to retrieve from 
   304 a simplified regular expression using the same value
   305 a simplified regular expression using the same value
   305 as the unsimplified one.
   306 as the unsimplified one.
   306 This is because $\Left(\Empty)$ corresponds to
   307 This is because $\Left(\Empty)$ corresponds to
   307 the regular expression structure $\AALTS$ instead of
   308 the regular expression structure $\ONE+r_2$ instead of
   308 $\AONE$, this v
   309 $\ONE$.
   309 That means, if we 
   310 That means, if we 
   310 want to prove that 
   311 want to prove that 
   311 \begin{center}
   312 \begin{center}
   312 $\textit{decode} \; \bmkeps \; \rup\backslash s \; r = \textit{decode} \; \bmkeps \; \rup\backslash_{simp} s \; r$
   313 $\textit{decode} \; \bmkeps \; \rup\backslash s \; r = \textit{decode} \; \bmkeps \; \rup\backslash_{simp} s \; r$
   313 \end{center}
   314 \end{center}
   314 \noindent
   315 \noindent
   315 holds by using $\retrieve$,
   316 holds by using $\retrieve$,
   316 we probably need to prove an equality like below:
   317 we probably need to prove an equality like below:
   317 \begin{center}
   318 \begin{center}
   318 %$\retrieve \; \rup\backslash_{simp} s \; \mkeps(r\backslash_{simp} s)=\textit{retrieve} \; \rup\backslash s \; \mkeps(r\backslash s)$
   319 %$\retrieve \; \rup\backslash_{simp} s \; \mkeps(r\backslash_{simp} s)=\textit{retrieve} \; \rup\backslash s \; \mkeps(r\backslash s)$
   319 $\retrieve \; \rup\backslash_{simp} s \; \mkeps(r\backslash s)=\textit{retrieve} \; \rup\backslash s \; \mkeps(r\backslash s)$
   320 $\retrieve \; \rup\backslash_{simp} s \; \mkeps(f(r\backslash s))=\textit{retrieve} \; \rup\backslash s \; \mkeps(r\backslash s)$
   320 \end{center}
   321 \end{center}
   321 \noindent
   322 \noindent
   322 we would need to rectify the value $\mkeps(r\backslash s)$ into something simpler
   323 $f$ rectifies $r\backslash s$ so the value $\mkeps(f(r\backslash s))$ becomes
       
   324  something simpler
   323 to make the retrieve function defined.\\
   325 to make the retrieve function defined.\\
   324 One way to do this is to prove the following:
   326 One way to do this is to prove the following:
   325 \begin{center}
   327 \begin{center}
   326 $\retrieve \; \rup\backslash_{simp} s \; \mkeps(\simp(r\backslash s))=\textit{retrieve} \; \rup\backslash s \; \mkeps(r\backslash s)$
   328 $\retrieve \; \rup\backslash_{simp} s \; \mkeps(\simp(r\backslash s))=\textit{retrieve} \; \rup\backslash s \; \mkeps(r\backslash s)$
   327 \end{center}
   329 \end{center}
   328 \noindent
   330 \noindent
       
   331 The reason why we choose $\simp$ as $f$ is because
       
   332 $\rup\backslash_{simp} \, s$ and $\simp(\rup\backslash \, s)$
       
   333 have the same shape:
       
   334 \begin{center}
       
   335 $\erase (\rup\backslash_{simp} \, s) = \erase(\simp(\rup\backslash s))$
       
   336 \end{center}
       
   337 
       
   338 \noindent
       
   339 $\erase$ in the above equality means to remove the bit-codes
       
   340 in an annotated regular expression and only keep the original
       
   341 regular expression(just like "erasing" the bits). Its definition is omitted.
       
   342 $\rup\backslash_{simp} \, s$ and $\simp(\rup\backslash s)$
       
   343 are very closely related, but not identical.
       
   344 For example, let $r$ be the regular expression
       
   345 $(a+b)(a+a*)$ and $s$  be the string $aa$, then
       
   346 \begin{center}
       
   347 $\rup\backslash_{simp} \, s$ is equal to $_0(_0\ONE +_{11}a^*)$
       
   348 \end{center}
       
   349 \noindent
       
   350 whereas
       
   351 \begin{center}
       
   352 $\rup\backslash \, s$ is equal to $(_{00}\ONE +_{011}a^*)$
       
   353 \end{center}
       
   354 \noindent
       
   355 Two "rules" might be inferred from the above example.
       
   356 First, after erasing the bits the two regular expressions
       
   357 are exactly the same: both become $1+a^*$. Here the 
       
   358 function $\simp$ exhibits the "once equals many times"
       
   359 property: one simplification in the end causes the 
       
   360 same regular expression structure as
       
   361  successive simplifications done alongside derivatives.
       
   362 Second, the bit-codes are different, but they are essentially
       
   363 the same: if we push the outmost bits ${\bf_0}(_0\ONE +_{11}a^*)$ of $\rup\backslash_{simp} \, s$
       
   364 inside then we get $(_{00}\ONE +_{011}a^*)$, exactly the 
       
   365 same as that of $\rup\backslash \, s$. And this difference 
       
   366 does not matter when we try to apply $\bmkeps$ or $\retrieve$
       
   367 to it.\\
       
   368 If we look into the difference above, we could see why:
       
   369 
       
   370 that is to say, despite the bits are being moved around on the regular expression
       
   371 (difference in bits), the structure of the (unannotated)regular expression
       
   372 after one simplification is exactly the same after the 
       
   373 same sequence of derivative operations 
       
   374 regardless of whether we did simplification
       
   375 along the way.
       
   376 However, without erase the above equality does not hold:
       
   377 for the regular expression  
       
   378 $(a+b)(a+a*)$,
       
   379 if we do derivative with respect to string $aa$,
       
   380 we get
       
   381 %TODO
       
   382 sdddddr does not equal sdsdsdsr sometimes.\\
       
   383 For example,
       
   384 
       
   385 This equicalence class method might still have the potential of proving this,
       
   386 but not yet
       
   387 i parallelly tried another method of using retrieve\\
       
   388 
       
   389 
       
   390 
   329 %HERE CONSTRUCTION SITE
   391 %HERE CONSTRUCTION SITE
   330 The vsimp function, defined as follows
   392 The vsimp function, defined as follows
   331 tries to simplify the value in lockstep with 
   393 tries to simplify the value in lockstep with 
   332 regular expression:\\
   394 regular expression:\\
   333 
   395