ninems/ninems.tex
changeset 62 5b10d83b0834
parent 61 580c7b84f900
child 63 d3c22f809dde
equal deleted inserted replaced
61:580c7b84f900 62:5b10d83b0834
   516 To understands this definition in a more vivid way, the reader 
   516 To understands this definition in a more vivid way, the reader 
   517 might imagine this: when we do the derivative on regular expression 
   517 might imagine this: when we do the derivative on regular expression 
   518 $r_{i-1}$, we chop off a character from $r_{i-1}$ to form $r_i$.
   518 $r_{i-1}$, we chop off a character from $r_{i-1}$ to form $r_i$.
   519 This leaves a "hole" on $r_i$. In order to calculate a value for
   519 This leaves a "hole" on $r_i$. In order to calculate a value for
   520 $r_{i-1}$, we need to locate where that hole is. We can find this location
   520 $r_{i-1}$, we need to locate where that hole is. We can find this location
   521 informtaion by comparing $r_{i-1}$ and $v_i$.
   521 informtation by comparing $r_{i-1}$ and $v_i$.
   522 For instance, if $r_{i-1}$ is of shape $r_a \cdot r_b$, and $v_i$
   522 For instance, if $r_{i-1}$ is of shape $r_a \cdot r_b$, and $v_i$
   523 is of shape $\textit{Left}(\textit{Seq}(v_a, v_b))$, we know immediately that 
   523 is of shape $\textit{Left}(\textit{Seq}(v_a, v_b))$, we know immediately that 
   524 \[ (r_a \cdot r_b)\backslash c = (r_a\backslash c) \cdot r_b \,+\, r_b\backslash c\],
   524 \[ (r_a \cdot r_b)\backslash c = (r_a\backslash c) \cdot r_b \,+\, r_b\backslash c\],
   525 otherwise if $r_a$ is not nullable,
   525 otherwise if $r_a$ is not nullable,
   526 \[ (r_a \cdot r_b)\backslash c = (r_a\backslash c) \cdot r_b\],
   526 \[ (r_a \cdot r_b)\backslash c = (r_a\backslash c) \cdot r_b\],
   566 The outer 2 $Left$ tells us the first nullable part  hides in the term
   566 The outer 2 $Left$ tells us the first nullable part  hides in the term
   567 \[((0+0+0 + 0 + 1 \cdot 1 \cdot 1) \cdot r^* \]
   567 \[((0+0+0 + 0 + 1 \cdot 1 \cdot 1) \cdot r^* \]
   568  in \[r_3 = ( \underline{(0+0+0 + 0 + 1 \cdot 1 \cdot 1) \cdot r^*} + (0+0+0  + 1 + 0)
   568  in \[r_3 = ( \underline{(0+0+0 + 0 + 1 \cdot 1 \cdot 1) \cdot r^*} + (0+0+0  + 1 + 0)
   569   \cdot r^*) +((0+1+0  + 0 + 0) \cdot r^*+(0+0+0  + 1 + 0) \cdot r^* )
   569   \cdot r^*) +((0+1+0  + 0 + 0) \cdot r^*+(0+0+0  + 1 + 0) \cdot r^* )
   570  \](underlined). Note that the leftmost location of term \[((0+0+0 + 0 + 1 \cdot 1 \cdot 1) \cdot r^* \]
   570  \](underlined). Note that the leftmost location of term \[((0+0+0 + 0 + 1 \cdot 1 \cdot 1) \cdot r^* \]
   571  (which corresponds to the intial sub-match $abc$)
   571  (which corresponds to the initial sub-match $abc$)
   572   allows $mkeps$ to choose  it as the first candidate that meets the requirement of being $nullable$
   572   allows $mkeps$ to choose  it as the first candidate that meets the requirement of being $nullable$
   573   because $mkeps$ is defined to always choose the left one when nullable.
   573   because $mkeps$ is defined to always choose the left one when nullable.
   574   In the case of this example, $abc$ is preferred over $a$ or $ab$.
   574   In the case of this example, $abc$ is preferred over $a$ or $ab$.
   575    This location is naturally generated by the splitting clause\[
   575    This location is naturally generated by the splitting clause\[
   576      (r_1 \cdot r_2)\backslash c  (when \; r_1 \; nullable)) \, = (r_1\backslash c) \cdot r_2 \,+\, r_2\backslash c\].
   576      (r_1 \cdot r_2)\backslash c  (when \; r_1 \; nullable)) \, = (r_1\backslash c) \cdot r_2 \,+\, r_2\backslash c\].
   957 It is relatively straightforward to establish that after 1 simplification step, the part of derivative that corresponds to a POSIX value remains intact and can still be collected, in other words,
   957 It is relatively straightforward to establish that after 1 simplification step, the part of derivative that corresponds to a POSIX value remains intact and can still be collected, in other words,
   958 bmkeps r = bmkeps simp r
   958 bmkeps r = bmkeps simp r
   959 as this basically comes down to proving actions like removing the additional $r$ in $r+r$  does not delete important POSIX information in a regular expression.
   959 as this basically comes down to proving actions like removing the additional $r$ in $r+r$  does not delete important POSIX information in a regular expression.
   960 The hardcore of this problem is to prove that
   960 The hardcore of this problem is to prove that
   961 bmkeps bders r = bmkeps bders simp r
   961 bmkeps bders r = bmkeps bders simp r
   962 That is, if we do derivative on regular expression r and the simplified version for, they can still prove the same POSIX value if there is one . This is not as straghtforward as the previous proposition, as the two regular expression r and simp r  might become very different regular expressions after repeated application ofd simp and derivative.
   962 That is, if we do derivative on regular expression r and the simplified version for, they can still prove the same POSIX value if there is one . This is not as straightforward as the previous proposition, as the two regular expression r and simp r  might become very different regular expressions after repeated application ofd simp and derivative.
   963 The crucial point is to find the "gene" of a regular expression and how it is kept intact during simplification.
   963 The crucial point is to find the "gene" of a regular expression and how it is kept intact during simplification.
   964 To aid this, we are utilizing the helping function retrieve described by Sulzmann and Lu:
   964 To aid this, we are utilizing the helping function retrieve described by Sulzmann and Lu:
   965 \\definition of retrieve\\
   965 \\definition of retrieve\\
   966  This function assembled the bitcode that corresponds to a parse tree for how the current derivative matches the suffix of the string(the characters that have not yet appeared, but is stored in the value).
   966  This function assembled the bitcode that corresponds to a parse tree for how the current derivative matches the suffix of the string(the characters that have not yet appeared, but is stored in the value).
   967  Sulzmann and Lu used this to connect the bit-coded algorithm to the older algorithm by the following equation:\\
   967  Sulzmann and Lu used this to connect the bit-coded algorithm to the older algorithm by the following equation:\\