diff -r 24d9d64c2a95 -r f575cf219377 ninems/ninems.tex --- a/ninems/ninems.tex Tue Jul 16 22:18:18 2019 +0100 +++ b/ninems/ninems.tex Tue Jul 16 22:20:11 2019 +0100 @@ -950,31 +950,25 @@ \end{tabular} \end{center} %\end{definition} -For instance, we attach an additional bit $\Z$ to - the front of $r \backslash c$ to indicate that there - is one more star iteration when we unfold $\textit{STAR} \; bs \; a$ -into a sequence in the last clause. -A more subtle treatment of the bits lies in the $\textit{SEQ}$ clause - when $a_1$ is $\textit{bnullable}$($\textit{bnullable}$ - is exactly the same as $\textit{nullable}$, except that it is for - annotated regular expressions, therefore we omit the definition). -$\textit{bmkeps} \, a_1$ extracts the bitcode accumulated in -$a_1$ for how the prefix $s_1$ of the input string ending at character $c$ -is matched by the predecessor of $a_1$(before $s_1$ was chopped off). -Then $\textit{fuse}$ stores the information -extracted by $\textit{bmkeps}$ at the head of the -regular expression $a_2 \backslash c$. -The bitsequence $bs$, which was initially -attached to the head of $\textit{SEQ}$, has -now been elevated to the top-level $\textit{ALT}$, -as this information will be needed -whichever way the $\textit{SEQ}\,bs \,a_1\,a_2$ is -matched--no matter whether $c$ belongs to $a_1$ or $ a_2$. -After carefully doing these derivatives and -maintaining all the parsing information, -we complete the parsing by collecting the bits using -a special $\textit{mkeps}$ function for annotated -regular expressions--$\textit{bmkeps}$: + +For instance, when we unfold $STAR \; bs \; a$ into a sequence, we +attach an additional bit Z to the front of $r \backslash c$ to indicate +that there is one more star iteration. The other example, the $SEQ$ +clause is more subtle-- when $a_1$ is $bnullable$(here bnullable is +exactly the same as nullable, except that it is for annotated regular +expressions, therefore we omit the definition). Assume that $bmkeps$ +correctly extracts the bitcode for how $a_1$ matches the string prior to +character c(more on this later), then the right branch of $ALTS$, which +is $fuse \; bmkeps \; a_1 (a_2 \backslash c)$ will collapse the regular +expression $a_1$(as it has already been fully matched) and store the +parsing information at the head of the regular expression $a_2 +\backslash c$ by fusing to it. The bitsequence $bs$, which was initially +attached to the head of $SEQ$, has now been elevated to the top-level of +ALT, as this information will be needed whichever way the $SEQ$ is +matched--no matter whether c belongs to $a_1$ or $ a_2$. After carefully +doing these derivatives and maintaining all the parsing information, we +complete the parsing by collecting the bits using a special $mkeps$ +function for annotated regular expressions--$bmkeps$: %\begin{definition}[\textit{bmkeps}]\mbox{} @@ -1131,21 +1125,25 @@ \end{quote} \noindent -We would settle the correctness claim. -It is relatively straightforward to establish that after one simplification step, -the part of a nullable derivative that corresponds to a POSIX value -remains intact and can still be collected, -in other words, +We would settle the correctness claim. It is relatively straightforward +to establish that after one simplification step, the part of a nullable +derivative that corresponds to a POSIX value remains intact and can +still be collected, in other words, + \begin{center} $\textit{bmkeps} \; r = \textit{bmkeps} \; \textit{simp} \; r\;( r\; \textit{nullable})$ \end{center} -as this basically comes down to proving actions like removing -the additional $r$ in $r+r$ does not delete important POSIX information -in a regular expression. -The hardcore of this problem is to prove that + +\noindent +as this basically comes down to proving actions like removing the +additional $r$ in $r+r$ does not delete important POSIX information in +a regular expression. The hardcore of this problem is to prove that + \begin{center} $\textit{bmkeps} \; \textit{blexer}\_{simp} \; r = \textit{bmkeps} \; \textit{blexer} \; \textit{simp} \; r$ \end{center} + +\noindent That is, if we do derivative on regular expression r and the simplified version, they can still provide the same POSIX value if there is one . This is not as straightforward as the previous proposition, as the two regular expressions $r$ and $\textit{simp}\; r$