# HG changeset patch # User Chengsong # Date 1563832860 -3600 # Node ID 481c8000de6dcc4f91fda7b089293402506cde1a # Parent a67aff8fb06ac160c7bd17cba3c80518ed3a726f works on the part that christian did not read will continue tomorrow morning diff -r a67aff8fb06a -r 481c8000de6d ninems/ninems.tex --- a/ninems/ninems.tex Mon Jul 22 22:30:47 2019 +0100 +++ b/ninems/ninems.tex Mon Jul 22 23:01:00 2019 +0100 @@ -1138,11 +1138,11 @@ have to work out all proof details.'' \end{quote} -\noindent -We would settle this correctness claim. It is relatively straightforward -to establish that after one simplification step, the part of a nullable -derivative that corresponds to a POSIX value remains intact and can -still be collected, in other words, we can show that\comment{Double-check....I think this is not the case} +\noindent We would settle this correctness claim. It is relatively +straightforward to establish that after one simplification step, the part of a +nullable derivative that corresponds to a POSIX value remains intact and can +still be collected, in other words, we can show that\comment{Double-check....I +think this is not the case} \begin{center} $\textit{bmkeps} \; r = \textit{bmkeps} \; \textit{simp} \; r\;( r\; \textit{nullable})$ @@ -1154,19 +1154,27 @@ a regular expression. The hard part of this proof is to establish that \begin{center} -$\textit{bmkeps} \; \textit{blexer}\_{simp} \; r = \textit{bmkeps} \; \textit{blexer} \; \textit{simp} \; r$ + $\textit{bmkeps} \; \textit{blexer}\_{simp}(s, \; r) = \textit{bmkeps} \; \textit{blexer} \; \textit{simp}(s, \; r)$ \end{center} -\noindent\comment{OK from here on you still need to work. Did not read.} -That is, if we do derivative on regular expression r and the simplified version, -they can still provide the same POSIX value if there is one . -This is not as straightforward as the previous proposition, as the two regular expressions $r$ and $\textit{simp}\; r$ - might become very different regular expressions after repeated application of $\textit{simp}$ and derivative. -The crucial point is to find the indispensable information of -a regular expression and how it is kept intact during simplification so that it performs -as good as a regular expression that has not been simplified in the subsequent derivative operations. -To aid this, we use the helping function retrieve described by Sulzmann and Lu: -\\definition of retrieve\\ +\noindent That is, if we do derivative on regular expression $r$ and then +simplify it, and repeat this process until we exhaust the string, we get a +regular expression $r''$ that provides the POSIX matching as the result $r'$ of +the normal derivative algorithm which only does derivative operation repeatedly +and has no simplification at all. This might seem at first glance very +unintuitive, as the $r'$ is exponentially larger than $r''$. But this can be +explained in the following way: we are pruning away the possible matches that +are not POSIX. Since there are exponentially non-POSIX matchings and only 1 +POSIX matching, it is understandable that our $r''$ can be a lot smaller. we +can still provide the same POSIX value if there is one. This is not as +straightforward as the previous proposition, as the two regular expressions $r$ +and $\textit{simp}\; r$ might become very different regular expressions after +repeated application of $\textit{simp}$ and derivative. The crucial point is +to find the indispensable information of a regular expression and how it is +kept intact during simplification so that it performs as good as a regular +expression that has not been simplified in the subsequent derivative +operations. To aid this, we use the helping function retrieve described by +Sulzmann and Lu: \\definition of retrieve\\ This function assembled the bitcode that corresponds to a parse tree for how the current derivative matches the suffix of the string(the @@ -1174,23 +1182,24 @@ Sulzmann and Lu used this to connect the bitcoded algorithm to the older algorithm by the following equation: - \begin{center} - $inj \;a\; c \; v = \textit{decode} \; (\textit{retrieve}\; ((\textit{internalise}\; r)\backslash_{simp} c) v)$ - \end{center} - A little fact that needs to be stated to help comprehension: - \begin{center} - $r^\uparrow = a$($a$ stands for $\textit{annotated}).$ - \end{center} - Ausaf and Urban also used this fact to prove the correctness of bitcoded algorithm without simplification. - Our purpose of using this, however, is try to establish \\ -$ \textit{retrieve} \; a \; v \;=\; \textit{retrieve} \; \textit{simp}(a) \; v'.$\\ - The idea is that using $v'$, - a simplified version of $v$ that possibly had gone through the same simplification step as $\textit{simp}(a)$ we are still able to extract the bit-sequence that gives the same parsing information as the unsimplified one. - After establishing this, we might be able to finally bridge the gap of proving\\ - $\textit{retrieve} \; r \backslash s \; v = \;\textit{retrieve} \; \textit{simp}(r) \backslash s \; v'$\\ - and subsequently\\ - $\textit{retrieve} \; r \backslash s \; v\; = \; \textit{retrieve} \; r \backslash_{simp} s \; v'$.\\ - This proves that our simplified version of regular expression still contains all the bitcodes needed. + \begin{center} $inj \;a\; c \; v = \textit{decode} \; (\textit{retrieve}\; + ((\textit{internalise}\; r)\backslash_{simp} c) v)$ \end{center} A + little fact that needs to be stated to help comprehension: + \begin{center} $r^\uparrow = a$($a$ stands for $\textit{annotated}).$ + \end{center} Ausaf and Urban also used this fact to prove the + correctness of bitcoded algorithm without simplification. Our purpose + of using this, however, is try to establish \\ $ \textit{retrieve} \; + a \; v \;=\; \textit{retrieve} \; \textit{simp}(a) \; v'.$\\ The idea + is that using $v'$, a simplified version of $v$ that possibly had gone + through the same simplification step as $\textit{simp}(a)$ we are + still able to extract the bit-sequence that gives the same parsing + information as the unsimplified one. After establishing this, we + might be able to finally bridge the gap of proving\\ + $\textit{retrieve} \; r \backslash s \; v = \;\textit{retrieve} \; + \textit{simp}(r) \backslash s \; v'$\\ and subsequently\\ + $\textit{retrieve} \; r \backslash s \; v\; = \; \textit{retrieve} \; + r \backslash_{simp} s \; v'$.\\ This proves that our simplified + version of regular expression still contains all the bitcodes needed. The second task is to speed up the more aggressive simplification.