ninems/ninems.tex
changeset 84 de50a65d1b15
parent 83 8c1195dd6136
child 85 ba40ab3658ca
equal deleted inserted replaced
83:8c1195dd6136 84:de50a65d1b15
  1204 %the current derivative matches the suffix of the string(the characters that
  1204 %the current derivative matches the suffix of the string(the characters that
  1205 %have not yet appeared, but will appear as the successive derivatives go on.
  1205 %have not yet appeared, but will appear as the successive derivatives go on.
  1206 %How do we get this "future" information? By the value $v$, which is
  1206 %How do we get this "future" information? By the value $v$, which is
  1207 %computed by a pass of the algorithm that uses
  1207 %computed by a pass of the algorithm that uses
  1208 %$inj$ as described in the previous section).  
  1208 %$inj$ as described in the previous section).  
  1209 using information from both the derivative regular expression and the value.
  1209 using information from both the derivative regular expression and the
  1210 Sulzmann and Lu used this
  1210 value. Sulzmann and Lu poroposed this function, but did not prove
  1211 to connect the bitcoded algorithm to the older algorithm by the following
  1211 anything about it. Ausaf and Urban used it to connect the bitcoded
  1212 equation:
  1212 algorithm to the older algorithm by the following equation:
  1213  
  1213  
  1214  \begin{center} $inj \;a\; c \; v = \textit{decode} \; (\textit{retrieve}\;
  1214  \begin{center} $inj \;a\; c \; v = \textit{decode} \; (\textit{retrieve}\;
  1215 	 ((\textit{internalise}\; r)\backslash_{simp} c) v)$ 
  1215 	 (r^\uparrow)\backslash_{simp} \,c)\,v)$ 
  1216  \end{center} 
  1216  \end{center} 
  1217  A little fact that needs to be stated to aid comprehension:
  1217 
  1218 \begin{center} 
  1218 \noindent
  1219 	$r^\uparrow = a$($a$ stands for $\textit{annotated}).$
  1219 whereby $r^\uparrow$ stands for the internalised version of $r$. Ausaf
  1220 \end{center} 
  1220 and Urban also used this fact to prove  the correctness of bitcoded
  1221 Ausaf and Urban also used this fact to prove  the
  1221 algorithm without simplification.  Our purpose of using this, however,
  1222 correctness of bitcoded algorithm without simplification.  Our purpose
  1222 is to establish 
  1223 of using this, however, is to establish 
  1223 
  1224 \begin{center}
  1224 \begin{center}
  1225 $ \textit{retrieve} \;
  1225 $ \textit{retrieve} \;
  1226 a \; v \;=\; \textit{retrieve}  \; \textit{simp}(a) \; v'.$
  1226 a \; v \;=\; \textit{retrieve}  \; (\textit{simp}\,a) \; v'.$
  1227 \end{center}
  1227 \end{center}
  1228 The idea
  1228 The idea is that using $v'$, a simplified version of $v$ that had gone
  1229 is that using $v'$, a simplified version of $v$ that had gone
  1229 through the same simplification step as $\textit{simp}(a)$, we are able
  1230 through the same simplification step as $\textit{simp}(a)$, we are
  1230 to extract the bitcode that gives the same parsing information as the
  1231 able to extract the bit-sequence that gives the same parsing
  1231 unsimplified one. However, we noticed that constructing such a  $v'$
  1232 information as the unsimplified one.  After establishing this, we
  1232 from $v$ is not so straightforward. The point of this is that  we might
  1233 might be able to finally bridge the gap of proving
  1233 be able to finally bridge the gap by proving
  1234 \begin{center}
  1234 
  1235 $\textit{retrieve} \; r^\uparrow   \backslash  s \; v = \;\textit{retrieve} \;
  1235 \begin{center}
  1236 \textit{bsimp}(r^\uparrow)  \backslash  s \; v'$
  1236 $\textit{retrieve} \; (r^\uparrow   \backslash  s) \; v = \;\textit{retrieve} \;
  1237 \end{center}
  1237 \textit{simp}(r^\uparrow)  \backslash  s \; v'$
       
  1238 \end{center}
       
  1239 
       
  1240 \noindent
  1238 and subsequently
  1241 and subsequently
  1239 \begin{center}
  1242 
  1240 $\textit{retrieve} \; r^\uparrow \backslash  s \; v\; = \; \textit{retrieve} \;
  1243 \begin{center}
  1241 r^\uparrow  \backslash_{simp}   s \; v'$.
  1244 $\textit{retrieve} \; (r^\uparrow \backslash  s) \; v\; = \; \textit{retrieve} \;
  1242 \end{center}
  1245 (r^\uparrow  \backslash_{simp}  \, s) \; v'$.
  1243 The $\textit{LHS}$ of the above equation is the bitcode we want.
  1246 \end{center}
  1244 This proves that our simplified
  1247 
  1245 version of regular expression still contains all the bitcodes needed.
  1248 \noindent
  1246 The task here is to find a way to compute the correct $v'$.
  1249 The $\textit{LHS}$ of the above equation is the bitcode we want. This
       
  1250 would prove that our simplified version of regular expression still
       
  1251 contains all the bitcodes needed. The task here is to find a way to
       
  1252 compute the correct $v'$.
  1247 
  1253 
  1248 The second task is to speed up the more aggressive simplification.
  1254 The second task is to speed up the more aggressive simplification.
  1249 Currently it is slower than a naive simplification(the naive version as
  1255 Currently it is slower than the original naive simplification by Ausaf
  1250 implemented in ADU of course can explode in some cases). So it needs to
  1256 and Urban (the naive version as implemented by Ausaf   and Urban of
  1251 be explored how to make it faster. Our possibility would be to explore
  1257 course can ``explode'' in some cases). So it needs to be explored how to
  1252 again the connection to DFAs. This is very much work in progress.
  1258 make our algorithm faster on all inputs. Our possibility would be to
       
  1259 explore again the connection to DFAs. This is very much work in
       
  1260 progress.
  1253 
  1261 
  1254 \section{Conclusion}
  1262 \section{Conclusion}
  1255 
  1263 
  1256 In this PhD-project we are interested in fast algorithms for regular
  1264 In this PhD-project we are interested in fast algorithms for regular
  1257 expression matching. While this seems to be a ``settled'' area, in
  1265 expression matching. While this seems to be a ``settled'' area, in