etnms/etnms.tex
changeset 111 af2c63f9bdf9
parent 110 a85c0f0fcf44
child 112 c19f2d20d92c
equal deleted inserted replaced
110:a85c0f0fcf44 111:af2c63f9bdf9
  1206 $\fuse$s the bit(s) $_0$ to the front of $_0\ONE $ and $_{11}a^*$.
  1206 $\fuse$s the bit(s) $_0$ to the front of $_0\ONE $ and $_{11}a^*$.
  1207 In a nutshell, the order of simplification causes
  1207 In a nutshell, the order of simplification causes
  1208 the bits to be moved differently.
  1208 the bits to be moved differently.
  1209  
  1209  
  1210  \subsubsection{A Failed Attempt To Remedy the Problem Above}
  1210  \subsubsection{A Failed Attempt To Remedy the Problem Above}
  1211 A simple class of regular expressions and strings 
  1211 A simple class of regular expression and string
  1212 pairs can be deduced 
  1212 pairs $(r, s)$ can be deduced from the above example 
  1213 from the above example which 
  1213 which trigger the difference between 
  1214 trigger the difference between 
       
  1215 $\rup\backslash_{simp} \, s$
  1214 $\rup\backslash_{simp} \, s$
  1216 and  $\simp(\rup\backslash s)$:
  1215 and  $\simp(\rup\backslash s)$:
  1217 \[
  1216 \begin{center}
  1218 D = (c_1c_2, \{r_1\cdot r_2 \mid \text{$\simp(r_2) = r_2, \simp(r_1 \backslash c_1) = \ONE, r_1 \; not \; \nullable, c_2 \in L(r_2),
  1217 \begin{tabular}{lcl}
  1219 \simp(r_2 \backslash c_2)$ is of shape alternative  and $c_1c_2 \notin L(r_1)$}\})
  1218 $D =\{ (r_1 \cdot r_2,\; [c_1c_2]) \mid $ & $\simp(r_2) = r_2, \simp(r_1 \backslash c_1) = \ONE,$\\
  1220 \]
  1219  $r_1 \; not \; \nullable, c_2 \in L(r_2),$ & $\exists \textit{rs},\textit{bs}.\;  r_2 \backslash c_2 = _{bs}{\oplus rs}$\\
  1221 The culprit is the different order in which simplification is applied.
  1220 $\exists \textit{rs}_1. \; \simp(r_2 \backslash c_2) = _{bs}{\oplus \textit{rs}_1}$ &  $and \;\simp(r_1 \backslash [c_1c_2]) = \ZERO\}$\\
  1222 For $\rup \backslash_{simp} s$,
  1221 \end{tabular}
       
  1222 \end{center}
       
  1223 We take a pair $(r, \;s)$ from the set $D$.
       
  1224 
       
  1225 Now we compute ${\bf \rup \backslash_{simp} s}$, we get:
  1223 \begin{center}
  1226 \begin{center}
  1224 \begin{tabular}{lcl}
  1227 \begin{tabular}{lcl}
  1225 $(r_1\cdot r_2)\backslash_{simp} \, [c_1c_2]$ & $= \simp\left[ \big(\simp\left[ \left( r_1\cdot r_2 \right) \backslash c_1\right] \big)\backslash c_2\right]$\\
  1228 $(r_1\cdot r_2)\backslash_{simp} \, [c_1c_2]$ & $= \simp\left[ \big(\simp\left[ \left( r_1\cdot r_2 \right) \backslash c_1\right] \big)\backslash c_2\right]$\\
  1226 								      & $= \simp\left[ \big(\simp \left[  \left(r_1 \backslash c_1\right) \cdot r_2 \right] \big) \backslash c_2 \right]$\\
  1229 								      & $= \simp\left[ \big(\simp \left[  \left(r_1 \backslash c_1\right) \cdot r_2 \right] \big) \backslash c_2 \right]$\\
  1227 								      & $= \simp \left[  (\fuse \; \bmkeps(r_1\backslash c_1) \; \simp(r_2) ) \backslash c_2 \right]$,
  1230 								      & $= \simp \left[  (\fuse \; \bmkeps(r_1\backslash c_1) \; \simp(r_2) ) \backslash c_2 \right]$,\\
       
  1231 								      & $= \simp \left[  (\fuse \; \bmkeps(r_1\backslash c_1) \; r_2 ) \backslash c_2 \right]$,
  1228 \end{tabular}
  1232 \end{tabular}
  1229 \end{center}
  1233 \end{center}
  1230 \noindent
  1234 \noindent
  1231 from the definition of $D$ we know $r_1 \backslash c_1$ is nullable, therefore
  1235 from the definition of $D$ we know $r_1 \backslash c_1$ is nullable, therefore
  1232 $\bmkeps(r_1\backslash c_1)$  returns a bitcode, we shall call it
  1236 $\bmkeps(r_1\backslash c_1)$  returns a bitcode, we shall call it
  1233  $bs$. Also, $\simp(r_2 \backslash c_2)$ is a certain alternative,
  1237  $\textit{bs}_2$. 
  1234  we shall call it $_{bs_1}\oplus rs$, and $( \left(r_1 \backslash c_1\right) \backslash c_2  \cdot r_2)$
  1238 The above term can be rewritten as
  1235  simplifies to $\ZERO$,
  1239 \begin{center}
  1236  so the above term can be rewritten as
  1240 $ \simp \left[  \fuse \; \textit{bs}_2\; r_2  \backslash c_2 \right]$,
  1237 \begin{center}
  1241 \end{center}
  1238 \begin{tabular}{lcl}
  1242 which is equal to 
  1239 $\textit{distinct}(\flatten[\ZERO\;, \; _{\textit{bs++bs1}}\oplus rs] ) \; \textit{match} $ \\
  1243 \begin{center}
       
  1244 $\simp \left[ \fuse \; \textit{bs}_2 \; _{bs}{\oplus rs} \right]$\\
       
  1245 $=\simp \left[  \; _{bs_2++bs}{\oplus rs} \right]$\\
       
  1246 $=  \; _{bs_2++bs}{\oplus \textit{rs}_1} $
       
  1247 \end{center}
       
  1248 \noindent
       
  1249 by using the properties from the set $D$ again
       
  1250 and again(The reason why we set so many conditions 
       
  1251 that the pair $(r,s)$ need to satisfy is because we can
       
  1252 rewrite them easily to construct the difference.)
       
  1253 
       
  1254 Now we compute ${\bf \simp(\rup \backslash s)}$:
       
  1255 \begin{center}
       
  1256 $\simp \big[(r_1\cdot r_2) \backslash [c_1c_2] \big]= \simp \left[ ((r_1 \cdot r_2 )\backslash c_1) \backslash c_2 \right]$
       
  1257 \end{center}
       
  1258 \noindent
       
  1259 Again, using the properties above, we obtain
       
  1260 the following chain of equalities:
       
  1261 \begin{center}
       
  1262 $\simp(\rup \backslash s)= \simp \left[ ((r_1 \cdot r_2 )\backslash c_1) \backslash c_2 \right]= \simp\left[    \left(r_1 \backslash c_1\right) \cdot r_2  \big) \backslash c_2 \right]$\\
       
  1263 $= \simp \left[ \oplus[\big( \left(r_1 \backslash c_1\right) \backslash c_2 \big) \cdot r_2 \; , \; \fuse \; (\bmkeps \;r_1\backslash c_1) \; r_2 \backslash c_2 ] \right]$,
       
  1264 \end{center}
       
  1265 \noindent
       
  1266 as before, we call the bitcode returned by 
       
  1267 $\bmkeps(r_1\backslash c_1)$ as
       
  1268 $\textit{bs}_2$. 
       
  1269 Also, $\simp(r_2 \backslash c_2)$ is 
       
  1270 $_{bs}\oplus \textit{rs}_1$, 
       
  1271 and $( \left(r_1 \backslash c_1\right) \backslash c_2  \cdot r_2)$
       
  1272 simplifies to $\ZERO$,
       
  1273 so the above term can be expanded as
       
  1274 \begin{center}
       
  1275 \begin{tabular}{l}
       
  1276 $\textit{distinct}(\flatten[\ZERO\;, \; _{\textit{bs}_2++\textit{bs}}\oplus \textit{rs}_1] ) \; \textit{match} $ \\
  1240   $\textit{case} \; [] \Rightarrow  \ZERO$ \\
  1277   $\textit{case} \; [] \Rightarrow  \ZERO$ \\
  1241    $\textit{case} \; a :: [] \Rightarrow  \textit{fuse bs a}$ \\
  1278    $\textit{case} \; a :: [] \Rightarrow  \textit{\fuse \; \textit{bs} a}$ \\
  1242     $\textit{case} \;  as' \Rightarrow  \textit{ALTS}\;bs\;as'$\\ 
  1279     $\textit{case} \;  as' \Rightarrow  _{[]}\oplus as'$\\ 
  1243 \end{tabular}
  1280 \end{tabular}
  1244 \end{center}
  1281 \end{center}
  1245 this, in turn, can be rewritten as $map (\fuse \textit{bs}++\textit{bs1}) rs$ based on 
  1282 \noindent
  1246 the properties of the function $\simp$(more on this later).
  1283 Applying the definition of $\flatten$, we get
  1247 simplification is done along the way, disallowing the structure of nested alternatives,
  1284 \begin{center}
  1248 and this can actually be proven: simp(r) does not contain
  1285 $_{[]}\oplus (\textit{map} \; \fuse (\textit{bs}_2 ++ bs) rs_1)$
  1249 nested alternatives.
  1286 \end{center}
  1250 \[
  1287 \noindent
  1251 \simp \left[(r_1\cdot r_2) \backslash [c_1c_2] \right]= \simp \left[ ((r_1 \cdot r_2 )\backslash c_1) \backslash c_2 \right]
  1288 compared to the result 
  1252 \]
  1289 \begin{center}
  1253 For $\simp(\rup \backslash s)$,
  1290 $ \; _{bs_2++bs}{\oplus \textit{rs}_1} $
  1254 \begin{center}
  1291 \end{center}
  1255 \begin{tabular}{lcl}
  1292 \noindent
  1256 $\simp\big[(r_1\cdot r_2)\backslash \, [c_1c_2]\big]$ & $= \simp\left[ \big( \left( r_1\cdot r_2 \right) \backslash c_1 \big)\backslash c_2\right]$\\
  1293 Note how these two regular expressions only
  1257 								      & $= \simp\left[    \left(r_1 \backslash c_1\right) \cdot r_2  \big) \backslash c_2 \right]$\\
  1294 differ in terms of the position of the bits 
  1258 								      & $= \simp \left[ \oplus[\big( \left(r_1 \backslash c_1\right) \backslash c_2 \big) \cdot r_2 \; , \; \fuse \; \bmkeps(r_1\backslash c_1) \; r_2 \backslash c_2 ] \right]$,
  1295 $\textit{bs}_2++\textit{bs}$. They are the same otherwise.
  1259 \end{tabular}
  1296 What caused this difference to happen?
  1260 \end{center}
  1297 The culprit is the $\flatten$ function, which spills
  1261 \noindent
  1298 out the bitcodes in the inner alternatives when 
  1262 as before, we call the bitcode returned by $\bmkeps(r_1\backslash c_1)$
  1299 there exists an outer alternative.
  1263  $bs$. Also, $\simp(r_2 \backslash c_2)$ is a certain alternative,
  1300 Note how the absence of simplification
  1264  we shall call it $_{bs_1}\oplus rs$, and $( \left(r_1 \backslash c_1\right) \backslash c_2  \cdot r_2)$
  1301 caused $\simp(\rup \backslash s)$ to
  1265  simplifies to $\ZERO$,
  1302 generate the nested alternatives structure:
  1266  so the above term can be rewritten as
  1303 \begin{center}
  1267 \begin{center}
  1304 $  \oplus[\ZERO \;, \; _{bs}\oplus \textit{rs} ]$
  1268 \begin{tabular}{lcl}
  1305 \end{center}
  1269 $\textit{distinct}(\flatten[\ZERO\;, \; _{\textit{bs}++\textit{bs1}}\oplus rs] ) \; \textit{match} $ \\
  1306 and this will always trigger the $\flatten$ to 
  1270   $\textit{case} \; [] \Rightarrow  \ZERO$ \\
  1307 spill out the sequence $\textit{bs}$,
  1271    $\textit{case} \; a :: [] \Rightarrow  \textit{fuse bs a}$ \\
  1308 whereas when
  1272     $\textit{case} \;  as' \Rightarrow  \textit{ALTS}\;bs\;as'$\\ 
  1309 simplification is done along the way, 
  1273 \end{tabular}
  1310 the structure of nested alternatives is never created(we can
  1274 \end{center}
  1311 actually prove that simplification function never allows nested
  1275 
  1312 alternatives to happen, more on this later).
  1276 this, in turn, can be rewritten as $map (\fuse\; \textit{bs}++\textit{bs1}) rs$ based on 
       
  1277 the properties of the function $\simp$(more on this later).
       
  1278 
       
  1279 simplification is done along the way, disallowing the structure of nested alternatives,
       
  1280 and this can actually be proven: simp(r) does not contain
       
  1281 nested alternatives.
       
  1282 
  1313 
  1283 %CONSTRUCTION SITE HERE
  1314 %CONSTRUCTION SITE HERE
  1284 that is to say, despite the bits are being moved around on the regular expression
  1315 that is to say, despite the bits are being moved around on the regular expression
  1285 (difference in bits), the structure of the (unannotated)regular expression
  1316 (difference in bits), the structure of the (unannotated)regular expression
  1286 after one simplification is exactly the same after the 
  1317 after one simplification is exactly the same after the