924 during simplification in parallel with the regular expression that |
925 during simplification in parallel with the regular expression that |
925 has not been simplified in the subsequent derivative operations. To aid this, |
926 has not been simplified in the subsequent derivative operations. To aid this, |
926 we use the helper function retrieve described by Sulzmann and Lu: |
927 we use the helper function retrieve described by Sulzmann and Lu: |
927 \begin{center} |
928 \begin{center} |
928 \begin{tabular}{@{}l@{\hspace{2mm}}c@{\hspace{2mm}}l@{}} |
929 \begin{tabular}{@{}l@{\hspace{2mm}}c@{\hspace{2mm}}l@{}} |
929 $\textit{retrieve}\,(\textit{ONE}\,bs)\,\Empty$ & $\dn$ & $bs$\\ |
930 $\textit{retrieve}\,(_{bs}\ONE)\,\Empty$ & $\dn$ & $bs$\\ |
930 $\textit{retrieve}\,(\textit{CHAR}\,bs\,c)\,(\Char\,d)$ & $\dn$ & $bs$\\ |
931 $\textit{retrieve}\,(_{bs}{\bf c})\,(\Char\,d)$ & $\dn$ & $bs$\\ |
931 $\textit{retrieve}\,(\textit{ALTS}\,bs\,a::as)\,(\Left\,v)$ & $\dn$ & |
932 $\textit{retrieve}\,(\textit{ALTS}\,bs\,a::as)\,(\Left\,v)$ & $\dn$ & |
932 $bs \,@\, \textit{retrieve}\,a\,v$\\ |
933 $bs \,@\, \textit{retrieve}\,a\,v$\\ |
933 $\textit{retrieve}\,(\textit{ALTS}\,bs\,a::as)\,(\Right\,v)$ & $\dn$ & |
934 $\textit{retrieve}\,(\textit{ALTS}\,bs\,a::as)\,(\Right\,v)$ & $\dn$ & |
934 $bs \,@\, \textit{retrieve}\,(\textit{ALTS}\,bs\,as)\,v$\\ |
935 $bs \,@\, \textit{retrieve}\,(\textit{ALTS}\,bs\,as)\,v$\\ |
935 $\textit{retrieve}\,(\textit{SEQ}\,bs\,a_1\,a_2)\,(\Seq\,v_1\,v_2)$ & $\dn$ & |
936 $\textit{retrieve}\,(\textit{SEQ}\,bs\,a_1\,a_2)\,(\Seq\,v_1\,v_2)$ & $\dn$ & |
1212 from the above example which |
1213 from the above example which |
1213 trigger the difference between |
1214 trigger the difference between |
1214 $\rup\backslash_{simp} \, s$ |
1215 $\rup\backslash_{simp} \, s$ |
1215 and $\simp(\rup\backslash s)$: |
1216 and $\simp(\rup\backslash s)$: |
1216 \[ |
1217 \[ |
1217 D = (c_1c_2, \{r_1\cdot r_2 \mid \text{$ c_1\in L(r_1), r_1 \; not \; \nullable, c_2 \in L(r_2), |
1218 D = (c_1c_2, \{r_1\cdot r_2 \mid \text{$\simp(r_2) = r_2, \simp(r_1 \backslash c_1) = \ONE, r_1 \; not \; \nullable, c_2 \in L(r_2), |
1218 r_2 \backslash c_2$ is of shape alternative and $c_1c_2 \notin L(r_1)$}\}) |
1219 \simp(r_2 \backslash c_2)$ is of shape alternative and $c_1c_2 \notin L(r_1)$}\}) |
1219 \] |
1220 \] |
1220 and we can use this to construct a set of examples based |
1221 The culprit is the different order in which simplification is applied. |
1221 on this type of behaviour of two operations: |
1222 For $\rup \backslash_{simp} s$, |
1222 $r_1$ |
1223 \begin{center} |
|
1224 \begin{tabular}{lcl} |
|
1225 $(r_1\cdot r_2)\backslash_{simp} \, [c_1c_2]$ & $= \simp\left[ \big(\simp\left[ \left( r_1\cdot r_2 \right) \backslash c_1\right] \big)\backslash c_2\right]$\\ |
|
1226 & $= \simp\left[ \big(\simp \left[ \left(r_1 \backslash c_1\right) \cdot r_2 \right] \big) \backslash c_2 \right]$\\ |
|
1227 & $= \simp \left[ (\fuse \; \bmkeps(r_1\backslash c_1) \; \simp(r_2) ) \backslash c_2 \right]$, |
|
1228 \end{tabular} |
|
1229 \end{center} |
|
1230 \noindent |
|
1231 from the definition of $D$ we know $r_1 \backslash c_1$ is nullable, therefore |
|
1232 $\bmkeps(r_1\backslash c_1)$ returns a bitcode, we shall call it |
|
1233 $bs$. Also, $\simp(r_2 \backslash c_2)$ is a certain alternative, |
|
1234 we shall call it $_{bs_1}\oplus rs$, and $( \left(r_1 \backslash c_1\right) \backslash c_2 \cdot r_2)$ |
|
1235 simplifies to $\ZERO$, |
|
1236 so the above term can be rewritten as |
|
1237 \begin{center} |
|
1238 \begin{tabular}{lcl} |
|
1239 $\textit{distinct}(\flatten[\ZERO\;, \; _{\textit{bs++bs1}}\oplus rs] ) \; \textit{match} $ \\ |
|
1240 $\textit{case} \; [] \Rightarrow \ZERO$ \\ |
|
1241 $\textit{case} \; a :: [] \Rightarrow \textit{fuse bs a}$ \\ |
|
1242 $\textit{case} \; as' \Rightarrow \textit{ALTS}\;bs\;as'$\\ |
|
1243 \end{tabular} |
|
1244 \end{center} |
|
1245 this, in turn, can be rewritten as $map (\fuse \textit{bs}++\textit{bs1}) rs$ based on |
|
1246 the properties of the function $\simp$(more on this later). |
|
1247 simplification is done along the way, disallowing the structure of nested alternatives, |
|
1248 and this can actually be proven: simp(r) does not contain |
|
1249 nested alternatives. |
|
1250 \[ |
|
1251 \simp \left[(r_1\cdot r_2) \backslash [c_1c_2] \right]= \simp \left[ ((r_1 \cdot r_2 )\backslash c_1) \backslash c_2 \right] |
|
1252 \] |
|
1253 For $\simp(\rup \backslash s)$, |
|
1254 \begin{center} |
|
1255 \begin{tabular}{lcl} |
|
1256 $\simp\big[(r_1\cdot r_2)\backslash \, [c_1c_2]\big]$ & $= \simp\left[ \big( \left( r_1\cdot r_2 \right) \backslash c_1 \big)\backslash c_2\right]$\\ |
|
1257 & $= \simp\left[ \left(r_1 \backslash c_1\right) \cdot r_2 \big) \backslash c_2 \right]$\\ |
|
1258 & $= \simp \left[ \oplus[\big( \left(r_1 \backslash c_1\right) \backslash c_2 \big) \cdot r_2 \; , \; \fuse \; \bmkeps(r_1\backslash c_1) \; r_2 \backslash c_2 ] \right]$, |
|
1259 \end{tabular} |
|
1260 \end{center} |
|
1261 \noindent |
|
1262 as before, we call the bitcode returned by $\bmkeps(r_1\backslash c_1)$ |
|
1263 $bs$. Also, $\simp(r_2 \backslash c_2)$ is a certain alternative, |
|
1264 we shall call it $_{bs_1}\oplus rs$, and $( \left(r_1 \backslash c_1\right) \backslash c_2 \cdot r_2)$ |
|
1265 simplifies to $\ZERO$, |
|
1266 so the above term can be rewritten as |
|
1267 \begin{center} |
|
1268 \begin{tabular}{lcl} |
|
1269 $\textit{distinct}(\flatten[\ZERO\;, \; _{\textit{bs}++\textit{bs1}}\oplus rs] ) \; \textit{match} $ \\ |
|
1270 $\textit{case} \; [] \Rightarrow \ZERO$ \\ |
|
1271 $\textit{case} \; a :: [] \Rightarrow \textit{fuse bs a}$ \\ |
|
1272 $\textit{case} \; as' \Rightarrow \textit{ALTS}\;bs\;as'$\\ |
|
1273 \end{tabular} |
|
1274 \end{center} |
|
1275 |
|
1276 this, in turn, can be rewritten as $map (\fuse\; \textit{bs}++\textit{bs1}) rs$ based on |
|
1277 the properties of the function $\simp$(more on this later). |
|
1278 |
|
1279 simplification is done along the way, disallowing the structure of nested alternatives, |
|
1280 and this can actually be proven: simp(r) does not contain |
|
1281 nested alternatives. |
|
1282 |
|
1283 %CONSTRUCTION SITE HERE |
1223 that is to say, despite the bits are being moved around on the regular expression |
1284 that is to say, despite the bits are being moved around on the regular expression |
1224 (difference in bits), the structure of the (unannotated)regular expression |
1285 (difference in bits), the structure of the (unannotated)regular expression |
1225 after one simplification is exactly the same after the |
1286 after one simplification is exactly the same after the |
1226 same sequence of derivative operations |
1287 same sequence of derivative operations |
1227 regardless of whether we did simplification |
1288 regardless of whether we did simplification |
1228 along the way. |
1289 along the way. |
1229 One way would be to give a function that calls |
1290 One way would be to give a function that calls |
1230 %CONSTRUCTION SITE |
1291 |
1231 fuse is the culprit: it causes the order in which alternatives |
1292 fuse is the culprit: it causes the order in which alternatives |
1232 are opened up to be remembered and finally the difference |
1293 are opened up to be remembered and finally the difference |
1233 appear in $\simp(\rup \backslash s)$ and $\rup \backslash{simp} \,s$. |
1294 appear in $\simp(\rup \backslash s)$ and $\rup \backslash{simp} \,s$. |
1234 but we have to use them as they are essential in the simplification: |
1295 but we have to use them as they are essential in the simplification: |
1235 flatten needs them. |
1296 flatten needs them. |