--- a/etnms/etnms.tex Sat Feb 08 22:24:21 2020 +0000
+++ b/etnms/etnms.tex Sun Feb 16 22:28:26 2020 +0000
@@ -473,7 +473,7 @@
$\textit{a}$ & $::=$ & $\ZERO$\\
& $\mid$ & $_{bs}\ONE$\\
& $\mid$ & $_{bs}{\bf c}$\\
- & $\mid$ & $_{bs}\oplus\,as$\\
+ & $\mid$ & $_{bs}\sum\,as$\\
& $\mid$ & $_{bs}a_1\cdot a_2$\\
& $\mid$ & $_{bs}a^*$
\end{tabular}
@@ -483,7 +483,7 @@
\noindent
where $bs$ stands for bitcodes, $a$ for $\bold{a}$nnotated regular
expressions and $as$ for a list of annotated regular expressions.
-The alternative constructor($\oplus$) has been generalized to
+The alternative constructor($\sum$) has been generalized to
accept a list of annotated regular expressions rather than just 2.
We will show that these bitcodes encode information about
the (POSIX) value that should be generated by the Sulzmann and Lu
@@ -502,7 +502,7 @@
$(\ONE)^\uparrow$ & $\dn$ & $_{[]}\ONE$\\
$(c)^\uparrow$ & $\dn$ & $_{[]}{\bf c}$\\
$(r_1 + r_2)^\uparrow$ & $\dn$ &
- $_{[]}\oplus[\textit{fuse}\,[0]\,r_1^\uparrow,\,
+ $_{[]}\sum[\textit{fuse}\,[0]\,r_1^\uparrow,\,
\textit{fuse}\,[1]\,r_2^\uparrow]$\\
$(r_1\cdot r_2)^\uparrow$ & $\dn$ &
$_{[]}r_1^\uparrow \cdot r_2^\uparrow$\\
@@ -526,8 +526,8 @@
$_{bs @ bs'}\ONE$\\
$\textit{fuse}\;bs\;_{bs'}{\bf c}$ & $\dn$ &
$_{bs@bs'}{\bf c}$\\
- $\textit{fuse}\;bs\,_{bs'}\oplus\textit{as}$ & $\dn$ &
- $_{bs@bs'}\oplus\textit{as}$\\
+ $\textit{fuse}\;bs\,_{bs'}\sum\textit{as}$ & $\dn$ &
+ $_{bs@bs'}\sum\textit{as}$\\
$\textit{fuse}\;bs\; _{bs'}a_1\cdot a_2$ & $\dn$ &
$_{bs@bs'}a_1 \cdot a_2$\\
$\textit{fuse}\;bs\,_{bs'}a^*$ & $\dn$ &
@@ -595,12 +595,12 @@
$(_{bs}{\bf d})\,\backslash c$ & $\dn$ &
$\textit{if}\;c=d\; \;\textit{then}\;
_{bs}\ONE\;\textit{else}\;\ZERO$\\
- $(_{bs}\oplus \;\textit{as})\,\backslash c$ & $\dn$ &
- $_{bs}\oplus\;(\textit{as.map}(\backslash c))$\\
+ $(_{bs}\sum \;\textit{as})\,\backslash c$ & $\dn$ &
+ $_{bs}\sum\;(\textit{as.map}(\backslash c))$\\
$(_{bs}\;a_1\cdot a_2)\,\backslash c$ & $\dn$ &
$\textit{if}\;\textit{bnullable}\,a_1$\\
- & &$\textit{then}\;_{bs}\oplus\,[(_{[]}\,(a_1\,\backslash c)\cdot\,a_2),$\\
- & &$\phantom{\textit{then},\;_{bs}\oplus\,}(\textit{fuse}\,(\textit{bmkeps}\,a_1)\,(a_2\,\backslash c))]$\\
+ & &$\textit{then}\;_{bs}\sum\,[(_{[]}\,(a_1\,\backslash c)\cdot\,a_2),$\\
+ & &$\phantom{\textit{then},\;_{bs}\sum\,}(\textit{fuse}\,(\textit{bmkeps}\,a_1)\,(a_2\,\backslash c))]$\\
& &$\textit{else}\;_{bs}\,(a_1\,\backslash c)\cdot a_2$\\
$(_{bs}a^*)\,\backslash c$ & $\dn$ &
$_{bs}(\textit{fuse}\, [0] \; r\,\backslash c)\cdot
@@ -625,7 +625,7 @@
head of the regular expression $a_2 \backslash c$ by fusing to it. The
bitsequence $\textit{bs}$, which was initially attached to the
first element of the sequence $a_1 \cdot a_2$, has
-now been elevated to the top-level of $\oplus$, as this information will be
+now been elevated to the top-level of $\sum$, as this information will be
needed whichever way the sequence is matched---no matter whether $c$ belongs
to $a_1$ or $ a_2$. After building these derivatives and maintaining all
the lexing information, we complete the lexing by collecting the
@@ -637,10 +637,10 @@
\begin{center}
\begin{tabular}{lcl}
$\textit{bmkeps}\,(_{bs}\ONE)$ & $\dn$ & $bs$\\
- $\textit{bmkeps}\,(_{bs}\oplus a::\textit{as})$ & $\dn$ &
+ $\textit{bmkeps}\,(_{bs}\sum a::\textit{as})$ & $\dn$ &
$\textit{if}\;\textit{bnullable}\,a$\\
& &$\textit{then}\;bs\,@\,\textit{bmkeps}\,a$\\
- & &$\textit{else}\;bs\,@\,\textit{bmkeps}\,(_{bs}\oplus \textit{as})$\\
+ & &$\textit{else}\;bs\,@\,\textit{bmkeps}\,(_{bs}\sum \textit{as})$\\
$\textit{bmkeps}\,(_{bs} a_1 \cdot a_2)$ & $\dn$ &
$bs \,@\,\textit{bmkeps}\,a_1\,@\, \textit{bmkeps}\,a_2$\\
$\textit{bmkeps}\,(_{bs}a^*)$ & $\dn$ &
@@ -684,7 +684,7 @@
expressions and strings.
One modification we introduced is to allow a list of annotated regular
-expressions in the $\oplus$ constructor. This allows us to not just
+expressions in the $\sum$ constructor. This allows us to not just
delete unnecessary $\ZERO$s and $\ONE$s from regular expressions, but
also unnecessary ``copies'' of regular expressions (very similar to
simplifying $r + r$ to just $r$, but in a more general setting). Another
@@ -707,10 +707,10 @@
&&$\quad\textit{case} \; (a_1', \ONE) \Rightarrow \textit{fuse} \; bs \; a_1'$ \\
&&$\quad\textit{case} \; (a_1', a_2') \Rightarrow _{bs}a_1' \cdot a_2'$ \\
- $\textit{simp} \; (_{bs}\oplus \textit{as})$ & $\dn$ & $\textit{distinct}( \textit{flatten} ( \textit{map simp as})) \; \textit{match} $ \\
+ $\textit{simp} \; (_{bs}\sum \textit{as})$ & $\dn$ & $\textit{distinct}( \textit{flatten} ( \textit{map simp as})) \; \textit{match} $ \\
&&$\quad\textit{case} \; [] \Rightarrow \ZERO$ \\
&&$\quad\textit{case} \; a :: [] \Rightarrow \textit{fuse bs a}$ \\
- &&$\quad\textit{case} \; as' \Rightarrow _{bs}\oplus \textit{as'}$\\
+ &&$\quad\textit{case} \; as' \Rightarrow _{bs}\sum \textit{as'}$\\
$\textit{simp} \; a$ & $\dn$ & $\textit{a} \qquad \textit{otherwise}$
\end{tabular}
@@ -722,16 +722,16 @@
sequence, it will try to simplify its children regular expressions
recursively and then see if one of the children turn into $\ZERO$ or
$\ONE$, which might trigger further simplification at the current level.
-The most involved part is the $\oplus$ clause, where we use two
+The most involved part is the $\sum$ clause, where we use two
auxiliary functions $\textit{flatten}$ and $\textit{distinct}$ to open up nested
alternatives and reduce as many duplicates as possible. Function
$\textit{distinct}$ keeps the first occurring copy only and remove all later ones
-when detected duplicates. Function $\textit{flatten}$ opens up nested $\oplus$s.
+when detected duplicates. Function $\textit{flatten}$ opens up nested $\sum$s.
Its recursive definition is given below:
\begin{center}
\begin{tabular}{@{}lcl@{}}
- $\textit{flatten} \; (_{bs}\oplus \textit{as}) :: \textit{as'}$ & $\dn$ & $(\textit{map} \;
+ $\textit{flatten} \; (_{bs}\sum \textit{as}) :: \textit{as'}$ & $\dn$ & $(\textit{map} \;
(\textit{fuse}\;bs)\; \textit{as}) \; @ \; \textit{flatten} \; as' $ \\
$\textit{flatten} \; \ZERO :: as'$ & $\dn$ & $ \textit{flatten} \; \textit{as'} $ \\
$\textit{flatten} \; a :: as'$ & $\dn$ & $a :: \textit{flatten} \; \textit{as'}$ \quad(otherwise)
@@ -961,10 +961,10 @@
\begin{tabular}{@{}l@{\hspace{2mm}}c@{\hspace{2mm}}l@{}}
$\textit{retrieve}\,(_{bs}\ONE)\,\Empty$ & $\dn$ & $bs$\\
$\textit{retrieve}\,(_{bs}{\bf c})\,(\Char\,d)$ & $\dn$ & $bs$\\
- $\textit{retrieve}\,(_{bs}\oplus a::as)\,(\Left\,v)$ & $\dn$ &
+ $\textit{retrieve}\,(_{bs}\sum a::as)\,(\Left\,v)$ & $\dn$ &
$bs \,@\, \textit{retrieve}\,a\,v$\\
- $\textit{retrieve}\,(_{bs}\oplus a::as)\,(\Right\,v)$ & $\dn$ &
- $\textit{bs} \,@\, \textit{retrieve}\,(_{[]}\oplus as)\,v$\\
+ $\textit{retrieve}\,(_{bs}\sum a::as)\,(\Right\,v)$ & $\dn$ &
+ $\textit{bs} \,@\, \textit{retrieve}\,(_{[]}\sum as)\,v$\\
$\textit{retrieve}\,(_{bs}a_1\cdot a_2)\,(\Seq\,v_1\,v_2)$ & $\dn$ &
$bs \,@\,\textit{retrieve}\,a_1\,v_1\,@\, \textit{retrieve}\,a_2\,v_2$\\
$\textit{retrieve}\,(_{bs}a^*)\,(\Stars\,[])$ & $\dn$ &
@@ -1165,19 +1165,19 @@
\noindent
-and the difinition of $\flatten$:
+and the definition of $\flatten$:
\begin{center}
\begin{tabular}{c c c}
$\flatten \; []$ & $\dn$ & $[]$\\
$\flatten \; \ZERO::rs$ & $\dn$ & $rs$\\
- $\flatten \;(_{\textit{bs}_1}\oplus \textit{rs}_1 ::rs)$ & $\dn$ & $(\map \, (\fuse \, \textit{bs}_1) \,\textit{rs}_1) ::: \flatten(rs)$\\
+ $\flatten \;(_{\textit{bs}_1}\sum \textit{rs}_1 ::rs)$ & $\dn$ & $(\map \, (\fuse \, \textit{bs}_1) \,\textit{rs}_1) ::: \flatten(rs)$\\
$\flatten \; r :: rs$ & $\dn$ & $r::\flatten(rs)$
\end{tabular}
\end{center}
\noindent
If we call $\simp$ on $\rup\backslash a$, just as $\backslash_{simp}$
-requires, then we would go throught the third clause of
+requires, then we would go through the third clause of
the sequence case:$\quad\textit{case} \; (\ONE, a_2') \Rightarrow \textit{fuse} \; bs \; a_2'$.
The $\ZERO$ of $(_0\ONE + \ZERO)$ is thrown away
by $\flatten$ and
@@ -1190,7 +1190,11 @@
_0(_0 \ONE + _1(_1\ONE \cdot a^*))
$, and this simplifies to $_0(_0 \ONE + _{11} a^*)$
by the third clause of the alternative case:
- $\quad\textit{case} \; as' \Rightarrow \textit{ALTS}\;bs\;as'$.
+ \begin{center}
+ $\quad\textit{case} \; as' \Rightarrow _{bs}\sum{as'}$.
+ \end{center}
+
+ \noindent
The outmost bit $_0$ stays with
the outmost regular expression, rather than being fused to
its child regular expressions, as what we will later see happens
@@ -1240,8 +1244,8 @@
\begin{center}
\begin{tabular}{lcl}
$D =\{ (r_1 \cdot r_2,\; [c_1c_2]) \mid $ & $\simp(r_2) = r_2, \simp(r_1 \backslash c_1) = \ONE,$\\
- $r_1 \; not \; \nullable, c_2 \in L(r_2),$ & $\exists \textit{rs},\textit{bs}.\; r_2 \backslash c_2 = _{bs}{\oplus rs}$\\
-$\exists \textit{rs}_1. \; \simp(r_2 \backslash c_2) = _{bs}{\oplus \textit{rs}_1}$ & $and \;\simp(r_1 \backslash [c_1c_2]) = \ZERO\}$\\
+ $r_1 \; not \; \nullable, c_2 \in L(r_2),$ & $\exists \textit{rs},\textit{bs}.\; r_2 \backslash c_2 = _{bs}{\sum rs}$\\
+$\exists \textit{rs}_1. \; \simp(r_2 \backslash c_2) = _{bs}{\sum \textit{rs}_1}$ & $and \;\simp(r_1 \backslash [c_1c_2]) = \ZERO\}$\\
\end{tabular}
\end{center}
We take a pair $(r, \;s)$ from the set $D$.
@@ -1265,9 +1269,9 @@
\end{center}
which is equal to
\begin{center}
-$\simp \left[ \fuse \; \textit{bs}_2 \; _{bs}{\oplus rs} \right]$\\
-$=\simp \left[ \; _{bs_2++bs}{\oplus rs} \right]$\\
-$= \; _{bs_2++bs}{\oplus \textit{rs}_1} $
+$\simp \left[ \fuse \; \textit{bs}_2 \; _{bs}{\sum rs} \right]$\\
+$=\simp \left[ \; _{bs_2++bs}{\sum rs} \right]$\\
+$= \; _{bs_2++bs}{\sum \textit{rs}_1} $
\end{center}
\noindent
by using the properties from the set $D$ again
@@ -1284,34 +1288,34 @@
the following chain of equalities:
\begin{center}
$\simp(\rup \backslash s)= \simp \left[ ((r_1 \cdot r_2 )\backslash c_1) \backslash c_2 \right]= \simp\left[ \left(r_1 \backslash c_1\right) \cdot r_2 \big) \backslash c_2 \right]$\\
-$= \simp \left[ \oplus[\big( \left(r_1 \backslash c_1\right) \backslash c_2 \big) \cdot r_2 \; , \; \fuse \; (\bmkeps \;r_1\backslash c_1) \; r_2 \backslash c_2 ] \right]$,
+$= \simp \left[ \sum[\big( \left(r_1 \backslash c_1\right) \backslash c_2 \big) \cdot r_2 \; , \; \fuse \; (\bmkeps \;r_1\backslash c_1) \; r_2 \backslash c_2 ] \right]$,
\end{center}
\noindent
as before, we call the bitcode returned by
$\bmkeps(r_1\backslash c_1)$ as
$\textit{bs}_2$.
Also, $\simp(r_2 \backslash c_2)$ is
-$_{bs}\oplus \textit{rs}_1$,
+$_{bs}\sum \textit{rs}_1$,
and $( \left(r_1 \backslash c_1\right) \backslash c_2 \cdot r_2)$
simplifies to $\ZERO$,
so the above term can be expanded as
\begin{center}
\begin{tabular}{l}
-$\textit{distinct}(\flatten[\ZERO\;, \; _{\textit{bs}_2++\textit{bs}}\oplus \textit{rs}_1] ) \; \textit{match} $ \\
+$\textit{distinct}(\flatten[\ZERO\;, \; _{\textit{bs}_2++\textit{bs}}\sum \textit{rs}_1] ) \; \textit{match} $ \\
$\textit{case} \; [] \Rightarrow \ZERO$ \\
$\textit{case} \; a :: [] \Rightarrow \textit{\fuse \; \textit{bs} a}$ \\
- $\textit{case} \; as' \Rightarrow _{[]}\oplus as'$\\
+ $\textit{case} \; as' \Rightarrow _{[]}\sum as'$\\
\end{tabular}
\end{center}
\noindent
Applying the definition of $\flatten$, we get
\begin{center}
-$_{[]}\oplus (\textit{map} \; \fuse (\textit{bs}_2 ++ bs) rs_1)$
+$_{[]}\sum (\textit{map} \; \fuse (\textit{bs}_2 ++ bs) rs_1)$
\end{center}
\noindent
compared to the result
\begin{center}
-$ \; _{bs_2++bs}{\oplus \textit{rs}_1} $
+$ \; _{bs_2++bs}{\sum \textit{rs}_1} $
\end{center}
\noindent
Note how these two regular expressions only
@@ -1325,7 +1329,7 @@
caused $\simp(\rup \backslash s)$ to
generate the nested alternatives structure:
\begin{center}
-$ \oplus[\ZERO \;, \; _{bs}\oplus \textit{rs} ]$
+$ \sum[\ZERO \;, \; _{bs}\sum \textit{rs} ]$
\end{center}
and this will always trigger the $\flatten$ to
spill out the inner alternative's bitcode $\textit{bs}$,
@@ -1340,9 +1344,9 @@
Like, for the above regular expression, we might
just delete the outer layer of alternative
\begin{center}
-\st{$ {\oplus[\ZERO \;,}$} $_{bs}\oplus \textit{rs}$ \st{$]$}
+\st{$ {\sum[\ZERO \;,}$} $_{bs}\sum \textit{rs}$ \st{$]$}
\end{center}
-and get $_{bs}\oplus \textit{rs}$ instead, without
+and get $_{bs}\sum \textit{rs}$ instead, without
fusing the bits $\textit{bs}$ inside to every element
of $\textit{rs}$.
This idea can be realized by making the following
@@ -1352,13 +1356,13 @@
$\textit{simp}' \; (_{\textit{bs}}(a_1 \cdot a_2))$ & $\dn$ & $\textit{as} \; \simp \; \textit{was} \; \textit{before} $ \\
- $\textit{simp}' \; (_{bs}\oplus as)$ & $\dn$ & \st{$\textit{distinct}( \textit{flatten} ( \textit{map simp as})) \; \textit{match} $} \\
+ $\textit{simp}' \; (_{bs}\sum as)$ & $\dn$ & \st{$\textit{distinct}( \textit{flatten} ( \textit{map simp as})) \; \textit{match} $} \\
&&\st{$\quad\textit{case} \; [] \Rightarrow \ZERO$} \\
&&\st{$\quad\textit{case} \; a :: [] \Rightarrow \textit{fuse bs a}$} \\
&&\st{$\quad\textit{case} \; as' \Rightarrow \textit{ALTS}\;bs\;as'$}\\
&&$\textit{if}(\textit{hollowAlternatives}( \textit{map \; simp \; as}))$\\
&&$\textit{then} \; \fuse \; \textit{bs}\; \textit{extractAlt}(\textit{map} \; \simp \; \textit{as} )$\\
- &&$\textit{else} \; \simp(_{bs} \oplus \textit{as})$\\
+ &&$\textit{else} \; \simp(_{bs} \sum \textit{as})$\\
$\textit{simp}' \; a$ & $\dn$ & $\textit{a} \qquad \textit{otherwise}$
@@ -1369,23 +1373,23 @@
given the definition of $\textit{hollowAlternatives}$ and $\textit{extractAlt}$ :
\begin{center}
$\textit{hollowAlternatives}( \textit{rs}) \dn
-\exists r = (_{\textit{bs}_1}\oplus \textit{rs}_1) \in \textit{rs}. \forall r' \in \textit{rs}, \;
+\exists r = (_{\textit{bs}_1}\sum \textit{rs}_1) \in \textit{rs}. \forall r' \in \textit{rs}, \;
\textit{either} \; r' = \ZERO \; \textit{or} \; r' = r $
$\textit{extractAlt}( \textit{rs}) \dn \textit{if}\big(
-\exists r = (_{\textit{bs}_1}\oplus \textit{rs}_1) \in \textit{rs}. \forall r' \in \textit{rs}, \;
+\exists r = (_{\textit{bs}_1}\sum \textit{rs}_1) \in \textit{rs}. \forall r' \in \textit{rs}, \;
\textit{either} \; r' = \ZERO \; \textit{or} \; r' = r \big)\; \textit{then} \; \textit{return} \; r$
\end{center}
\noindent
Basically, $\textit{hollowAlternatives}$ captures the feature of
a list of regular expression of the shape
\begin{center}
-$ \oplus[\ZERO \;, \; _{bs}\oplus \textit{rs} ]$
+$ \sum[\ZERO \;, \; _{bs}\sum \textit{rs} ]$
\end{center}
and this means we can simply elevate the
-inner regular expression $_{bs}\oplus \textit{rs}$
+inner regular expression $_{bs}\sum \textit{rs}$
to the outmost
and throw away the useless $\ZERO$s and
-the outer $\oplus$ wrapper.
+the outer $\sum$ wrapper.
Using this new definition of simp,
under the example where $r$ is the regular expression
$(a+b)(a+a*)$ and $s$ is the string $aa$
@@ -1460,9 +1464,9 @@
$\textit{sub}(r_1 + r_2) \dn \textit{sub}(r_1) \cup \textit{sub}(r_2) \cup \{r_1+r_2\}$\\
\end{center}
$\textit{good}(r) \dn r \neq \ZERO \land \\
-\forall r' \in \textit{sub}(r), \textit{if} \; r' = _{bs_1}\oplus(rs_1), \;
+\forall r' \in \textit{sub}(r), \textit{if} \; r' = _{bs_1}\sum(rs_1), \;
\textit{then} \nexists r'' \in \textit{rs}_1 \; s.t.\;
-r'' = _{bs_2}\oplus \textit{rs}_2 $
+r'' = _{bs_2}\sum \textit{rs}_2 $
The properties are mainly the ones below:
\begin{itemize}
@@ -1507,12 +1511,12 @@
$\textit{bs} @ \textit{bs}_1 @ \textit{bs}_2$\\
$\textit{if} \; r \gg \textit{bs}_1$ & & $\textit{then}$ &
- $_{bs}{\oplus(r :: \textit{rs}})$ &
+ $_{bs}{\sum(r :: \textit{rs}})$ &
$\gg$ &
$\textit{bs} @ \textit{bs}_1 $\\
- $\textit{if} \; _{bs}(\oplus \textit{rs}) \gg \textit{bs} @ \textit{bs}_1$ & & $\textit{then}$ &
- $_{bs}{\oplus(r :: \textit{rs}})$ &
+ $\textit{if} \; _{bs}(\sum \textit{rs}) \gg \textit{bs} @ \textit{bs}_1$ & & $\textit{then}$ &
+ $_{bs}{\sum(r :: \textit{rs}})$ &
$\gg$ &
$\textit{bs} @ \textit{bs}_1 $\\