--- a/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex Mon Aug 15 17:26:08 2022 +0200
+++ b/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex Wed Aug 17 01:09:13 2022 +0100
@@ -17,13 +17,82 @@
\section{Simplification for Annotated Regular Expressions}
The first thing we notice in the fast growth of examples such as $(a^*a^*)^*$'s
and $(a^* + (aa)^*)^*$'s derivatives is that a lot of duplicated sub-patterns
-are scattered around different levels:
-
+are scattered around different levels, and therefore requires
+de-duplication at different levels:
\begin{center}
- $(a^*a^*)^* \rightarrow (a^*a^* + a^*)\cdot(a^*a^*)^*$\\
+ $(a^*a^*)^* \rightarrow (a^*a^* + a^*)\cdot(a^*a^*)^* \rightarrow $\\
$((a^*a^* + a^*) + a^*)\cdot(a^*a^*)^* + (a^*a^* + a^*)\cdot(a^*a^*)^*$
\end{center}
\noindent
+As we have already mentioned in \ref{eqn:growth2},
+a simple-minded simplification function cannot simplify
+$((a^*a^* + a^*) + a^*)\cdot(a^*a^*)^* + (a^*a^* + a^*)\cdot(a^*a^*)^*$
+any further.
+we would expect a better simplification function to work in the
+following way:
+\begin{gather*}
+ ((a^*a^* + \underbrace{a^*}_\text{A})+\underbrace{a^*}_\text{duplicate of A})\cdot(a^*a^*)^* +
+ \underbrace{(a^*a^* + a^*)\cdot(a^*a^*)^*}_\text{further simp removes this}.\\
+ \bigg\downarrow \\
+ (a^*a^* + a^*
+ \color{gray} + a^* \color{black})\cdot(a^*a^*)^* +
+ \underbrace{(a^*a^* + a^*)\cdot(a^*a^*)^*}_\text{further simp removes this} \\
+ \bigg\downarrow \\
+ (a^*a^* + a^*
+ )\cdot(a^*a^*)^*
+ \color{gray} + (a^*a^* + a^*) \cdot(a^*a^*)^*
+\end{gather*}
+\noindent
+This motivating example came from testing Sulzmann and Lu's
+algorithm: their simplification does
+not work!
+We quote their $\textit{simp}$ function verbatim here:
+\begin{center}
+ \begin{tabular}{lcl}
+ $\simp\; _{bs}(_{bs'}\ONE \cdot r)$ & $\dn$ &
+ $\textit{if} \; (\textit{zeroable} \; r)\; \textit{then} \;\; \ZERO$\\
+ & &$\textit{else}\;\; \fuse \; (bs@ bs') \; r$\\
+ $\simp\;(_{bs}r_1\cdot r_2)$ & $\dn$ & $\textit{if}
+ \; (\textit{zeroable} \; r_1 \; \textit{or} \; \textit{zeroable}\; r_2)\;
+ \textit{then} \;\; \ZERO$\\
+ & & $\textit{else}\;\;_{bs}((\simp\;r_1)\cdot
+ (\simp\; r_2))$\\
+ $\simp \; _{bs}\sum []$ & $\dn$ & $\ZERO$\\
+ $\simp \; _{bs}\sum ((_{bs'}\sum rs_1) :: rs_2)$ & $\dn$ &
+ $_{bs}\sum ((\map \; (\fuse \; bs')\; rs_1) @ rs_2)$\\
+ $\simp \; _{bs}\sum[r]$ & $\dn$ & $\fuse \; bs \; (\simp \; r)$\\
+ $\simp \; _{bs}\sum(r::rs)$ & $\dn$ & $_{bs}\sum
+ (\nub \; (\filter \; (\not \circ \zeroable)\;((\simp \; r) :: \map \; \simp \; rs)))$\\
+
+ \end{tabular}
+\end{center}
+\noindent
+the $\textit{zeroable}$ predicate
+which tests whether the regular expression
+is equivalent to $\ZERO$,
+is defined as:
+\begin{center}
+ \begin{tabular}{lcl}
+ $\zeroable \; _{bs}\sum (r::rs)$ & $\dn$ & $\zeroable \; r\;\; \land \;\;
+ \zeroable \;_{[]}\sum\;rs $\\
+ $\zeroable\;_{bs}(r_1 \cdot r_2)$ & $\dn$ & $\zeroable\; r_1 \;\; \lor \;\; \zeroable \; r_2$\\
+ $\zeroable\;_{bs}r^*$ & $\dn$ & $\textit{false}$ \\
+ $\zeroable\;_{bs}c$ & $\dn$ & $\textit{false}$\\
+ $\zeroable\;_{bs}\ONE$ & $\dn$ & $\textit{false}$\\
+ $\zeroable\;_{bs}\ZERO$ & $\dn$ & $\textit{true}$
+ \end{tabular}
+\end{center}
+%\[
+%\begin{aligned}
+%&\textit{isPhi} \left(b s @ r i^{*}\right)=\text { False } \\
+%&\textit{isPhi}\left(b s @ r i_{1} r i_{2}\right)=\textit{isPhi} r i_{1} \vee \textit{isPhi} r i_{2} \\
+%&\textit{isPhi} \left(b s @ r i_{1} \oplus r i_{2}\right)=\textit{isPhi} r i_{1} \wedge \textit{isPhi} r i_{2} \\
+%&\textit{isPhi}(b s @ l)= False \\
+%&\textit{isPhi}(b s @ \epsilon)= False \\
+%&\textit{isPhi} \; \ZERO = True
+%\end{aligned}
+%\]
+\noindent
Despite that we have already implemented the simple-minded simplifications
such as throwing away useless $\ONE$s and $\ZERO$s.
The simplification rule $r + r \rightarrow $ cannot make a difference either