diff -r e71a6e2aca2d -r 35df9cdd36ca ChengsongTanPhdThesis/Chapters/Bitcoded2.tex --- a/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex Mon Aug 15 17:26:08 2022 +0200 +++ b/ChengsongTanPhdThesis/Chapters/Bitcoded2.tex Wed Aug 17 01:09:13 2022 +0100 @@ -17,13 +17,82 @@ \section{Simplification for Annotated Regular Expressions} The first thing we notice in the fast growth of examples such as $(a^*a^*)^*$'s and $(a^* + (aa)^*)^*$'s derivatives is that a lot of duplicated sub-patterns -are scattered around different levels: - +are scattered around different levels, and therefore requires +de-duplication at different levels: \begin{center} - $(a^*a^*)^* \rightarrow (a^*a^* + a^*)\cdot(a^*a^*)^*$\\ + $(a^*a^*)^* \rightarrow (a^*a^* + a^*)\cdot(a^*a^*)^* \rightarrow $\\ $((a^*a^* + a^*) + a^*)\cdot(a^*a^*)^* + (a^*a^* + a^*)\cdot(a^*a^*)^*$ \end{center} \noindent +As we have already mentioned in \ref{eqn:growth2}, +a simple-minded simplification function cannot simplify +$((a^*a^* + a^*) + a^*)\cdot(a^*a^*)^* + (a^*a^* + a^*)\cdot(a^*a^*)^*$ +any further. +we would expect a better simplification function to work in the +following way: +\begin{gather*} + ((a^*a^* + \underbrace{a^*}_\text{A})+\underbrace{a^*}_\text{duplicate of A})\cdot(a^*a^*)^* + + \underbrace{(a^*a^* + a^*)\cdot(a^*a^*)^*}_\text{further simp removes this}.\\ + \bigg\downarrow \\ + (a^*a^* + a^* + \color{gray} + a^* \color{black})\cdot(a^*a^*)^* + + \underbrace{(a^*a^* + a^*)\cdot(a^*a^*)^*}_\text{further simp removes this} \\ + \bigg\downarrow \\ + (a^*a^* + a^* + )\cdot(a^*a^*)^* + \color{gray} + (a^*a^* + a^*) \cdot(a^*a^*)^* +\end{gather*} +\noindent +This motivating example came from testing Sulzmann and Lu's +algorithm: their simplification does +not work! +We quote their $\textit{simp}$ function verbatim here: +\begin{center} + \begin{tabular}{lcl} + $\simp\; _{bs}(_{bs'}\ONE \cdot r)$ & $\dn$ & + $\textit{if} \; (\textit{zeroable} \; r)\; \textit{then} \;\; \ZERO$\\ + & &$\textit{else}\;\; \fuse \; (bs@ bs') \; r$\\ + $\simp\;(_{bs}r_1\cdot r_2)$ & $\dn$ & $\textit{if} + \; (\textit{zeroable} \; r_1 \; \textit{or} \; \textit{zeroable}\; r_2)\; + \textit{then} \;\; \ZERO$\\ + & & $\textit{else}\;\;_{bs}((\simp\;r_1)\cdot + (\simp\; r_2))$\\ + $\simp \; _{bs}\sum []$ & $\dn$ & $\ZERO$\\ + $\simp \; _{bs}\sum ((_{bs'}\sum rs_1) :: rs_2)$ & $\dn$ & + $_{bs}\sum ((\map \; (\fuse \; bs')\; rs_1) @ rs_2)$\\ + $\simp \; _{bs}\sum[r]$ & $\dn$ & $\fuse \; bs \; (\simp \; r)$\\ + $\simp \; _{bs}\sum(r::rs)$ & $\dn$ & $_{bs}\sum + (\nub \; (\filter \; (\not \circ \zeroable)\;((\simp \; r) :: \map \; \simp \; rs)))$\\ + + \end{tabular} +\end{center} +\noindent +the $\textit{zeroable}$ predicate +which tests whether the regular expression +is equivalent to $\ZERO$, +is defined as: +\begin{center} + \begin{tabular}{lcl} + $\zeroable \; _{bs}\sum (r::rs)$ & $\dn$ & $\zeroable \; r\;\; \land \;\; + \zeroable \;_{[]}\sum\;rs $\\ + $\zeroable\;_{bs}(r_1 \cdot r_2)$ & $\dn$ & $\zeroable\; r_1 \;\; \lor \;\; \zeroable \; r_2$\\ + $\zeroable\;_{bs}r^*$ & $\dn$ & $\textit{false}$ \\ + $\zeroable\;_{bs}c$ & $\dn$ & $\textit{false}$\\ + $\zeroable\;_{bs}\ONE$ & $\dn$ & $\textit{false}$\\ + $\zeroable\;_{bs}\ZERO$ & $\dn$ & $\textit{true}$ + \end{tabular} +\end{center} +%\[ +%\begin{aligned} +%&\textit{isPhi} \left(b s @ r i^{*}\right)=\text { False } \\ +%&\textit{isPhi}\left(b s @ r i_{1} r i_{2}\right)=\textit{isPhi} r i_{1} \vee \textit{isPhi} r i_{2} \\ +%&\textit{isPhi} \left(b s @ r i_{1} \oplus r i_{2}\right)=\textit{isPhi} r i_{1} \wedge \textit{isPhi} r i_{2} \\ +%&\textit{isPhi}(b s @ l)= False \\ +%&\textit{isPhi}(b s @ \epsilon)= False \\ +%&\textit{isPhi} \; \ZERO = True +%\end{aligned} +%\] +\noindent Despite that we have already implemented the simple-minded simplifications such as throwing away useless $\ONE$s and $\ZERO$s. The simplification rule $r + r \rightarrow $ cannot make a difference either