# HG changeset patch # User Chengsong # Date 1562275689 -3600 # Node ID d2a7e87ea6e123045d4aa6be83a013abeb47185a # Parent 9b48724ec6095f9bd0ea8d80891c6d43c2aa36ea will not compile, just text diff -r 9b48724ec609 -r d2a7e87ea6e1 ninems/.DS_Store Binary file ninems/.DS_Store has changed diff -r 9b48724ec609 -r d2a7e87ea6e1 ninems/ninems.tex --- a/ninems/ninems.tex Thu Jul 04 11:04:02 2019 +0100 +++ b/ninems/ninems.tex Thu Jul 04 22:28:09 2019 +0100 @@ -735,38 +735,24 @@ setting). A psuedocode version of our algorithm is given below:\\ -\begin{algorithm} -\caption{simplification of annotated regular expression}\label{euclid} -\begin{algorithmic}[1] -\Procedure{$Simp$}{$areg$} -\Switch{$areg$} - \Case{$ALTS(bs, rs)$} - \For{\textit{rs[i] in array rs}} - \State $\textit{rs[i]} \gets$ \textit{Simp(rs[i])} - \EndFor - \For{\textit{rs[i] in array rs}} - \If{$rs[i] == ALTS(bs', rs')$} - \State $rs'' \gets$ attach bits $bs'$ to all elements in $rs'$ - \State Insert $rs''$ into $rs$ at position $i$ ($rs[i]$ is destroyed, replaced by its list of children regular expressions) - \EndIf - \EndFor - \State Remove all duplicates in $rs$, only keeping the first copy for multiple occurrences of the same regular expression - \State Remove all $0$s in $rs$ - \If{$ rs.length == 0$} \Return $ZERO$ \EndIf - \If {$ rs.length == 1$} \Return$ rs[0] $\EndIf - \EndCase - \Case{$SEQ(bs, r_1, r_2)$} - \If{$ r_1$ or $r_2$ is $ZERO$} \Return ZERO \EndIf - \State update $r_1$ and $r_2$ by attaching $bs$ to their front - \If {$r_1$ or $r_2$ is $ONE(bs')$} \Return $r_2$ or $r_1$ \EndIf - \EndCase - \Case{$Others$} - \Return $areg$ as it is - \EndCase -\EndSwitch -\EndProcedure -\end{algorithmic} -\end{algorithm} +simp r \defn r if r = ONE bs or CHAR bs c or STAR bs r +simp SEQ bs r_1 r_2 \defn \\ +case (simp(r_1), simp(r_2) ) of (0, _) => 0 +(_,0) => 0 +(1, r) => fuse bs r +(r,1) => fuse bs r +(r_1, r_2) => SEQ bs r_1 r_2 +simp ALT bs rs = distinct(flatten( map simp rs)) match +case Nil => ZERO +case r::Nil => fuse bs r +case rs => ALT bs rs + +The simplification does a pattern matching on the regular expression. When it detected that +the regular expression is an alternative or sequence, it will try to simplify its children regular expressions +recursively and then see if one of the children turn into 0 or 1, which might trigger further simplification + at the current level. The most involved part is the ALTS clause, where we use two auxiliary functions + flatten and distinct to open up nested ALT and reduce as many duplicates as possible. + With this simplification our previous $(a + aa)^*$ example's 8000 nodes will be reduced to only 6. Another modification is that we use simplification rules