cst_tests: comparison ninems/ninems.tex


-:9b48724ec609
+:d2a7e87ea6e1
 unnecessary ``copies'' of regular expressions (very similar to
 simplifying $r + r$ to just $r$, but in a more general
 setting).
 A pseudocode version of our algorithm is given below:\\
-\begin{algorithm}
+simp r \defn r if r = ONE bs or CHAR bs c or STAR bs r
-\caption{simplification of annotated regular expression}\label{euclid}
+simp SEQ bs r_1 r_2 \defn \\
-\begin{algorithmic}[1]
+case (simp(r_1), simp(r_2) ) of (0, _) => 0
-\Procedure{$Simp$}{$areg$}
+(_,0) => 0
-\Switch{$areg$}
+(1, r) => fuse bs r
-	\Case{$ALTS(bs, rs)$}
+(r,1) => fuse bs r
-		\For{\textit{rs[i] in array rs}}
+(r_1, r_2) => SEQ bs r_1 r_2
-			\State $\textit{rs[i]} \gets$ \textit{Simp(rs[i])}
+simp ALT bs rs \defn distinct(flatten(map simp rs)) match
-		\EndFor
+case Nil => ZERO
-		\For{\textit{rs[i] in array rs}}
+case r::Nil => fuse bs r
-			\If{$rs[i] == ALTS(bs', rs')$}
+case rs => ALT bs rs
-				\State $rs'' \gets$ attach bits $bs'$ to all elements in $rs'$
-				\State Insert $rs''$ into $rs$ at position $i$ ($rs[i]$ is destroyed, replaced by its list of children regular expressions)
+The simplification pattern-matches on the regular expression. When it detects that
-			\EndIf
+the regular expression is an alternative or a sequence, it will try to simplify its child regular expressions
-		\EndFor
+recursively and then check whether one of the children turns into 0 or 1, which might trigger further simplification
-		\State Remove all duplicates in $rs$, only keeping the first copy for multiple occurrences of the same regular expression
+at the current level. The most involved part is the ALT clause, where we use two auxiliary functions
-		\State Remove all $0$s in $rs$
+flatten and distinct to open up nested ALTs and remove as many duplicates as possible.
-		\If{$ rs.length == 0$} \Return $ZERO$ \EndIf
-		\If {$ rs.length == 1$} \Return$ rs[0] $\EndIf
-	\EndCase
-	\Case{$SEQ(bs, r_1, r_2)$}
-		\If{$ r_1$ or $r_2$ is $ZERO$} \Return ZERO \EndIf
-		\State update $r_1$ and $r_2$ by attaching $bs$ to their front
-		\If {$r_1$ or $r_2$ is $ONE(bs')$} \Return $r_2$ or $r_1$ \EndIf
-	\EndCase
-	\Case{$Others$}
-		\Return $areg$ as it is
-	\EndCase
-\EndSwitch
-\EndProcedure
-\end{algorithmic}
-\end{algorithm}
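To make the clauses of simp concrete, here is a minimal Python sketch of the simplification. It is an illustration under stated assumptions, not the implementation used for our measurements. The class names (Zero, One, Char, Alt, Seq, Star) and the representation of bit sequences as tuples are hypothetical. We further assume that fuse prepends bits to the annotation of a regular expression, that flatten drops 0s and opens up one level of nested ALT, and that distinct compares regular expressions structurally, bits included. The SEQ clause follows the pseudocode as written, so the bits attached to a ONE are not carried over.

```python
from dataclasses import dataclass, replace

# Hypothetical encoding of annotated regular expressions.
# Bit sequences are tuples of ints so that every node is hashable.

@dataclass(frozen=True)
class Zero:
    pass

@dataclass(frozen=True)
class One:
    bs: tuple = ()

@dataclass(frozen=True)
class Char:
    bs: tuple
    c: str

@dataclass(frozen=True)
class Alt:
    bs: tuple
    rs: tuple  # tuple of annotated regular expressions

@dataclass(frozen=True)
class Seq:
    bs: tuple
    r1: object
    r2: object

@dataclass(frozen=True)
class Star:
    bs: tuple
    r: object


def fuse(bs, r):
    """Prepend the bits bs to the annotation of r (assumed meaning of fuse)."""
    if isinstance(r, Zero):
        return r  # Zero carries no bits
    return replace(r, bs=bs + r.bs)


def flatten(rs):
    """Drop Zero and open up nested Alt nodes, pushing their bits to the children."""
    out = []
    for r in rs:
        if isinstance(r, Zero):
            continue
        if isinstance(r, Alt):
            out.extend(fuse(r.bs, child) for child in r.rs)
        else:
            out.append(r)
    return out


def distinct(rs):
    """Keep only the first copy of each regular expression (structural equality)."""
    seen, out = set(), []
    for r in rs:
        if r not in seen:
            seen.add(r)
            out.append(r)
    return out


def simp(r):
    if isinstance(r, Seq):
        s1, s2 = simp(r.r1), simp(r.r2)
        if isinstance(s1, Zero) or isinstance(s2, Zero):
            return Zero()
        if isinstance(s1, One):
            return fuse(r.bs, s2)  # bits of the One are dropped, as in the pseudocode
        if isinstance(s2, One):
            return fuse(r.bs, s1)
        return Seq(r.bs, s1, s2)
    if isinstance(r, Alt):
        rs = distinct(flatten([simp(x) for x in r.rs]))
        if not rs:
            return Zero()
        if len(rs) == 1:
            return fuse(r.bs, rs[0])
        return Alt(r.bs, tuple(rs))
    return r  # ZERO, ONE, CHAR and STAR are returned unchanged
```

For instance, with this encoding `simp(Alt((), (Char((), 'a'), Char((), 'a'))))` collapses the duplicated alternative to the single `Char((), 'a')`, which is the $r + r$ to $r$ simplification in its simplest form.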
 With this simplification our previous $(a + aa)^*$ example's 8000 nodes will be reduced to only 6.
 Another modification is that we use simplification rules
 inspired by Antimirov's work on partial derivatives. They maintain the
 idea that only the first ``copy'' of a regular expression in an

changeset 47	d2a7e87ea6e1
parent 46	9b48724ec609
child 48	bbefcf7351f2