cst_tests: comparison ninems/ninems.tex

equal deleted inserted replaced

-:d256aabe88f3
+:866eda9ba66a
 idea that only the first ``copy'' of a regular expression in an
 alternative contributes to the calculation of a POSIX value. All
 subsequent copies can be pruned from the regular expression.
 A psuedocode version of our algorithm is given below:\\
+\[simp r \dn r if r = ONE bs or CHAR bs c or STAR bs r
-simp r \defn r if r = ONE bs or CHAR bs c or STAR bs r
+simp SEQ bs r_1 r_2 \dn \\
-simp SEQ bs r_1 r_2 \defn \\
 case (simp(r_1), simp(r_2) ) of (0, _) => 0
 (_,0) => 0
 (1, r) => fuse bs r
 (r,1) => fuse bs r
 (r_1, r_2) => SEQ bs r_1 r_2
 simp ALT bs rs = distinct(flatten( map simp rs)) match
 case Nil => ZERO
 case r::Nil => fuse bs r
 case rs => ALT bs rs
+\]
 The simplification does a pattern matching on the regular expression. When it detected that
 the regular expression is an alternative or sequence, it will try to simplify its children regular expressions
 recursively and then see if one of the children turn into 0 or 1, which might trigger further simplification
 at the current level. The most involved part is the ALTS clause, where we use two auxiliary functions
 flatten and distinct to open up nested ALT and reduce as many duplicates as possible.
 Function distinct  keeps the first occurring copy only and remove all later ones when detected duplicates.
 Function flatten opens up nested ALT. Its recursive definition is given below:
 \\
-flatten ALT bs rs :: rss = (map fuse( bs,_)  rs )@ flatten rss
+\[flatten ALT bs rs :: rss = (map fuse( bs,_)  rs )@ flatten rss
 flatten ZERO :: rss = flatten rss
 flatten r::rss = r @ flatten rss if r is of any other shape
+\]
 Here flatten behaves like the traditional functional programming flatten function,
 what it does is basically removing parentheses like changing $a+(b+c)$ into $a+b+c$.
 If we apply simplification after each derivative step, we get an optimized version of the algorithm:
-\\TODO definition of blexer_simp
+\\TODO definition of $blexer_simp$
 This algorithm effectively keeps the regular expression size small, for example,
 with this simplification our previous $(a + aa)^*$ example's 8000 nodes will be reduced to only 6 and stay constant, however long the input string is.
 \\definition of retrieve\\
 This function assembled the bitcode that corresponds to a parse tree for how the current derivative mathces the suffix of the string(the characters that have not yet appeared, but is stored in the value).
 Sulzmann and Lu used this to connect the bit-coded algorithm to the older algorithm by the following equation:\\
 inj r c v = decode retrieve (bder c (internalise r)) v
 Fahad and Christian also used this fact to prove  the correctness of bit-coded algorithm without simplificaiton.
-Our purpose of using this, however, is try to establish
+Our purpose of using this, however, is try to establish \\
-retrieve r v = retrieve simp(r) v'.
+$ retrieve \; r \; v \;=\; retrieve  \; simp(r) \; v'.$\\
 Although the r and v are different now,  we can still extract the bitsequence that gies the same parsing information.
 After establishing this, we might be able to finally bridge the gap of proving
-retrieve r \backslash s v = retrieve simp(r) \backslash s v'
+$retrieve \; r  \; \backslash  \;s \; v = \;retrieve \; simp(r) \; \backslash \; s \; v'$
 and subsequently
-retrieve r \backslash s v = retrieve r \backslash{with simplification} s v'.
+$retrieve \; r \; \backslash \; s \; v\; = \; retrieve \; r \; \backslash{with simplification}  \; s \; v'$.
 This proves that our simplified version of regular expression still contains all the bitcodes neeeded.
 The second task is to speed up the more aggressive simplification. Currently it is slower than a naive simplifiction(the naive version as implemented in ADU of course can explode in some cases).
 So it needs to be explored how to make it faster. Our possibility would be to explore again the connection to DFAs. This is very much work in progress.

changeset 50	866eda9ba66a
parent 49	d256aabe88f3
child 51	5df7faf69238