cst_tests: changeset 142:f2aa71c76cba

--- a/etnms/etnms.tex	Mon Mar 02 21:45:24 2020 +0000
+++ b/etnms/etnms.tex	Mon Mar 02 22:13:37 2020 +0000
@@ -1736,22 +1736,36 @@
 	&& $\textit{else if} \; \nullable(r_1) \textit{and} \;\textit{not} \; \nullable(r_1 \backslash c_1)\; \textit{then} \;
 	((r_1\backslash c_1c_2)\cdot r_2 + r_2 \backslash c_1c_2)$\\
 	&& $\textit{else} \;(r_1\backslash c_1c_2) \cdot r_2$\\
-	& $=$ & $\sum\limits{s_i }{r_1 \backslash s_i \cdot r_2 \backslash s_j} \; \text{where} \; s_i \; \text{is} \; \text{true prefix}\;  \text{of} \; s \;\text{and} \; s_i @s_j = s \; \text{and} \;\nullable(r_1\backslash s_i)$
+	& $=$ & $(r_1\backslash s) \cdot r_2 + \sum\limits_{s_j=c_2, c_1c_2 }{r_2 \backslash s_j} \; \text{where} \; \nullable(r_1\backslash s_i) \; \text{and} \;s_i @s_j = [c_1c_2]$
+\end{tabular}
+\end{center}
+In a more general form, 
+\begin{center}
+\begin{tabular}{lcl}
+	$(r_1 \cdot r_2) \backslash s $ & $=$ & $(r_1\backslash s) \cdot r_2 + \sum\limits_{s_i }{r_2 \backslash s_j} \; \text{where} \; s_i \; \text{is} \; \text{true prefix}\;  \text{of} \; s \;\text{and} \; s_i @s_j = s \; \text{and} \;\nullable(r_1\backslash s_i)$
 \end{tabular}
 \end{center}
-In a more serious manner, we should write
-\begin{center}
-	$ (r_1 \cdot r_2) \backslash [c_1c_2] =  \sum{r_1 \backslash s_i \cdot r_2 \backslash s_j}$
-\end{center}
-Note this differentiates from the previous form in the sense that
-it calculates the results $r_1\backslash s_i , r_2 \backslash s_j$ first and then glue them together
-through nested alternatives whereas the $r_1 \cdot r_2 \backslash s$ procedure,
-used by algorithm $\lexer$, can only produce each component of the 
-resulting alternatives regular expression altogether rather than 
-generating each of the children nodes one by one
-n a single recursive call that is only for generating that
+We have formalized and proved the correctness of this 
+alternative definition of derivative and call it $\textit{ders2}$ to
+make a distinction of it with the $\textit{ders}$-function.
+Note this differentiates from the lexing algorithm  in the sense that
+it calculates the results $r_1\backslash s_i , r_2 \backslash s_j$ first 
+and then glue them together
+into nested alternatives whereas the $r_1 \cdot r_2 \backslash s$ procedure,
+used by algorithm $\lexer$, can only produce each element of the list
+in the resulting alternatives regular expression 
+altogether rather than 
+generating each of the children nodes 
+in a single recursive call that is only for generating that
 very expression itself.
-We have this 
+$\lexer$ does lexing in a "breadth first" manner whereas
+$\textit{ders2}$ does it in a "depth first" manner.
+Using this intuition we can also define the annotated regular expression version of 
+derivative and call it $\textit{bders2}$ and prove the equivalence with $\textit{bders}$.
+Our hope is to use this alternative definition as a guide
+for our induction.
+Using $\textit{bders2}$ we have a clearer idea
+of what $r\backslash s$ and $\simp(r\backslash s)$ looks like.
 \section{Conclusion}
 Under the exhaustive tests we believe the main
 result holds, yet a proof still seems elusive.
@@ -1765,181 +1779,6 @@
 regular expression that is simplified at the final moment.
 We are almost there, but a last step is needed to make the proof work.
 Hopefully in the next few weeks we will be able to find one.
-%CONSTRUCTION SITE HERE
-that is to say, despite the bits are being moved around on the regular expression
-(difference in bits), the structure of the (unannotated)regular expression
-after one simplification is exactly the same after the 
-same sequence of derivative operations 
-regardless of whether we did simplification
-along the way.
- One way would be to give a function that calls
-
-fuse is the culprit: it causes the order in which alternatives
-are opened up to be remembered and finally the difference
-appear in $\simp(\rup \backslash s)$ and $\rup \backslash{simp} \,s$.
-but we have to use them as they are essential in the simplification:
-flatten needs them.
-
-
-
-However, without erase the above equality does not hold:
-for the regular expression  
-$(a+b)(a+a*)$,
-if we do derivative with respect to string $aa$,
-we get 
-
-\subsection{Another Proof Strategy}
-sdddddr does not equal sdsdsdsr sometimes.\\
-For example,
-
-This equicalence class method might still have the potential of proving this,
-but not yet
-i parallelly tried another method of using retrieve\\
-
-
-The vsimp function, defined as follows
-tries to simplify the value in lockstep with 
-regular expression:\\
-
-
-The problem here is that 
-
-we used retrieve for the key induction:
-$decode (retrieve (r\backslash (s @ [c])) v) r $
-$decode (retrieve (r\backslash s) (inj (r\backslash s) c v)) r$
-Here, decode recovers a value that corresponds to a match(possibly partial)
-from bits, and the bits are extracted by retrieve,
-and the key value $v$ that guides retrieve is
-$mkeps r\backslash s$, $inj r c (mkeps r\backslash s)$, $inj (inj (v))$, ......
-if we can 
-the problem is that 
-need vsiimp to make a value that is suitable for decoding
-$Some(flex rid(s@[c])v) = Some(flex rids(inj (r\backslash s)cv))$
-another way that christian came up with that might circumvent the 
-prblem of finding suitable value is by not stating the visimp
-function but include all possible value in a set that a regex is able to produce,
-and proving that both r and sr are able to produce the bits that correspond the POSIX value
-
-produced by feeding the same initial regular expression $r$ and string $s$ to the
- two functions $ders$ and $ders\_simp$.
-The reason why
-Namely, if $bmkeps( r_1) = bmkeps(r_2)$, then we 
-
-
-If we define the equivalence relation $\sim_{m\epsilon}$ between two regular expressions
-$r_1$ and $r_2$as follows:
-$r_1 \sim_{m\epsilon} r_2  \iff bmkeps(r_1)= bmkeps(r_2)$
-(in other words, they $r1$ and $r2$ produce the same output under the function $bmkeps$.)
-Then the first goal 
-might be restated as 
-$(r^\uparrow)\backslash_{simp}\, s  \sim_{m\epsilon} (r^\uparrow)\backslash s$.
-I tried to establish an equivalence relation between the regular experssions 
-like dddr dddsr,.....
-but right now i am only able to establish dsr and dr, using structural induction on r.
-Those involve multiple derivative operations are harder to prove.
-Two attempts have been made:
-(1)induction on the number of der operations(or in other words, the length of the string s),
-the inductive hypothesis was initially specified as 
-"For an arbitrary regular expression r, 
-For all string s in the language of r whose length do not exceed 
-the number n, ders s r me derssimp s r"
-and the proof goal may be stated as
-"For an arbitrary regular expression r, 
-For all string s in the language of r whose length do not exceed 
-the number n+1, ders s r me derssimp s r"
-the problem here is that although we can easily break down
-a string s of length n+1 into s1@list(c), it is not that easy
-to use the i.h. as a stepping stone to prove anything because s1 may well be not
-in the language L(r). This inhibits us from obtaining the fact that
-ders s1 r me derssimps s1 r.
-Further exploration is needed to amend this hypothesis so it includes the
-situation when s1 is not nullable.
-For example, what information(bits? 
-values?) can be extracted
-from the regular expression ders(s1,r) so that we can compute or predict the possible 
-result of bmkeps after another derivative operation. What function f can used to 
-carry out the task? The possible way of exploration can be 
-more directly perceived throught the graph below:
-find a function
-f
-such that
-f(bders s1 r)
-= re1
-f(bderss s1 r)
-= re2
-bmkeps(bders s r) = g(re1,c)
-bmkeps(bderssimp s r) = g(re2,c)
-and g(re1,c) = g(re2,c)
-The inductive hypothesis would be
-"For all strings s1 of length <= n, 
-f(bders s1 r)
-= re1
-f(bderss s1 r)
-= re2"
-proving this would be a lemma for the main proof:
-the main proof would be 
-"
-bmkeps(bders s r) = g(re1,c)
-bmkeps(bderssimp s r) = g(re2,c)
-for s = s1@c
-"
-and f need to be a recursive property for the lemma to be proved:
-it needs to store not only the "after one char nullable info",
-but also the "after two char nullable info",
-and so on so that it is able to  predict what f will compute after a derivative operation,
-in other words, it needs to be "infinitely recursive"\\
-To prove the lemma, in other words, to get
-"For all strings s1 of length <= n+1, 
-f(bders s1 r)
-= re3
-f(bderss s1 r)
-= re4"\\
-from\\
-"For all strings s1 of length <= n, 
-f(bders s1 r)
-= re1
-f(bderss s1 r)
-= re2"\\
-it might be best to construct an auxiliary function h such that\\
-h(re1, c) = re3\\
-h(re2, c) = re4\\
-and re3 = f(bder c (bders s1 r))\\
-re4 = f(simp(bder c (bderss s1 r)))
-The key point here is that we are not satisfied with what bders s r will produce under
-bmkeps, but also how it will perform after a derivative operation and then bmkeps, and two 
-derivative operations and so on. In essence, we are preserving the regular expression 
-itself under the function f, in a less compact way than the regluar expression: we are
-not just recording but also interpreting what the regular expression matches.
-In other words, we need to prove the properties of bderss s r beyond the bmkeps result,
-i.e., not just the nullable ones, but also those containing remaining characters.\\
-(2)we observed the fact that 
-erase sdddddr= erase sdsdsdsr
-that is to say, despite the bits are being moved around on the regular expression
-(difference in bits), the structure of the (unannotated)regular expression
-after one simplification is exactly the same after the 
-same sequence of derivative operations 
-regardless of whether we did simplification
-along the way.
-However, without erase the above equality does not hold:
-for the regular expression  
-$(a+b)(a+a*)$,
-if we do derivative with respect to string $aa$,
-we get
-%TODO
-sdddddr does not equal sdsdsdsr sometimes.\\
-For example,
-
-This equicalence class method might still have the potential of proving this,
-but not yet
-i parallelly tried another method of using retrieve\\
-
-
-
-\noindent\rule[0.5ex]{\linewidth}{1pt}
-
-
-
-
 
 
 \bibliographystyle{plain}
author	Chengsong
	Mon, 02 Mar 2020 22:13:37 +0000
changeset 142	f2aa71c76cba
parent 141	c8af86f7558e
child 143	5e47080a7164