--- a/etnms/etnms.tex Tue Mar 03 21:56:09 2020 +0000
+++ b/etnms/etnms.tex Wed Mar 04 08:25:38 2020 +0000
@@ -795,10 +795,10 @@
\noindent
whereby $\blexers$ simplifies (makes derivatives smaller) in each
-step, whereas with $\blexer$ the size can grow exponentially. This
+step. This
would be an important milestone for my thesis, because we already
have a very good idea how to establish that our set our simplification
-rules keeps the size of derivativs below a relatively tight bound.
+rules keep the size of derivatives below a relatively tight bound.
In order to prove the main theorem in \eqref{mainthm}, we need to prove that the
two functions produce the same output. The definition of these two functions
@@ -889,7 +889,7 @@
\noindent
which established that the bit-sequence algorithm produces the same
result as the original algorithm, which does not use
-bit-sequence.
+bit-sequences.
The proof uses two ``tricks''. One is that it uses a \flex-function
\begin{center}
@@ -1017,28 +1017,28 @@
\end{center}
The idea is that using $v'$, a simplified version of $v$ that had gone
through the same simplification step as $\textit{simp}(a)$, we are able
-to extract the bitcode that gives the same parsing information as the
+to extract the bitcode that gives the same lexing information as the
unsimplified one.
If we want to use a similar technique as
that of the existing proof,
we face the problem that in the above
equalities,
$\retrieve \; a \; v$ is not always defined.
-for example,
+For example,
$\retrieve \; _0(_1a+_0a) \; \Left(\Empty)$
is defined, but not $\retrieve \; (_{01}a) \;\Left(\Empty)$,
though we can extract the same POSIX
bits from the two annotated regular expressions.
The latter might occur when we try to retrieve from
a simplified regular expression using the same value
-as the unsimplified one.
+underlying the unsimplified one.
This is because $\Left(\Empty)$ corresponds to
the regular expression structure $\ONE+r_2$ instead of
$\ONE$.
That means, if we
want to prove that
\begin{center}
-$\textit{decode} \; \bmkeps \; \rup\backslash s \; r = \textit{decode} \; \bmkeps \; \rup\backslash_{simp} s \; r$
+$\bmkeps \; \rup\backslash s = \bmkeps \; \rup\backslash_{simp} s$
\end{center}
\noindent
holds by using $\retrieve$,
@@ -1051,7 +1051,7 @@
$f$ rectifies $r\backslash s$ so the value $\mkeps(f(r\backslash s))$ becomes
something simpler
to make the retrieve function defined.\\
-\subsubsection{Ways to Rectify Value}
+\subsubsection{Ways to Rectify Values}
One way to do this is to prove the following:
\begin{center}
$\retrieve \; \rup\backslash_{simp} s \; \mkeps(\simp(r\backslash s))=\textit{retrieve} \; \rup\backslash s \; \mkeps(r\backslash s)$
@@ -1083,11 +1083,6 @@
\begin{center}
$\simp(\rup\backslash s)$ is equal to $(_{00}\ONE +_{011}a^*)$
\end{center}
-\noindent
-(For the sake of visual simplicity, we use numbers to denote the bits
-in bitcodes as we have previously defined for annotated
-regular expressions. $\S$ is replaced by
-subscript $_1$ and $\Z$ by $_0$.)
What makes the difference?
@@ -1177,7 +1172,7 @@
\noindent
The outmost bit $_0$ stays with
-the outmost regular expression, rather than being fused to
+the outermost regular expression, rather than being fused to
its child regular expressions, as what we will later see happens
to $\simp(\rup\backslash \, s)$.
If we choose to not simplify in the midst of derivative operations,
@@ -1194,7 +1189,7 @@
& &$(\textit{fuse}\,(\textit{bmkeps}\,a_1)\,(a_2\,\backslash c))]$\\
\end{tabular}
\end{center}
-
+\noindent
because
$\rup\backslash a = (_0\ONE + \ZERO)(_0a + _1a^*)$
is a sequence
@@ -1231,7 +1226,7 @@
\end{center}
We take a pair $(r, \;s)$ from the set $D$.
-Now we compute ${\bf \rup \backslash_{simp} s}$, we get:
+Now we compute $ \rup \backslash_{simp} s$, we get:
\begin{center}
\begin{tabular}{lcl}
$(r_1\cdot r_2)\backslash_{simp} \, [c_1c_2]$ & $= \simp\left[ \big(\simp\left[ \left( r_1\cdot r_2 \right) \backslash c_1\right] \big)\backslash c_2\right]$\\
@@ -1437,7 +1432,7 @@
-We have changed the algorithm to suppress the old
+We have changed the algorithm to avoid the old
counterexample, but this gives rise to new counterexamples.
This dilemma causes this amendment not a successful
attempt to make $\rup\backslash_{simp} \, s = \simp(\rup\backslash s)$
@@ -1472,10 +1467,10 @@
\end{center}
\item
\begin{center}
-$\textit{if} r = \simp(r') \textit{then} \; \textit{good}(r) $
+$\textit{if}\; r = \simp(r') \textit{then} \; \textit{good}(r) $
\end{center}
\end{itemize}
-\subsection{the Contains relation}
+\subsection{The Contains relation}
$\retrieve$ is a too strong relation in that
it only extracts one bitcode instead of a set of them.
Therefore we try to define another relation(predicate)
@@ -1637,8 +1632,8 @@
it simulates the lexing process with respect to different strings.
Our hope is that using $\gg$ we can prove the bits
-information are not lost when we simplify a regular expression,
-so we need to relate $\gg$ with simplifcation, for example,
+information are not lost when we simplify a regular expression.
+So we need to relate $\gg$ with simplifcation, for example,
one of the lemmas we have proved about $\gg$ is that
\item
\begin{center}
@@ -1676,7 +1671,7 @@
is still not clear. It is one of the next steps we need to
work on.
-\subsection{the $\textit{ders}_2$ Function}
+\subsection{The $\textit{ders}_2$ Function}
If we want to prove the result
\begin{center}
$ \textit{blexer}\_{simp}(r, \; s) = \textit{blexer}(r, \; s)$
@@ -1695,7 +1690,7 @@
It is based on the observation that the derivative of $r_1 \cdot r_2$
with respect to a string $s$ can actually be written in an "explicit form"
composed of $r_1$ and $r_2$'s derivatives.
-For example, we can look at how $r1\cdot r2$ expands
+For example, we can look at how $r_1\cdot r_2$ expands
when being derived with a two-character string:
\begin{center}
\begin{tabular}{lcl}
@@ -1727,25 +1722,24 @@
\end{tabular}
\end{center}
We have formalized and proved the correctness of this
-alternative definition of derivative and call it $\textit{ders2}$ to
+alternative definition of derivative and call it $\textit{ders}_2$ to
make a distinction of it with the $\textit{ders}$-function.
Note this differentiates from the lexing algorithm in the sense that
it calculates the results $r_1\backslash s_i , r_2 \backslash s_j$ first
and then glue them together
-into nested alternatives whereas the $r_1 \cdot r_2 \backslash s$ procedure,
-used by algorithm $\lexer$, can only produce each element of the list
+into nested alternatives.
+$\lexer$, on the other hand, can only produce each element of the list
in the resulting alternatives regular expression
-altogether rather than
-generating each of the children nodes
-in a single recursive call that is only for generating that
-very expression itself.
-$\lexer$ does lexing in a "breadth first" manner whereas
-$\textit{ders2}$ does it in a "depth first" manner.
+altogether in the last derivative step.
+$\lexer$ does lexing in a "breadth first" manner,
+generating all the children nodes simultaneously
+whereas
+$\textit{ders}_2$ does it in a "depth first" manner.
Using this intuition we can also define the annotated regular expression version of
-derivative and call it $\textit{bders2}$ and prove the equivalence with $\textit{bders}$.
+derivative and call it $\textit{bders}_2$ and prove the equivalence with $\textit{bders}$.
Our hope is to use this alternative definition as a guide
for our induction.
-Using $\textit{bders2}$ we have a clearer idea
+Using $\textit{bders}_2$ we have a clearer idea
of what $r\backslash s$ and $\simp(r\backslash s)$ looks like.
\section{Conclusion}
Under the exhaustive tests we believe the main
@@ -1760,7 +1754,7 @@
regular expression that is simplified at the final moment.
We are almost there, but a last step is needed to make the proof work.
Hopefully in the next few weeks we will be able to find one.
-
+This would be an important milestone for my dissertation.
\bibliographystyle{plain}
\bibliography{root}