added my comments
authorChristian Urban <urbanc@in.tum.de>
Wed, 15 Jan 2020 13:01:10 +0000
changeset 106 e0db3242d8b5
parent 105 317a7d54ebcc (diff)
parent 104 e7ba4da53930 (current diff)
child 107 b1e365afa29c
added my comments
etnms/etnms.tex
--- a/etnms/etnms.tex	Mon Jan 13 16:57:13 2020 +0000
+++ b/etnms/etnms.tex	Wed Jan 15 13:01:10 2020 +0000
@@ -91,49 +91,53 @@
   breaking its correctness. Being fast depends on a complete set of 
   simplification rules, some of which 
   have been put forward by Sulzmann and Lu. We have extended their
-  rules in order to obtain a tight bound on size of regular expressions.
-  We have tested the correctness of these extended rules, but have not 
-  formally established their correctness. We also have not yet looked 
+  rules in order to obtain a tight bound on the size of regular expressions.
+  We have tested these extended rules, but have not 
+  formally established their correctness. We have also not yet looked 
   at extended regular expressions, such as bounded repetitions,
   negation and back-references.
 \end{abstract}
 
 \section{Introduction}
 
-While we believe derivatives of regular expressions is a beautiful 
-concept (interms of ease to implementing them in functional programming
-language and ease to reason about them formally), they have one major 
-drawback: every derivative step can make regular expressions grow 
-drastically in size. This in turn has negative effects on the runtime of 
-the corresponding lexing algorithms. Consider for example the regular 
-expression $(a+aa)^*$ and the short string $aaaaaaaaaaaa$. The size of
-the corresponding derivative is already 8668 node assuming the derivatives
-is seen as a tree. The reason for the poor runtime of the lexing algorithms is
-that they need to traverse such trees over and over again. The solution is to
-find a complete set of simplification rules that keep the sizes of derivatives
-uniformly small.
+While we believe derivatives of regular expressions, written
+$r\backslash s$, are a beautiful concept (in terms of ease of
+implementing them in functional programming languages and in terms of
+reasoning about them formally), they have one major drawback: every
+derivative step can make regular expressions grow drastically in
+size. This in turn has a negative effect on the runtime of the
+corresponding lexing algorithms. Consider for example the regular
+expression $(a+aa)^*$ and the short string $aaaaaaaaaaaa$. The
+corresponding derivative already contains 8668 nodes, where we assume
+the derivative is given as a tree. The reason for the poor runtime of
+the derivative-based lexing algorithms is that they need to traverse
+such trees over and over again. The solution is to find a complete set
+of simplification rules that keep the sizes of derivatives uniformly
+small.
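+
+To give an impression of this growth, below are the first two
+derivative steps for this example, where we write $\mathbf{1}$ for
+the regular expression matching the empty string, $\mathbf{0}$ for
+the one matching nothing, and perform no simplification at all:
+
+\begin{center}
+\begin{tabular}{lcl}
+$(a+aa)^* \backslash a$ & $=$ & $(\mathbf{1} + \mathbf{1}\cdot a)\cdot(a+aa)^*$\\
+$(a+aa)^* \backslash aa$ & $=$ & $(\mathbf{0} + (\mathbf{0}\cdot a + \mathbf{1}))\cdot(a+aa)^* + (\mathbf{1} + \mathbf{1}\cdot a)\cdot(a+aa)^*$\\
+\end{tabular}
+\end{center}
+
+\noindent
+Every further derivative step duplicates the tail $(a+aa)^*$ in a
+similar fashion, which explains the rapid growth in size.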
 
-For reasons beyond this report, it turns out that a complete set of 
-simplification rules depend on values being encoded as bitsequences.
-(Vlue are the results of the lexing algorithms generate; they encode how
-a regular expression matched a string.) We already know that the lexing 
-algorithm \emph{without} simplification is correct. Therefore in the
-past 6 months we were trying to prove  that the algorithm using bitsequences plus
-our simplification rules is correct. Formally this amounts to show that
+For reasons beyond this report, it turns out that a complete set of
+simplification rules depends on values being encoded as
+bitsequences.\footnote{Values are the results the lexing algorithms
+  generate; they encode how a regular expression matched a string.} We
+already know that the lexing algorithm using bitsequences but
+\emph{without} simplification is correct, albeit horribly
+slow. Therefore in the past 6 months I was trying to prove that the
+algorithm using bitsequences plus our simplification rules is
+also correct. Formally this amounts to showing that
 
 \begin{equation}\label{mainthm}
 \blexers \; r \; s = \blexer \;r\;s
 \end{equation}
 
 \noindent
-whereby $\blexers$ simplifies (makes derivatives smaller) in each step,
-whereas with $\blexer$ the size can grow exponentially. This would be an
-important milestone, because we already have a very good idea how to 
-establish that our set our simplification rules keeps the size below a
-relatively tight bound.
+whereby $\blexers$ simplifies (makes derivatives smaller) in each
+step, whereas with $\blexer$ the size can grow exponentially. This
+would be an important milestone for my thesis, because we already
+have a very good idea how to establish that our set of simplification
+rules keeps the size of derivatives below a relatively tight bound.
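+
+To give a flavour of this encoding, one can depict annotated regular
+expressions by attaching bitsequences as subscripts to
+subexpressions, for example $_0a + _1aa$ for an annotated version of
+$a + aa$, where, say, a bit $0$ marks a match via the left
+alternative and a bit $1$ a match via the right one. This is only a
+sketch of the idea, not the precise definition: the point is that the
+bitsequence accumulated during lexing can be decoded into a value at
+the very end.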
 
-In order to prove the main theorem \eqref{mainthm}, we need to prove the
-two functions produce the same output. The definition of these functions 
+In order to prove the main theorem in \eqref{mainthm}, we need to prove that the
+two functions produce the same output. The definition of these two functions
 is shown below.
 
 \begin{center}
@@ -156,43 +160,74 @@
 \end{tabular}
 \end{center}
 \noindent
-In these definitions $(r^\uparrow)$ is a kind of coding function that is the
-same in each case, similarly the decode and the \textit{bmkeps}
-functions. Our main theorem \eqref{mainthm} therefore boils down to
-proving the following two  propositions (depending on which branch the 
-if-else clause takes). It establishes how the derivatives  \emph{with}
-simplification do not change the computed result:
+In these definitions $(r^\uparrow)$ is a kind of coding function that
+is the same in each case; similarly the \textit{decode} and the
+\textit{bmkeps} functions are the same in each case. Our main
+theorem~\eqref{mainthm} therefore boils down to proving the following
+two propositions (depending on which branch the if-else clause
+takes). They establish that the derivatives \emph{with} simplification
+do not change the computed result:
 
 \begin{itemize}
-\item{} If a string $s$ is in the language of $L(r)$, then \\
+\item[(a)] If a string $s$ is in the language $L(r)$, then \\
 $\textit{bmkeps}\, ((r^\uparrow)\backslash_{simp}\,s) = \textit{bmkeps}\, ((r^\uparrow)\backslash s)$,\\
-\item{} If a string $s$ is in the language $L(r)$, then 
+\item[(b)] If a string $s$ is \emph{not} in the language $L(r)$, then 
 $\rup \backslash_{simp} \,s$ is not nullable.
 \end{itemize}
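+
+\noindent
+Roughly speaking, these two propositions mirror the two branches of
+the if-else clause: in case the derivative is nullable, both
+algorithms return a \textit{bmkeps}-result, which by (a) is the same;
+in case it is not nullable, both algorithms return \textit{None},
+because by (b) the simplified derivative is then not nullable either.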
 
 \noindent
-We have already proved in Isabelle the second part. This is actually not 
-too difficult because we can show that simplification does not change
-the language of regular expressions. If we can prove the first case,
-that is the bitsequence algorithm with simplification produces the same 
-result as the one without simplification, then we are done.
-Unfortunately that part requires more effort, because simplification does not 
-only.need to \emph{not} change the language, but also not change
-the value (computed result).
+We have already proved the second part in Isabelle. This is actually
+not too difficult because we can show that simplification does not
+change the language of regular expressions.
+
+If we can prove the first part, that is, that the bitsequence algorithm with
+simplification produces the same result as the one without
+simplification, then we are done. Unfortunately that part requires
+more effort, because simplification must not only preserve the
+language, but also preserve the value (that is the
+computed result).
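+
+To see why this is delicate, consider as a hypothetical example a
+simplification rule that rewrites $r + r$ to $r$: the language
+clearly stays the same, but the values $\textit{Left}\;v$ and
+$\textit{Right}\;v$, which record whether the left or the right
+alternative was used in a match, both collapse to just $v$. The
+bitsequence annotations therefore need to be adjusted during
+simplification so that the decoded result stays unchanged.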
 
+%\bigskip\noindent\rule[1.5ex]{\linewidth}{5pt}
+%Do you want to keep this? You essentially want to say that the old
+%method used retrieve, which unfortunately cannot be adapted to 
+%the simplification rules. You could just say that and give an example.
+%However you have to think about how you give the example....nobody knows
+%about AZERO etc yet. Maybe it might be better to use normal regexes
+%like $a + aa$, but annotate bitsequences as subscript like $_1(_0a + _1aa)$.
 
+%\bigskip\noindent\rule[1.5ex]{\linewidth}{5pt}
+%REPLY:\\
+%Yes, I am essentially saying that the old method
+%cannot be adopted without adjustments.
+%But this does not mean we should skip
+%the proof of the bit-coded algorithm
+%as it is still the main direction we are looking into
+%to prove things. We are trying to modify
+%the old proof to suit our needs, but not give it
+%up totally, that is why I believe the old 
+%proof is fundamental in understanding
+%what we are doing in the past 6 months.
+%\bigskip\noindent\rule[1.5ex]{\linewidth}{5pt}
+
+\subsubsection*{Existing Proof}
 
 For this we have started by looking at the original proof that
-established that the bitsequence algorithm produces the same result as
-the algorithm not using bitsequences. Formally this proof estabilshed
+established that the bitsequence algorithm produces the same result
+as the algorithm not using bitsequences. Formally this proof
+established
 
-\begin{center}
-$\blexer \; r^\uparrow  s = \lexer \;r \;s$
-\end{center}
+\begin{equation}\label{lexer}
+\blexer \; (r^\uparrow) \; s = \lexer \;r \;s
+\end{equation}
+
+%\noindent
+%might provide us insight into proving 
+%\begin{center}
+%$\blexer \; r^\uparrow \;s = \blexers \; r^\uparrow \;s$
+%\end{center}
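+
+\noindent
+Note that once \eqref{mainthm} is established, it can simply be
+combined with \eqref{lexer} in order to obtain
+$\blexers \; (r^\uparrow)\; s = \lexer \;r\;s$, that is the
+correctness of the bitsequence algorithm \emph{with} simplification.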
 
 \noindent
-The proof used two ''tricks", One is that it defined
- a $\flex$-function
+The proof uses two ``tricks''. One is that it uses a $\flex$-function
 
 \begin{center}
 \begin{tabular}{lcl}
@@ -202,60 +237,76 @@
 \end{center}
 
 \noindent
-and then proved for the right-hand side in \eqref{lexer}
+and then proves for the right-hand side in \eqref{lexer}
 
 \begin{center}
-$\lexer \;r\; s = \flex \;\textit{id} \; r\;s \;(\mkeps \; r\backslash s)$
+$\lexer \;r\; s = \flex \;r\; \textit{id} \;s \;(\mkeps \; (r\backslash s))$
 \end{center}
 
 
-\noindent\rule[1.5ex]{\linewidth}{1pt}
+
 
 \noindent
 The $\flex$-function essentially does lexing by
-stacking up injection functions while doing derivatives,
+stacking up injection functions while doing derivatives, thereby
 explicitly showing the order of characters being
 injected back in each step.
 With $\flex$ we can write $\lexer$ this way: 
+
 \begin{center}
 $\lexer \;r\; s = \flex \;r\; \textit{id} \;s \;(\mkeps \; (r\backslash s))$
 \end{center}
+
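+
+\noindent
+For example, for a two-character string $[c_1, c_2]$, and assuming
+that $\flex$ applied to the empty string just applies the accumulated
+function to the value, unfolding this definition gives
+
+\begin{center}
+$\flex \;r\; \textit{id}\; [c_1, c_2] \; v = \inj \; r \; c_1\; (\inj \; (r\backslash c_1) \; c_2 \; v)$
+\end{center}
+
+\noindent
+that is, the injection functions are stacked up and applied in
+reverse order of the characters being consumed.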
+%\noindent
+%$\flex$ focuses on the injections instead of the derivatives ,
+%compared to the original definition of $\lexer$, which puts equal
+%amount of emphasis on injection and derivative with respect to each
+%character:
+
+%\begin{center}
+%\begin{tabular}{lcl}
+%$\textit{lexer} \; r\; (c\!::\!s) $ & $\dn$ & $\textit{case} \; \lexer \; (r\backslash c) \;s \; %\textit{of}$ \\
+% & & $\textit{None} \; \Longrightarrow \; \textit{None}$\\
+%  & & $\textbar \; v \; \Longrightarrow \; \inj \; r\;c\;v$\\
+%$\textit{lexer} \; r\;  [\,] $ & $\dn$ & $\textit{if} \; \nullable (r) \; \textit{then} \; \mkeps% (r) \; \textit{else} \;None$
+%\end{tabular}
+%\end{center}
+
 \noindent
-$\flex$ focuses on
- the injections instead 
-of the derivatives ,
-compared 
-to the original definition of $\lexer$,
-which puts equal amount of emphasis on 
-injection and derivative with respect to each character:
-\begin{center}
-\begin{tabular}{lcl}
-$\textit{lexer} \; r\; (c\!::\!s) $ & $\dn$ & $\textit{case} \; \lexer \; (r\backslash c) \;s \; \textit{of}$ \\
- & & $\textit{None} \; \Longrightarrow \; \textit{None}$\\
-  & & $\textbar \; v \; \Longrightarrow \; \inj \; r\;c\;v$\\
-$\textit{lexer} \; r\;  [\,] $ & $\dn$ & $\textit{if} \; \nullable (r) \; \textit{then} \; \mkeps (r) \; \textit{else} \;None$
-\end{tabular}
-\end{center}
-\noindent
-Using this feature of $\flex$  we can rewrite the lexing
-$w.r.t \; s @ [c]$ in term of lexing 
-$w.r.t \; s$:
+The crux in the existing proof is how $\flex$ relates to injection, namely
+
 \begin{center}
 $\flex \; r \; \textit{id} \; (s@[c]) \; v = \flex \; r \; \textit{id} \; s \; (\inj \; (r\backslash s) \; c\; v)$.
 \end{center}
+
 \noindent
-this allows us to use 
-the inductive hypothesis to get
+This property allows one to rewrite an induction hypothesis like 
+
 \begin{center} 
 $\flex \; r\; \textit{id}\; (s@[c])\; v = \textit{decode} \;(\textit{retrieve}\; (\rup \backslash s) \; (\inj \; (r\backslash s) \;c\;v))\; r$
 \end{center}
+
+\noindent\rule[1.5ex]{\linewidth}{4pt}
+There is no mention of retrieve yet .... this is the second trick in the
+existing proof. I am not sure whether you need to explain annotated regular
+expressions much earlier - maybe before the ``existing proof'' section, or
+even earlier.
+
+\noindent\rule[1.5ex]{\linewidth}{4pt}
+
 \noindent
 By using a property of $\textit{retrieve}$, the $\textit{RHS}$ of the above equality becomes
 $\textit{decode}\,(\textit{retrieve}\, (\rup \backslash(s @ [c])) \;v)\; r$, and this gives the 
 main lemma result:
+
 \begin{center}
 $\flex \;r\; \textit{id} \; (s@[c]) \; v = \textit{decode}\,(\textit{retrieve}\, (\rup \backslash (s@[c])) \;v)\; r$
 \end{center}
+
 \noindent
 To use this lemma result for our 
 correctness proof, simply replace the $v$ in the