etnms/etnms.tex
negation and back-references.
\end{abstract}

\section{Introduction}

While we believe derivatives of regular expressions, written
$r\backslash s$, are a beautiful concept (in terms of ease of
implementing them in functional programming languages and in terms of
reasoning about them formally), they have one major drawback: every
derivative step can make regular expressions grow drastically in
size. This in turn has a negative effect on the runtime of the
corresponding lexing algorithms. Consider for example the regular
expression $(a+aa)^*$ and the short string $aaaaaaaaaaaa$. The
corresponding derivative already contains 8668 nodes, assuming the
derivative is represented as a tree. The reason for the poor runtime
of the derivative-based lexing algorithms is that they need to
traverse such trees over and over again. The solution is to find a
complete set of simplification rules that keep the sizes of
derivatives uniformly small.

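\noindent
To see this growth concretely, here is a small worked example (added
for illustration, assuming the standard derivative rules and writing
$\mathbf{0}$ and $\mathbf{1}$ for the regular expressions matching
nothing and the empty string, respectively). The first two derivative
steps for $(a+aa)^*$ yield:

\begin{center}
\begin{tabular}{lcl}
$(a+aa)^*\backslash a$ & $=$ & $(\mathbf{1}+\mathbf{1}\cdot a)\cdot(a+aa)^*$\\
$(a+aa)^*\backslash aa$ & $=$ & $(\mathbf{0}+(\mathbf{0}\cdot a+\mathbf{1}))\cdot(a+aa)^* \,+\, (\mathbf{1}+\mathbf{1}\cdot a)\cdot(a+aa)^*$\\
\end{tabular}
\end{center}

\noindent
Each step roughly doubles the size of the derivative, because the
sequence and star rules duplicate subexpressions; without
simplification this doubling continues with every further character.
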
For reasons beyond this report, it turns out that a complete set of
simplification rules depends on values being encoded as
bitsequences.\footnote{Values are the results the lexing algorithms
  generate; they encode how a regular expression matched a string.} We
already know that the lexing algorithm using bitsequences but
\emph{without} simplification is correct, albeit horribly
slow. Therefore, in the past six months, I have been trying to prove
that the algorithm using bitsequences plus our simplification rules is
also correct. Formally, this amounts to showing that

\begin{equation}\label{mainthm}
\blexers \; r \; s = \blexer \;r\;s
\end{equation}

\noindent
whereby $\blexers$ simplifies (makes derivatives smaller) in each
step, whereas with $\blexer$ the size can grow exponentially. This
would be an important milestone for my thesis, because we already
have a very good idea of how to establish that our set of
simplification rules keeps the size of derivatives below a relatively
tight bound.

In order to prove the main theorem in \eqref{mainthm}, we need to
prove that the two functions produce the same output. The definition
of these two functions is shown below.

\begin{center}
\begin{tabular}{lcl}
  $\textit{blexer}\;r\,s$ & $\dn$ &
\end{tabular}
\end{center}
\noindent
In these definitions $(r^\uparrow)$ is a kind of coding function that
is the same in each case; similarly, the \textit{decode} and the
\textit{bmkeps} functions are the same in each case. Our main
theorem~\eqref{mainthm} therefore boils down to proving the following
two propositions (depending on which branch the if-else clause
takes). They establish that the derivatives \emph{with} simplification
do not change the computed result:

\begin{itemize}
\item{(a)} If a string $s$ is in the language $L(r)$, then \\
$\textit{bmkeps}\,((r^\uparrow)\backslash_{simp}\,s) = \textit{bmkeps}\,((r^\uparrow)\backslash s)$,\\
\item{(b)} If a string $s$ is \emph{not} in the language $L(r)$, then 
$\rup \backslash_{simp} \,s$ is not nullable.
\end{itemize}

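\noindent
Assuming the if-else structure of the two definitions above, these two
propositions give the main theorem by a case distinction (a sketch):
for $s \in L(r)$ both derivatives are nullable (simplification does
not change the language) and proposition (a) gives

\begin{center}
$\blexers \; r \; s \;=\; \textit{decode}\,(\textit{bmkeps}\,(\rup\backslash_{simp}\,s))\;r \;=\; \textit{decode}\,(\textit{bmkeps}\,(\rup\backslash s))\;r \;=\; \blexer \; r \; s$,
\end{center}

\noindent
whereas for $s \notin L(r)$ proposition (b) ensures that both
algorithms take the else-branch and return the same result.
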
\noindent
We have already proved the second part in Isabelle. This is actually
not too difficult because we can show that simplification does not
change the language of a regular expression.

If we can prove the first part, that is the bitsequence algorithm with
simplification produces the same result as the one without
%up it totally, that is why i believe the old 
%proof is fundamental in understanding
%what we are doing in the past 6 months.
%\bigskip\noindent\rule[1.5ex]{\linewidth}{5pt}

\subsubsection*{Existing Proof}

For this we started by looking at the original proof that
established that the bitsequence algorithm produces the same result
as the algorithm not using bitsequences. Formally this proof
established

%\begin{center}
%$\blexer \; r^\uparrow \;s = \blexers \; r^\uparrow \;s$
%\end{center}

\noindent
The proof uses two ``tricks''. One is that it defines a \flex-function

\begin{center}
\begin{tabular}{lcl}
$\textit{flex} \;r\; f\; (c\!::\!s) $ & $\dn$ & $\textit{flex} \;  (r\backslash c) \;(\lambda v. f (inj \; r \; c \; v)) \;s$ \\
$\textit{flex} \;r\; f\;  [\,] $ & $\dn$ & $f$
\end{tabular}
\end{center}

\noindent
and then proves for the right-hand side in \eqref{lexer} that

\begin{center}
$\lexer \;r\; s = \flex \;\textit{id} \; r\;s \;(\mkeps \; (r\backslash s))$.
\end{center}

\noindent
The $\flex$-function essentially does lexing by stacking up injection
functions while doing derivatives, explicitly showing the order in
which the characters are injected back in each step.

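\noindent
To see the stacking concretely, here is a small worked example (added
for illustration): unfolding the definition of $\flex$ for a
two-character string $[c_1,c_2]$ gives

\begin{center}
$\flex \; r \; id \; [c_1,c_2] \;=\; \lambda v.\, \inj \; r \; c_1\; (\inj \; (r\backslash c_1) \; c_2 \; v)$
\end{center}

\noindent
and therefore

\begin{center}
$\lexer \; r \; [c_1,c_2] \;=\; \inj \; r \; c_1\; (\inj \; (r\backslash c_1) \; c_2 \; (\mkeps\; ((r\backslash c_1)\backslash c_2)))$,
\end{center}

\noindent
where the innermost injection corresponds to the last character of
the string.
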
%\noindent
%$\flex$ focuses on the injections instead of the derivatives,
%compared to the original definition of $\lexer$, which puts equal
%amount of emphasis on injection and derivative with respect to each
%character:

%\begin{center}
%\begin{tabular}{lcl}
%$\textit{lexer} \; r\; (c\!::\!s) $ & $\dn$ & $\textit{case} \; \lexer \; (r\backslash c) \;s \; \textit{of}$ \\
% & & $\textit{None} \; \Longrightarrow \; \textit{None}$\\
%  & & $\textbar \; v \; \Longrightarrow \; \inj \; r\;c\;v$\\
%$\textit{lexer} \; r\;  [\,] $ & $\dn$ & $\textit{if} \; \nullable (r) \; \textit{then} \; \mkeps (r) \; \textit{else} \;None$
%\end{tabular}
%\end{center}
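
\noindent
Since the report stresses that these functions are easy to implement
in a functional language, the following minimal, self-contained Scala
sketch may help to make the discussion concrete. It is our own
illustration: all names (\texttt{Rexp}, \texttt{Val}, \texttt{der},
\texttt{inj}, \texttt{flex}, \texttt{lexer}) are ours and merely
follow the functions discussed above; it implements the unbitcoded
algorithm only.

\begin{verbatim}
abstract class Rexp
case object ZERO extends Rexp                      // matches nothing
case object ONE extends Rexp                       // matches the empty string
case class CHR(c: Char) extends Rexp
case class ALT(r1: Rexp, r2: Rexp) extends Rexp
case class SEQ(r1: Rexp, r2: Rexp) extends Rexp
case class STAR(r: Rexp) extends Rexp

abstract class Val                            // how a regex matched a string
case object Empty extends Val
case class Chr(c: Char) extends Val
case class Left(v: Val) extends Val
case class Right(v: Val) extends Val
case class Sequ(v1: Val, v2: Val) extends Val
case class Stars(vs: List[Val]) extends Val

def nullable(r: Rexp): Boolean = r match {
  case ZERO => false
  case ONE => true
  case CHR(_) => false
  case ALT(r1, r2) => nullable(r1) || nullable(r2)
  case SEQ(r1, r2) => nullable(r1) && nullable(r2)
  case STAR(_) => true
}

def der(c: Char, r: Rexp): Rexp = r match {        // one derivative step
  case ZERO => ZERO
  case ONE => ZERO
  case CHR(d) => if (c == d) ONE else ZERO
  case ALT(r1, r2) => ALT(der(c, r1), der(c, r2))
  case SEQ(r1, r2) =>
    if (nullable(r1)) ALT(SEQ(der(c, r1), r2), der(c, r2))
    else SEQ(der(c, r1), r2)
  case STAR(r1) => SEQ(der(c, r1), STAR(r1))
}

def mkeps(r: Rexp): Val = r match {                // value for the empty match
  case ONE => Empty
  case ALT(r1, r2) =>
    if (nullable(r1)) Left(mkeps(r1)) else Right(mkeps(r2))
  case SEQ(r1, r2) => Sequ(mkeps(r1), mkeps(r2))
  case STAR(_) => Stars(Nil)
}

def inj(r: Rexp, c: Char, v: Val): Val = (r, v) match { // undo one derivative
  case (CHR(_), Empty) => Chr(c)
  case (ALT(r1, _), Left(v1)) => Left(inj(r1, c, v1))
  case (ALT(_, r2), Right(v2)) => Right(inj(r2, c, v2))
  case (SEQ(r1, _), Sequ(v1, v2)) => Sequ(inj(r1, c, v1), v2)
  case (SEQ(r1, _), Left(Sequ(v1, v2))) => Sequ(inj(r1, c, v1), v2)
  case (SEQ(r1, r2), Right(v2)) => Sequ(mkeps(r1), inj(r2, c, v2))
  case (STAR(r1), Sequ(v1, Stars(vs))) => Stars(inj(r1, c, v1) :: vs)
}

// flex: stack up one injection function per derivative step
def flex(r: Rexp, f: Val => Val, s: List[Char]): Val => Val = s match {
  case Nil => f
  case c :: cs => flex(der(c, r), (v: Val) => f(inj(r, c, v)), cs)
}

// lexer expressed via flex, as in the equation above
def lexer(r: Rexp, s: List[Char]): Option[Val] = {
  val rs = s.foldLeft(r)((r1, c) => der(c, r1))    // r\s
  if (nullable(rs)) Some(flex(r, (v: Val) => v, s)(mkeps(rs))) else None
}
\end{verbatim}

\noindent
Pasted into the Scala REPL, for instance
\texttt{lexer(STAR(ALT(CHR('a'), SEQ(CHR('a'), CHR('a')))), "aa".toList)}
runs the $(a+aa)^*$ example from the introduction, and iterating
\texttt{der} makes the size blow-up directly observable.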

\noindent
The crux of the existing proof is how $\flex$ relates to injection, namely

\begin{center}
$\flex \; r \; id \; (s@[c]) \; v = \flex \;  r \; id \; s \; (inj \; (r\backslash s) \; c\; v)$.
\end{center}

\noindent
This property allows one to use the induction hypothesis to obtain

\begin{center}
$ \flex \; r\; id\; (s@[c])\; v = \textit{decode} \;( \textit{retrieve}\; (\rup \backslash s) \; (\inj \; (r\backslash s) \;c\;v)\;) r$
\end{center}
       

\noindent\rule[1.5ex]{\linewidth}{4pt}
There is no mention of retrieve yet\ldots{} this is the second trick in the
existing proof. I am not sure whether you need to explain annotated regular
expressions much earlier -- maybe before the ``existing proof'' section, or
even earlier.

\noindent\rule[1.5ex]{\linewidth}{4pt}

\noindent
By using a property of $\textit{retrieve}$, the $\textit{RHS}$ of the
above equality is
$\textit{decode}\,(\textit{retrieve}\,(r^\uparrow \backslash(s @ [c]))\;v)\;r$,
and this gives the main lemma result:

\begin{center}
$ \flex \;r\;  id \; (s@[c]) \; v =\textit{decode}\,(\textit{retrieve}\, (\rup \backslash (s@[c])) \;v)\; r$
\end{center}
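
\noindent
The property of $\textit{retrieve}$ used in this step is presumably
the following one (we state it here explicitly for completeness; it
assumes that $v$ is a well-formed value for $r\backslash (s@[c])$):

\begin{center}
$\textit{retrieve}\; (\rup \backslash s) \; (\inj \; (r\backslash s) \;c\;v) \;=\; \textit{retrieve}\; (\rup \backslash (s@[c])) \;v$.
\end{center}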
       

\noindent
To use this lemma result for our correctness proof, simply replace the
$v$ in the $\textit{RHS}$ of the above equality with
$\mkeps\;(r\backslash (s@[c]))$, and apply the lemma that
   314 $\mkeps\;(r\backslash (s@[c]))$, and apply the lemma that