lexing: comparison thys2/Paper/Paper.thy

equal deleted inserted replaced

-:0cce1aee0fb2
+:222333d2bdc2
 \noindent
 There is also an \emph{erase}-function, written $r^\downarrow$, which
 transforms a bitcoded regular expression into a (standard) regular
 expression by just erasing the annotated bitsequences. We omit the
 straightforward definition. For defining the algorithm, we also need
-the functions \textit{bnullable} and \textit{bmkeps}, which are the
+the functions \textit{bnullable} and \textit{bmkeps}(\textit{s}), which are the
 ``lifted'' versions of \textit{nullable} and \textit{mkeps} acting on
 bitcoded regular expressions.
 %
 \begin{center}
 \begin{tabular}{@ {}c@ {}c@ {}}
 $\textit{true}$
 \end{tabular}
 &
 \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1mm}}l@ {}}
 $\textit{bmkeps}\,(\textit{ONE}\,bs)$ & $\dn$ & $bs$\\
-$\textit{bmkeps}\,(\textit{ALTs}\,bs\,r\!::\!\rs)$ & $\dn$ &
+$\textit{bmkeps}\,(\textit{ALTs}\,bs\,\rs)$ & $\dn$ &
-$\textit{if}\;\textit{bnullable}\,r$\\
+$bs\,@\,\textit{bmkepss}\,\rs$\\
-& &$\textit{then}\;bs\,@\,\textit{bmkeps}\,r$\\
-& &$\textit{else}\;bs\,@\,\textit{bmkeps}\,\rs$\\
 $\textit{bmkeps}\,(\textit{SEQ}\,bs\,r_1\,r_2)$ & $\dn$ &\\
 \multicolumn{3}{r}{$bs \,@\,\textit{bmkeps}\,r_1\,@\, \textit{bmkeps}\,r_2$}\\
 $\textit{bmkeps}\,(\textit{STAR}\,bs\,r)$ & $\dn$ &
-$bs \,@\, [\S]$
+$bs \,@\, [\S]$\\
+$\textit{bmkepss}\,(r\!::\!\rs)$ & $\dn$ &
+$\textit{if}\;\textit{bnullable}\,r$\\
+& &$\textit{then}\;\textit{bmkeps}\,r$\\
+& &$\textit{else}\;\textit{bmkepss}\,\rs$
 \end{tabular}
 \end{tabular}
 \end{center}
 \end{center}
 \noindent where we scan the list from left to right (because we
 have to remove later copies). In @{text distinctBy}, @{text f} is a
 function and @{text acc} is an accumulator for regular
-expressions---essentially a set of regular expression that we have already seen
+expressions---essentially a set of regular expressions that we have already seen
 while scanning the list. Therefore we delete an element, say @{text x},
 from the list provided @{text "f x"} is already in the accumulator;
 otherwise we keep @{text x} and scan the rest of the list but
 add @{text "f x"} as another ``seen'' element to @{text acc}. We will use
 @{term distinctBy} where @{text f} is the erase functions, @{term "erase (DUMMY)"},
 We found it more helpful to introduce the rewriting systems shown in
 Figure~\ref{SimpRewrites}. The idea is to generate
 simplified regular expressions in small steps (unlike the @{text bsimp}-function which
 does the same in a big step), and show that each of
 the small steps preserves the bitcodes that lead to the final POSIX value.
-The rewrite system is organised such that $\leadsto$ is for bitcode regular
+The rewrite system is organised such that $\leadsto$ is for bitcoded regular
 expressions and $\stackrel{s}{\leadsto}$ for lists of bitcoded regular
 expressions. The former essentially implements the simplifications of
 @{text "bsimpSEQ"} and @{text flts}; while the latter implements the
 simplifications in @{text "bsimpALTs"}. We can show that any bitcoded
-regular expression reduces in one or more steps to the simplified
+regular expression reduces in zero or more steps to the simplified
 regular expression generated by @{text bsimp}:
 \begin{lemma}\label{lemone}
 @{thm[mode=IfThen] rewrites_to_bsimp}
 \end{lemma}
 section {* Finiteness of Derivatives *}
 text {*
-In this section let us sketch our argument on why the size of the simplified
+In this section let us sketch our argument for why the size of the simplified
 derivatives with the aggressive simplification function is finite. Suppose
-we have a size functions for bitcoded regular expressions, written
+we have a size function for bitcoded regular expressions, written
 @{text "|r|"}, which counts the number of nodes if we regard $r$ as a tree
 (we omit the precise definition). For this we show that for every $r$
 there exists a bound $N$
 such that
 equivalent of @{term blexer_simp} ``...we can incrementally compute
 bitcoded parse trees in linear time in the size of the input''
 \cite[Page 14]{Sulzmann2014}.
 Given the growth of the
 derivatives in some cases even after aggressive simplification, this
-is a hard to believe fact. A similar claim of about a theoretical runtime
+is a hard to believe fact. A similar claim about a theoretical runtime
 of @{text "O(n\<^sup>2)"} is made for the Verbatim lexer, which calculates POSIX matches and is based on
 derivatives \cite{verbatim}. In this case derivatives are not simplified.
 Clearly our result of having finite
 derivatives is rather weak in this context but we think such effeciency claims
 require further scrutiny.\medskip

changeset 436	222333d2bdc2
parent 426	5b77220fdf01
child 458	30c91ea7095b