lexing: comparison thys/Paper/Paper.thy

equal deleted inserted replaced

-:f1d800062d4f
+:e85d10b238d0
 r)"}} is nullable iff @{term r} is, and so on. The purpose of the record
 regular expression is to mark certain parts of a regular expression and
 then record in the calculated value which parts of the string were matched
 by this part. The label can then serve as classification for the tokens.
 For this recall the regular expression @{text "(r\<^bsub>key\<^esub> + r\<^bsub>id\<^esub>)\<^sup>\<star>"} for
-keywords and identifiers from the Introduction. With record regular
+keywords and identifiers from the Introduction. With the record regular
 expression we can form \mbox{@{text "((key : r\<^bsub>key\<^esub>) + (id : r\<^bsub>id\<^esub>))\<^sup>\<star>"}}
 and then traverse the calculated value and only collect the underlying
 strings in record values. With this we obtain finite sequences of pairs of
 labels and strings, for example
 steps.
 While the simplification of regular expressions according to
 rules like
-\begin{center}
+\begin{equation}\label{Simpl}
-\begin{tabular}{lcl}
+\begin{array}{lcl}
 @{term "ALT ZERO r"} & @{text "\<Rightarrow>"} & @{term r}\\
 @{term "ALT r ZERO"} & @{text "\<Rightarrow>"} & @{term r}\\
 @{term "SEQ ONE r"}  & @{text "\<Rightarrow>"} & @{term r}\\
 @{term "SEQ r ONE"}  & @{text "\<Rightarrow>"} & @{term r}
-\end{tabular}
+\end{array}
-\end{center}
+\end{equation}
 \noindent is well understood, there is an obstacle with the POSIX value
 calculation algorithm by Sulzmann and Lu: if we build a derivative regular
 expression and then simplify it, we will calculate a POSIX value for this
 simplified derivative regular expression, \emph{not} for the original (unsimplified)
 \end{tabular}
 \end{center}
 \noindent
 The functions @{text "simp\<^bsub>Alt\<^esub>"} and @{text "simp\<^bsub>Seq\<^esub>"} encode the simplification rules
-and compose the rectification functions. The main simplification function is then
+in \eqref{Simpl} and compose the rectification functions (simplifications can occur
+deep inside the regular expression). The main simplification function is then
 \begin{center}
 \begin{tabular}{lcl}
 @{term "simp (ALT r\<^sub>1 r\<^sub>2)"} & $\dn$ & @{term "simp_ALT (simp r\<^sub>1) (simp r\<^sub>2)"}\\
 @{term "simp (SEQ r\<^sub>1 r\<^sub>2)"} & $\dn$ & @{term "simp_SEQ (simp r\<^sub>1) (simp r\<^sub>2)"}\\
 \noindent
 In the second clause we first calculate the derivative @{term "der c r"}
 and then simplify the result. This gives us a simplified derivative
 @{text "r\<^sub>s"} and a rectification function @{text "f\<^sub>r"}. The lexer
 is then recursively called with the simplified derivative, but before
-we inject the character @{term c} into the value, we need to rectify
+we inject the character @{term c} into the value @{term v}, we need to rectify
-it (that is construct @{term "f\<^sub>r v"}). We can prove that
+@{term v} (that is construct @{term "f\<^sub>r v"}). We can prove that
 \begin{theorem}
 @{thm slexer_correctness}
 \end{theorem}
 \url{http://www.inf.kcl.ac.uk/staff/urbanc/lex}.\medskip
 \noindent
 {\bf Acknowledgements:}
 We are very grateful to Martin Sulzmann for his comments on our work and
-also patiently explaining to us the details in \cite{Sulzmann2014}.
+moreover patiently explaining to us the details in \cite{Sulzmann2014}. We
+also received very helpful comments from James Cheney and anonymous referees.
 \small
 \bibliographystyle{plain}
 \bibliography{root}
 *}

changeset 177	e85d10b238d0
parent 176	f1d800062d4f
child 178	2835d13be702