diff -r 92b3e287d87e -r eeff9953a1c1 slides04.tex
--- a/slides04.tex Fri Oct 12 05:45:48 2012 +0100
+++ b/slides04.tex Sun Oct 14 23:41:49 2012 +0100
@@ -113,11 +113,23 @@
 \item tokenization identifies lexeme in an input stream of characters (or string)
 and categorizes them into tokens
 
-\item maximal munch rule
+\item longest match rule (maximal munch rule): The
+longest initial substring matched by any regular expression is taken
+as next token.
+
+\item Rule priority:
+For a particular longest initial substring, the first regular
+expression that can match determines the token.
+
+\item problem with infix operations, for example i-12
 \end{itemize}
 
 \url{http://www.technologyreview.com/tr10/?year=2011}
 
+finite deterministic automata/ nondeterministic automaton
+
+
+
 \end{frame}}
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
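
Note on the added items: the longest match rule and rule priority can be illustrated with a small tokenizer. The following is a minimal sketch in Python, not the method used in the slides. The rule names, the regular expressions in `RULES`, and the function `tokenize` are all hypothetical and chosen only for illustration. The sketch assumes that number literals may carry a leading minus sign, which is one plausible reading of the `i-12` remark.

```python
import re

# Hypothetical rule set for illustration only; list order encodes rule priority.
# Note: Python's `re` is a backtracking matcher, so a single pattern does not
# always return its longest match. For these simple patterns it does.
RULES = [
    ("KEYWORD", r"if|then|else"),
    ("IDENT", r"[a-z][a-z0-9]*"),
    ("NUMBER", r"-?[0-9]+"),   # assumption: numbers may be signed
    ("OP", r"[-+*/]"),
    ("SPACE", r"[ ]+"),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in RULES]


def tokenize(text):
    """Split text into (rule name, lexeme) pairs using longest match."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, regex in COMPILED:
            m = regex.match(text, pos)
            # Longest match: a rule wins only with a strictly longer match.
            # Rule priority: on a tie, the earlier rule is kept.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "SPACE":  # whitespace is matched but not emitted
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


if __name__ == "__main__":
    print(tokenize("iffy"))    # longest match prefers IDENT over KEYWORD
    print(tokenize("if"))      # tie in length, so the earlier rule KEYWORD wins
    print(tokenize("i-12"))    # the minus sign is absorbed into the number
    print(tokenize("i - 12"))  # with spaces, the minus is a separate operator
```

Under these assumed rules, `i-12` is split into an identifier and the number `-12` rather than identifier, operator, number, because `-12` is a longer match than `-` alone. A lexer that wants the infix reading would have to keep the sign out of the number rule and leave unary minus to the parser.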