handouts/ho04.tex
changeset 282 3e3b927a85cf
parent 251 5b5a68df6d16
child 283 c14e5ebf0c3b
equal deleted inserted replaced
281:314d5979b4ce 282:3e3b927a85cf
     1 \documentclass{article}
     1 \documentclass{article}
     2 \usepackage{../style}
     2 \usepackage{../style}
     3 \usepackage{../langs}
     3 \usepackage{../langs}
     4 
     4 \usepackage{../graphics}
     5 \usepackage{xcolor}
       
     6 \usepackage{tikz}
       
     7 \usetikzlibrary{arrows}
       
     8 \usetikzlibrary{automata}
       
     9 \usetikzlibrary{shapes}
       
    10 \usetikzlibrary{shadows}
       
    11 \usetikzlibrary{positioning}
       
    12 \usetikzlibrary{calc}
       
    13 \usetikzlibrary{fit}
       
    14 \usetikzlibrary{backgrounds}
       
    15 
     5 
    16 
     6 
    17 \begin{document}
     7 \begin{document}
    18 
     8 
    19 \section*{Handout 4}
     9 \section*{Handout 4 (Sulzmann \& Lu Algorithm)}
       
    10 
       
    11 So far our algorithm based on derivatives was only able to say
       
    12 yes or no depending on whether a string was matched by regular
       
    13 expression or not. Often a more interesting question is to
       
    14 find out \emph{how} a regular expression matched a string?
       
    15 Answering this question will help us with the problem we are
       
    16 after, namely tokenising an input string, that is splitting it
       
    17 up into its ``word'' components. The algorithm we will be
       
    18 looking at was designed by Sulzmann \& Lu in a rather recent
       
    19 paper. A link to it is provided at KEATS, in case you are
       
    20 interested.\footnote{In my humble opinion this is an
       
    21 interesting instance of the research literature: it contains a
       
    22 very neat idea, but its presentation is rather sloppy.
       
    23 Students and I found several rather annoying typos in their
       
    24 examples and definitions.}
       
    25 
       
    26 In order to give an answer for how a regular expression matched
       
    27 a string, Sulzmann and Lu introduce \emph{values}. A value 
       
    28 will be the output of the algorithm whenever the regular 
       
    29 expression matches the string. If not an error will be raised.
       
    30 Since the first phase of the algorithm is identical to the
       
    31 derivative based matcher from the first coursework, the 
       
    32 function $nullable$ will be used to decide whether as string
       
    33 is matched by a regular expression.
       
    34 
       
    35 
       
    36 
    20 
    37 
    21 Algorithm by Sulzmann, Lexing
    38 Algorithm by Sulzmann, Lexing
    22 
    39 
    23 \end{document}
    40 \end{document}
    24 
    41