handouts/ho04.tex
author Christian Urban <christian dot urban at kcl dot ac dot uk>
Thu, 16 Oct 2014 11:59:39 +0100
changeset 282 3e3b927a85cf
parent 251 5b5a68df6d16
child 283 c14e5ebf0c3b
permissions -rw-r--r--
added ho04
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
251
5b5a68df6d16 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     1
\documentclass{article}
5b5a68df6d16 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     2
\usepackage{../style}
5b5a68df6d16 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     3
\usepackage{../langs}
282
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
     4
\usepackage{../graphics}
251
5b5a68df6d16 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     5
5b5a68df6d16 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     6
5b5a68df6d16 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     7
\begin{document}
5b5a68df6d16 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     8
282
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
     9
\section*{Handout 4 (Sulzmann \& Lu Algorithm)}
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    10
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    11
So far our algorithm based on derivatives was only able to say
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    12
yes or no depending on whether a string was matched by regular
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    13
expression or not. Often a more interesting question is to
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    14
find out \emph{how} a regular expression matched a string?
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    15
Answering this question will help us with the problem we are
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    16
after, namely tokenising an input string, that is splitting it
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    17
up into its ``word'' components. The algorithm we will be
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    18
looking at was designed by Sulzmann \& Lu in a rather recent
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    19
paper. A link to it is provided at KEATS, in case you are
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    20
interested.\footnote{In my humble opinion this is an
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    21
interesting instance of the research literature: it contains a
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    22
very neat idea, but its presentation is rather sloppy.
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    23
Students and I found several rather annoying typos in their
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    24
examples and definitions.}
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    25
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    26
In order to give an answer for how a regular expression matched
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    27
a string, Sulzmann and Lu introduce \emph{values}. A value 
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    28
will be the output of the algorithm whenever the regular 
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    29
expression matches the string. If not an error will be raised.
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    30
Since the first phase of the algorithm is identical to the
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    31
derivative based matcher from the first coursework, the 
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    32
function $nullable$ will be used to decide whether as string
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    33
is matched by a regular expression.
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    34
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    35
3e3b927a85cf added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 251
diff changeset
    36
251
5b5a68df6d16 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    37
5b5a68df6d16 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    38
Algorithm by Sulzmann, Lexing
5b5a68df6d16 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    39
5b5a68df6d16 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    40
\end{document}
5b5a68df6d16 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    41
5b5a68df6d16 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    42
%%% Local Variables: 
5b5a68df6d16 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    43
%%% mode: latex
5b5a68df6d16 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    44
%%% TeX-master: t
5b5a68df6d16 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    45
%%% End: