author | Christian Urban <christian dot urban at kcl dot ac dot uk> |
Thu, 16 Oct 2014 11:59:39 +0100 | |
changeset 282 | 3e3b927a85cf |
parent 251 | 5b5a68df6d16 |
child 283 | c14e5ebf0c3b |
permissions | -rw-r--r-- |
251
5b5a68df6d16
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
1 |
\documentclass{article} |
5b5a68df6d16
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
2 |
\usepackage{../style} |
5b5a68df6d16
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
3 |
\usepackage{../langs} |
282
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
4 |
\usepackage{../graphics} |
251
5b5a68df6d16
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
5 |
|
5b5a68df6d16
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
6 |
|
5b5a68df6d16
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
7 |
\begin{document} |
5b5a68df6d16
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
8 |
|
282
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
9 |
\section*{Handout 4 (Sulzmann \& Lu Algorithm)} |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
10 |
|
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
11 |
So far our algorithm based on derivatives was only able to say |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
12 |
yes or no depending on whether a string was matched by regular |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
13 |
expression or not. Often a more interesting question is to |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
14 |
find out \emph{how} a regular expression matched a string? |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
15 |
Answering this question will help us with the problem we are |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
16 |
after, namely tokenising an input string, that is splitting it |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
17 |
up into its ``word'' components. The algorithm we will be |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
18 |
looking at was designed by Sulzmann \& Lu in a rather recent |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
19 |
paper. A link to it is provided at KEATS, in case you are |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
20 |
interested.\footnote{In my humble opinion this is an |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
21 |
interesting instance of the research literature: it contains a |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
22 |
very neat idea, but its presentation is rather sloppy. |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
23 |
Students and I found several rather annoying typos in their |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
24 |
examples and definitions.} |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
25 |
|
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
26 |
In order to give an answer for how a regular expression matched |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
27 |
a string, Sulzmann and Lu introduce \emph{values}. A value |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
28 |
will be the output of the algorithm whenever the regular |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
29 |
expression matches the string. If not an error will be raised. |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
30 |
Since the first phase of the algorithm is identical to the |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
31 |
derivative based matcher from the first coursework, the |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
32 |
function $nullable$ will be used to decide whether as string |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
33 |
is matched by a regular expression. |
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
34 |
|
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
35 |
|
3e3b927a85cf
added ho04
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
251
diff
changeset
|
36 |
|
251
5b5a68df6d16
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
37 |
|
5b5a68df6d16
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
38 |
Algorithm by Sulzmann, Lexing |
5b5a68df6d16
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
39 |
|
5b5a68df6d16
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
40 |
\end{document} |
5b5a68df6d16
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
41 |
|
5b5a68df6d16
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
42 |
%%% Local Variables: |
5b5a68df6d16
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
43 |
%%% mode: latex |
5b5a68df6d16
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
44 |
%%% TeX-master: t |
5b5a68df6d16
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
45 |
%%% End: |