# HG changeset patch # User Christian Urban # Date 1413457179 -3600 # Node ID 3e3b927a85cfde1a0a693ac36a9deeaddf150604 # Parent 314d5979b4ce6ed867f60d2d4480b8e77b82d875 added ho04 diff -r 314d5979b4ce -r 3e3b927a85cf handouts/ho04.pdf Binary file handouts/ho04.pdf has changed diff -r 314d5979b4ce -r 3e3b927a85cf handouts/ho04.tex --- a/handouts/ho04.tex Mon Oct 13 08:43:51 2014 +0100 +++ b/handouts/ho04.tex Thu Oct 16 11:59:39 2014 +0100 @@ -1,22 +1,39 @@ \documentclass{article} \usepackage{../style} \usepackage{../langs} - -\usepackage{xcolor} -\usepackage{tikz} -\usetikzlibrary{arrows} -\usetikzlibrary{automata} -\usetikzlibrary{shapes} -\usetikzlibrary{shadows} -\usetikzlibrary{positioning} -\usetikzlibrary{calc} -\usetikzlibrary{fit} -\usetikzlibrary{backgrounds} +\usepackage{../graphics} \begin{document} -\section*{Handout 4} +\section*{Handout 4 (Sulzmann \& Lu Algorithm)} + +So far our algorithm based on derivatives was only able to say +yes or no depending on whether a string was matched by regular +expression or not. Often a more interesting question is to +find out \emph{how} a regular expression matched a string? +Answering this question will help us with the problem we are +after, namely tokenising an input string, that is splitting it +up into its ``word'' components. The algorithm we will be +looking at was designed by Sulzmann \& Lu in a rather recent +paper. A link to it is provided at KEATS, in case you are +interested.\footnote{In my humble opinion this is an +interesting instance of the research literature: it contains a +very neat idea, but its presentation is rather sloppy. +Students and I found several rather annoying typos in their +examples and definitions.} + +In order to give an answer for how a regular expression matched +a string, Sulzmann and Lu introduce \emph{values}. A value +will be the output of the algorithm whenever the regular +expression matches the string. If not an error will be raised. +Since the first phase of the algorithm is identical to the +derivative based matcher from the first coursework, the +function $nullable$ will be used to decide whether as string +is matched by a regular expression. + + + Algorithm by Sulzmann, Lexing