diff -r 6e7c3db9023d -r 1be892087df2 handouts/ho02.tex --- a/handouts/ho02.tex Sat Oct 12 09:13:42 2013 +0100 +++ b/handouts/ho02.tex Sat Oct 12 10:12:38 2013 +0100 @@ -57,11 +57,12 @@ \] \noindent -Clearly we cannot use the function $L$ directly in order to solve this problem, because in general -the set of strings $L$ returns is infinite (recall what $L(a^*)$ is). In such cases there is no algorithm -then can test exhaustively, whether a string is member of this set. +we can look at an algorithm to solve this problem. +Clearly we cannot use the function $L$ directly for this, because in general +the set of strings $L$ returns is infinite (recall what $L(a^*)$ is). In such cases there is no way we can implement +an exhaustive test for whether a string is member of this set or not. -The algorithm we define below consists of two parts. One is the function $nullable$ which takes a +The algorithm we will define below consists of two parts. One is the function $nullable$ which takes a regular expression as argument and decides whether it can match the empty string (this means it returns a boolean). This can be easily defined recursively as follows: @@ -84,11 +85,11 @@ \] \noindent -On the left-hand side we have a function we can implement; on the right we have its specification. +Note on the left-hand side we have a function we can implement; on the right we have its specification. -The other function is calculating a \emph{derivative} of a regular expression. This is a function +The other function of our matching algorithm calculates a \emph{derivative} of a regular expression. This is a function which will take a regular expression, say $r$, and a character, say $c$, as argument and return -a new regular expression. Beware that the intuition behind this function is not so easy to grasp on first +a new regular expression. Be careful that the intuition behind this function is not so easy to grasp on first reading. Essentially this function solves the following problem: if $r$ can match a string of the form $c\!::\!s$, what does the regular expression look like that can match just $s$. The definition of this function is as follows: @@ -133,7 +134,7 @@ \noindent which essentially transforms a set of strings $A$ by filtering out all strings that do not start with $c$ and then -strip off the $c$ from all the remaining strings. For example suppose $A = \{"f\!oo", "bar", "f\!rak"\}$ then +strips off the $c$ from all the remaining strings. For example suppose $A = \{"f\!oo", "bar", "f\!rak"\}$ then \[ Der\,f\,A = \{"oo", "rak"\}\quad,\quad Der\,b\,A = \{"ar"\} \quad \text{and} \quad @@ -150,10 +151,21 @@ \noindent This property clarifies what regular expression $der$ calculates, namely take the set of strings -that $r$ can match ($L(r)$), filter out all strings not starting with $c$ and strip off the $c$ from the +that $r$ can match (that is $L(r)$), filter out all strings not starting with $c$ and strip off the $c$ from the remaining strings---this is exactly the language that $der\,c\,r$ can match. -For our matching algorithm we need to lift the notion of derivatives from characters to strings. This can be +If we want to find out whether the string $"abc"$ is matched by the regular expression $r$ +then we can iteratively apply $Der$ as follows + +\begin{enumerate} +\item $Der\,a\,(L(r))$ +\item $Der\,b\,(Der\,a\,(L(r)))$ +\item $Der\,c\,(Der\,b\,(Der\,a\,(L(r))))$ +\end{enumerate} + +\noindent +In the last step we need to test whether the empty string is in the set. Our matching algorithm will work similarly, +just using regular expression instead of sets. For this we need to lift the notion of derivatives from characters to strings. This can be done using the following function, taking a string and regular expression as input and a regular expression as output.