23 literature: it contains a very neat idea, but its presentation |
23 literature: it contains a very neat idea, but its presentation |
24 is rather sloppy. In earlier versions of their paper, a King's |
24 is rather sloppy. In earlier versions of their paper, a King's |
25 student and I found several rather annoying typos in their |
25 student and I found several rather annoying typos in their |
26 examples and definitions.} In order to give an answer for |
26 examples and definitions.} In order to give an answer for |
27 \emph{how} a regular expression matches a string, Sulzmann and |
27 \emph{how} a regular expression matches a string, Sulzmann and |
28 Lu introduce \emph{values}. A value will be the output of the |
28 Lu use \emph{values}. A value will be the output of the |
29 algorithm whenever the regular expression matches the string. |
29 algorithm whenever the regular expression matches the string. |
30 If the string does not match the string, an error will be |
30 If the string does not match the string, an error will be |
31 raised. Since the first phase of the algorithm by Sulzmann \& |
31 raised. |
32 Lu is identical to the derivative based matcher from the first |
32 The definition for values is given below. They are shown |
33 coursework, the function \textit{nullable} will be used to |
|
34 decide whether as string is matched by a regular expression. |
|
35 If \textit{nullable} says yes, then values are constructed |
|
36 that reflect how the regular expression matched the string. |
|
37 |
|
38 The definitions for values is given below. They are shown |
|
39 together with the regular expressions $r$ to which |
33 together with the regular expressions $r$ to which |
40 they correspond: |
34 they correspond: |
41 |
35 |
42 \begin{center} |
36 \begin{center} |
43 \begin{tabular}{cc} |
37 \begin{tabular}{cc} |
58 & & $Empty$ \\ |
52 & & $Empty$ \\ |
59 & $\mid$ & $Char(c)$ \\ |
53 & $\mid$ & $Char(c)$ \\ |
60 & $\mid$ & $Seq(v_1,v_2)$\\ |
54 & $\mid$ & $Seq(v_1,v_2)$\\ |
61 & $\mid$ & $\Left(v)$ \\ |
55 & $\mid$ & $\Left(v)$ \\ |
62 & $\mid$ & $Right(v)$ \\ |
56 & $\mid$ & $Right(v)$ \\ |
63 & $\mid$ & $[v_1,\ldots\,v_n]$ \\ |
57 & $\mid$ & $Stars [v_1,\ldots\,v_n]$ \\ |
64 \end{tabular} |
58 \end{tabular} |
65 \end{tabular} |
59 \end{tabular} |
66 \end{center} |
60 \end{center} |
67 |
61 |
68 \noindent The reason is that there is a very strong |
62 \noindent There is no value for the |
69 correspondence between them. There is no value for the |
|
70 $\ZERO$ regular expression, since it does not match any |
63 $\ZERO$ regular expression, since it does not match any |
71 string. Otherwise there is exactly one value corresponding to |
64 string. Otherwise there is exactly one value corresponding to |
72 each regular expression with the exception of $r_1 + r_2$ |
65 each regular expression with the exception of $r_1 + r_2$ |
73 where there are two values, namely $\Left(v)$ and $Right(v)$ |
66 where there are two values, namely $\Left(v)$ and $Right(v)$ |
74 corresponding to the two alternatives. Note that $r^*$ is |
67 corresponding to the two alternatives. Note that $r^*$ is |
75 associated with a list of values, one for each copy of $r$ |
68 associated with a list of values, one for each copy of $r$ |
76 that was needed to match the string. This means we might also |
69 that was needed to match the string. This means we might also |
77 return the empty list $[]$, if no copy was needed in case |
70 return the empty list $Stars []$, if no copy was needed in case |
78 of $r^*$. For sequence, there is exactly one value, composed |
71 of $r^*$. For sequence, there is exactly one value, composed |
79 of two component values ($v_1$ and $v_2$). |
72 of two component values ($v_1$ and $v_2$). |
80 |
73 |
81 My definition of regular expressions and values in Scala is |
74 My definition of regular expressions and values in Scala is |
82 shown below. I have in my implementation the convention that |
75 shown below. I have in my implementation the convention that |