handouts/ho04.tex
changeset 417 e74c696821a2
parent 400 e4afe3f46c29
child 422 5deefcc8cffa
equal deleted inserted replaced
416:357c395ae838 417:e74c696821a2
    23 literature: it contains a very neat idea, but its presentation
    23 literature: it contains a very neat idea, but its presentation
    24 is rather sloppy. In earlier versions of their paper, a King's
    24 is rather sloppy. In earlier versions of their paper, a King's
    25 student and I found several rather annoying typos in their
    25 student and I found several rather annoying typos in their
    26 examples and definitions.} In order to give an answer for
    26 examples and definitions.} In order to give an answer for
    27 \emph{how} a regular expression matches a string, Sulzmann and
    27 \emph{how} a regular expression matches a string, Sulzmann and
    28 Lu introduce \emph{values}. A value will be the output of the
    28 Lu use \emph{values}. A value will be the output of the
    29 algorithm whenever the regular expression matches the string.
    29 algorithm whenever the regular expression matches the string.
    30 If the string does not match the string, an error will be
    30 If the string does not match the string, an error will be
    31 raised. Since the first phase of the algorithm by Sulzmann \&
    31 raised. 
    32 Lu is identical to the derivative based matcher from the first
    32 The definition for values is given below. They are shown 
    33 coursework, the function \textit{nullable} will be used to
       
    34 decide whether as string is matched by a regular expression.
       
    35 If \textit{nullable} says yes, then values are constructed
       
    36 that reflect how the regular expression matched the string. 
       
    37 
       
    38 The definitions for values is given below. They are shown 
       
    39 together with the regular expressions $r$ to which
    33 together with the regular expressions $r$ to which
    40 they correspond:
    34 they correspond:
    41 
    35 
    42 \begin{center}
    36 \begin{center}
    43 \begin{tabular}{cc}
    37 \begin{tabular}{cc}
    58       &        & $Empty$   \\
    52       &        & $Empty$   \\
    59       & $\mid$ & $Char(c)$          \\
    53       & $\mid$ & $Char(c)$          \\
    60       & $\mid$ & $Seq(v_1,v_2)$\\
    54       & $\mid$ & $Seq(v_1,v_2)$\\
    61       & $\mid$ & $\Left(v)$   \\
    55       & $\mid$ & $\Left(v)$   \\
    62       & $\mid$ & $Right(v)$  \\
    56       & $\mid$ & $Right(v)$  \\
    63       & $\mid$ & $[v_1,\ldots\,v_n]$ \\
    57       & $\mid$ & $Stars [v_1,\ldots\,v_n]$ \\
    64 \end{tabular}
    58 \end{tabular}
    65 \end{tabular}
    59 \end{tabular}
    66 \end{center}
    60 \end{center}
    67 
    61 
    68 \noindent The reason is that there is a very strong
    62 \noindent There is no value for the
    69 correspondence between them. There is no value for the
       
    70 $\ZERO$ regular expression, since it does not match any
    63 $\ZERO$ regular expression, since it does not match any
    71 string. Otherwise there is exactly one value corresponding to
    64 string. Otherwise there is exactly one value corresponding to
    72 each regular expression with the exception of $r_1 + r_2$
    65 each regular expression with the exception of $r_1 + r_2$
    73 where there are two values, namely $\Left(v)$ and $Right(v)$
    66 where there are two values, namely $\Left(v)$ and $Right(v)$
    74 corresponding to the two alternatives. Note that $r^*$ is
    67 corresponding to the two alternatives. Note that $r^*$ is
    75 associated with a list of values, one for each copy of $r$
    68 associated with a list of values, one for each copy of $r$
    76 that was needed to match the string. This means we might also
    69 that was needed to match the string. This means we might also
    77 return the empty list $[]$, if no copy was needed in case
    70 return the empty list $Stars []$, if no copy was needed in case
    78 of $r^*$. For sequence, there is exactly one value, composed 
    71 of $r^*$. For sequence, there is exactly one value, composed 
    79 of two component values ($v_1$ and $v_2$).
    72 of two component values ($v_1$ and $v_2$).
    80 
    73 
    81 My definition of regular expressions and values in Scala is
    74 My definition of regular expressions and values in Scala is
    82 shown below. I have in my implementation the convention that
    75 shown below. I have in my implementation the convention that