handouts/ho04.tex
changeset 400 e4afe3f46c29
parent 394 2f9fe225ecc8
child 417 e74c696821a2
equal deleted inserted replaced
399:5c1fbb39c93e 400:e4afe3f46c29
     3 \usepackage{../langs}
     3 \usepackage{../langs}
     4 \usepackage{../graphics}
     4 \usepackage{../graphics}
     5 
     5 
     6 
     6 
     7 \begin{document}
     7 \begin{document}
       
     8 \fnote{\copyright{} Christian Urban, King's College London, 2014, 2015, 2016}
     8 
     9 
     9 \section*{Handout 4 (Sulzmann \& Lu Algorithm)}
    10 \section*{Handout 4 (Sulzmann \& Lu Algorithm)}
    10 
    11 
    11 So far our algorithm based on derivatives was only able to say
    12 So far our algorithm based on derivatives was only able to say
    12 yes or no depending on whether a string was matched by regular
    13 yes or no depending on whether a string was matched by regular
    40 
    41 
    41 \begin{center}
    42 \begin{center}
    42 \begin{tabular}{cc}
    43 \begin{tabular}{cc}
    43 \begin{tabular}{@{}rrl@{}}
    44 \begin{tabular}{@{}rrl@{}}
    44 \multicolumn{3}{c}{regular expressions}\medskip\\
    45 \multicolumn{3}{c}{regular expressions}\medskip\\
    45   $r$ & $::=$  & $\varnothing$\\
    46   $r$ & $::=$  & $\ZERO$\\
    46       & $\mid$ & $\epsilon$   \\
    47       & $\mid$ & $\ONE$   \\
    47       & $\mid$ & $c$          \\
    48       & $\mid$ & $c$          \\
    48       & $\mid$ & $r_1 \cdot r_2$\\
    49       & $\mid$ & $r_1 \cdot r_2$\\
    49       & $\mid$ & $r_1 + r_2$   \\
    50       & $\mid$ & $r_1 + r_2$   \\
    50   \\
    51   \\
    51       & $\mid$ & $r^*$         \\
    52       & $\mid$ & $r^*$         \\
    55 \multicolumn{3}{c}{values}\medskip\\
    56 \multicolumn{3}{c}{values}\medskip\\
    56    $v$ & $::=$  & \\
    57    $v$ & $::=$  & \\
    57       &        & $Empty$   \\
    58       &        & $Empty$   \\
    58       & $\mid$ & $Char(c)$          \\
    59       & $\mid$ & $Char(c)$          \\
    59       & $\mid$ & $Seq(v_1,v_2)$\\
    60       & $\mid$ & $Seq(v_1,v_2)$\\
    60       & $\mid$ & $Left(v)$   \\
    61       & $\mid$ & $\Left(v)$   \\
    61       & $\mid$ & $Right(v)$  \\
    62       & $\mid$ & $Right(v)$  \\
    62       & $\mid$ & $[v_1,\ldots\,v_n]$ \\
    63       & $\mid$ & $[v_1,\ldots\,v_n]$ \\
    63 \end{tabular}
    64 \end{tabular}
    64 \end{tabular}
    65 \end{tabular}
    65 \end{center}
    66 \end{center}
    66 
    67 
    67 \noindent The reason is that there is a very strong
    68 \noindent The reason is that there is a very strong
    68 correspondence between them. There is no value for the
    69 correspondence between them. There is no value for the
    69 $\varnothing$ regular expression, since it does not match any
    70 $\ZERO$ regular expression, since it does not match any
    70 string. Otherwise there is exactly one value corresponding to
    71 string. Otherwise there is exactly one value corresponding to
    71 each regular expression with the exception of $r_1 + r_2$
    72 each regular expression with the exception of $r_1 + r_2$
    72 where there are two values, namely $Left(v)$ and $Right(v)$
    73 where there are two values, namely $\Left(v)$ and $Right(v)$
    73 corresponding to the two alternatives. Note that $r^*$ is
    74 corresponding to the two alternatives. Note that $r^*$ is
    74 associated with a list of values, one for each copy of $r$
    75 associated with a list of values, one for each copy of $r$
    75 that was needed to match the string. This means we might also
    76 that was needed to match the string. This means we might also
    76 return the empty list $[]$, if no copy was needed in case
    77 return the empty list $[]$, if no copy was needed in case
    77 of $r^*$. For sequence, there is exactly one value, composed 
    78 of $r^*$. For sequence, there is exactly one value, composed 
    78 of two component values ($v_1$ and $v_2$).
    79 of two component values ($v_1$ and $v_2$).
    79 
    80 
    80 To emphasise the connection between regular expressions and
    81 My definition of regular expressions and values in Scala is
    81 values, I have in my implementation the convention that
    82 shown below. I have in my implementation the convention that
    82 regular expressions and values have the same name, except that
       
    83 regular expressions are written entirely with upper-case
    83 regular expressions are written entirely with upper-case
    84 letters, while values just start with a single upper-case
    84 letters, while values just start with a single upper-case
    85 character and the rest are lower-case letters. My definition
    85 character and the rest are lower-case letters.
    86 of regular expressions and values in Scala is shown below. I
       
    87 use this in the REPL of Scala; when I use the Scala compiler I
       
    88 unfortunately need to rename some constructors, because Scala
       
    89 on Macs does not like classes that are called \pcode{EMPTY}
       
    90 and \pcode{Empty}.
       
    91  
    86  
    92  {\small\lstinputlisting[language=Scala,numbers=none]
    87 {\small\lstinputlisting[language=Scala,numbers=none]
    93 {../progs/app01.scala}}
    88 {../progs/app01.scala}}
    94 
    89 
    95  
    90  
    96 {\small\lstinputlisting[language=Scala,numbers=none]
    91 {\small\lstinputlisting[language=Scala,numbers=none]
    97 {../progs/app02.scala}}
    92 {../progs/app02.scala}}
   139 expression has matched the empty string. Its definition
   134 expression has matched the empty string. Its definition
   140 is as follows:
   135 is as follows:
   141 
   136 
   142 \begin{center}
   137 \begin{center}
   143 \begin{tabular}{lcl}
   138 \begin{tabular}{lcl}
   144   $mkeps(\epsilon)$       & $\dn$ & $Empty$\\
   139   $mkeps(\ONE)$       & $\dn$ & $Empty$\\
   145   $mkeps(r_1 + r_2)$      & $\dn$ & if $nullable(r_1)$  \\
   140   $mkeps(r_1 + r_2)$      & $\dn$ & if $nullable(r_1)$  \\
   146                           &       & then $Left(mkeps(r_1))$\\
   141                           &       & then $\Left(mkeps(r_1))$\\
   147                           &       & else $Right(mkeps(r_2))$\\
   142                           &       & else $Right(mkeps(r_2))$\\
   148   $mkeps(r_1 \cdot r_2)$  & $\dn$ & $Seq(mkeps(r_1),mkeps(r_2))$\\
   143   $mkeps(r_1 \cdot r_2)$  & $\dn$ & $Seq(mkeps(r_1),mkeps(r_2))$\\
   149   $mkeps(r^*)$            & $\dn$ & $[]$  \\
   144   $mkeps(r^*)$            & $\dn$ & $[]$  \\
   150 \end{tabular}
   145 \end{tabular}
   151 \end{center}
   146 \end{center}
   152 
   147 
   153 \noindent There are no cases for $\varnothing$ and $c$, since
   148 \noindent There are no cases for $\ZERO$ and $c$, since
   154 these regular expression cannot match the empty string. Note
   149 these regular expression cannot match the empty string. Note
   155 also that in case of alternatives we give preference to the
   150 also that in case of alternatives we give preference to the
   156 regular expression on the left-hand side. This will become
   151 regular expression on the left-hand side. This will become
   157 important later on.
   152 important later on.
   158 
   153 
   168 will inject the character into. The result of this function is a
   163 will inject the character into. The result of this function is a
   169 new value. The definition of $inj$ is as follows: 
   164 new value. The definition of $inj$ is as follows: 
   170 
   165 
   171 \begin{center}
   166 \begin{center}
   172 \begin{tabular}{l@{\hspace{1mm}}c@{\hspace{1mm}}l}
   167 \begin{tabular}{l@{\hspace{1mm}}c@{\hspace{1mm}}l}
   173   $inj\,(c)\,c\,Empty$            & $\dn$  & $Char\,c$\\
   168   $inj\,(c)\,c\,Empty$            & $\dn$ & $Char\,c$\\
   174   $inj\,(r_1 + r_2)\,c\,Left(v)$  & $\dn$  & $Left(inj\,r_1\,c\,v)$\\
   169   $inj\,(r_1 + r_2)\,c\,\Left(v)$ & $\dn$ & $\Left(inj\,r_1\,c\,v)$\\
   175   $inj\,(r_1 + r_2)\,c\,Right(v)$ & $\dn$  & $Right(inj\,r_2\,c\,v)$\\
   170   $inj\,(r_1 + r_2)\,c\,Right(v)$ & $\dn$ & $Right(inj\,r_2\,c\,v)$\\
   176   $inj\,(r_1 \cdot r_2)\,c\,Seq(v_1,v_2)$ & $\dn$  & $Seq(inj\,r_1\,c\,v_1,v_2)$\\
   171   $inj\,(r_1 \cdot r_2)\,c\,Seq(v_1,v_2)$ & $\dn$  & $Seq(inj\,r_1\,c\,v_1,v_2)$\\
   177   $inj\,(r_1 \cdot r_2)\,c\,Left(Seq(v_1,v_2))$ & $\dn$  & $Seq(inj\,r_1\,c\,v_1,v_2)$\\
   172   $inj\,(r_1 \cdot r_2)\,c\,\Left(Seq(v_1,v_2))$ & $\dn$  & $Seq(inj\,r_1\,c\,v_1,v_2)$\\
   178   $inj\,(r_1 \cdot r_2)\,c\,Right(v)$ & $\dn$  & $Seq(mkeps(r_1),inj\,r_2\,c\,v)$\\
   173   $inj\,(r_1 \cdot r_2)\,c\,Right(v)$ & $\dn$  & $Seq(mkeps(r_1),inj\,r_2\,c\,v)$\\
   179   $inj\,(r^*)\,c\,Seq(v,vs)$         & $\dn$  & $inj\,r\,c\,v\,::\,vs$\\
   174   $inj\,(r^*)\,c\,Seq(v,vs)$         & $\dn$  & $inj\,r\,c\,v\,::\,vs$\\
   180 \end{tabular}
   175 \end{tabular}
   181 \end{center}
   176 \end{center}
   182 
   177 
   197 the first phase are as follows:
   192 the first phase are as follows:
   198 
   193 
   199 \begin{center}
   194 \begin{center}
   200 \begin{tabular}{ll}
   195 \begin{tabular}{ll}
   201 $r_1$: & $a \cdot (b \cdot c)$\\
   196 $r_1$: & $a \cdot (b \cdot c)$\\
   202 $r_2$: & $\epsilon \cdot (b \cdot c)$\\
   197 $r_2$: & $\ONE \cdot (b \cdot c)$\\
   203 $r_3$: & $(\varnothing \cdot (b \cdot c)) + (\epsilon \cdot c)$\\
   198 $r_3$: & $(\ZERO \cdot (b \cdot c)) + (\ONE \cdot c)$\\
   204 $r_4$: & $(\varnothing \cdot (b \cdot c)) + ((\varnothing \cdot c) + \epsilon)$\\
   199 $r_4$: & $(\ZERO \cdot (b \cdot c)) + ((\ZERO \cdot c) + \ONE)$\\
   205 \end{tabular}
   200 \end{tabular}
   206 \end{center}
   201 \end{center}
   207 
   202 
   208 \noindent According to the simple algorithm, we would test
   203 \noindent According to the simple algorithm, we would test
   209 whether $r_4$ is nullable, which in this case it indeed is.
   204 whether $r_4$ is nullable, which in this case it indeed is.
   210 This means we can use the function $mkeps$ to calculate a
   205 This means we can use the function $mkeps$ to calculate a
   211 value for how $r_4$ was able to match the empty string.
   206 value for how $r_4$ was able to match the empty string.
   212 Remember that this function gives preference for alternatives
   207 Remember that this function gives preference for alternatives
   213 on the left-hand side. However there is only $\epsilon$ on the
   208 on the left-hand side. However there is only $\ONE$ on the
   214 very right-hand side of $r_4$ that matches the empty string.
   209 very right-hand side of $r_4$ that matches the empty string.
   215 Therefore $mkeps$ returns the value
   210 Therefore $mkeps$ returns the value
   216 
   211 
   217 \begin{center}
   212 \begin{center}
   218 $v_4:\;Right(Right(Empty))$
   213 $v_4:\;Right(Right(Empty))$
   219 \end{center}
   214 \end{center}
   220 
   215 
   221 \noindent If there had been a $\epsilon$ on the left, then
   216 \noindent If there had been a $\ONE$ on the left, then
   222 $mkeps$ would have returned something of the form
   217 $mkeps$ would have returned something of the form
   223 $Left(\ldots)$. The point is that from this value we can
   218 $\Left(\ldots)$. The point is that from this value we can
   224 directly read off which part of $r_4$ matched the empty
   219 directly read off which part of $r_4$ matched the empty
   225 string: take the right-alternative first, and then the
   220 string: take the right-alternative first, and then the
   226 right-alternative again. Remember $r_4$ is of the form
   221 right-alternative again. Remember $r_4$ is of the form
   227 
   222 
   228 \begin{center}
   223 \begin{center}
   229 $r_4$:\;$(\varnothing \cdot (b \cdot c)) + 
   224 $r_4$:\;$(\ZERO \cdot (b \cdot c)) + 
   230          ((\varnothing \cdot c) + \underline{\epsilon})$\\
   225          ((\ZERO \cdot c) + \underline{\ONE})$\\
   231 \end{center}
   226 \end{center}
   232 
   227 
   233 \noindent the value tells us that the underlined $\epsilon$
   228 \noindent the value tells us that the underlined $\ONE$
   234 is responsible for matching the empty string.
   229 is responsible for matching the empty string.
   235 
   230 
   236 Next we have to ``inject'' the last character, that is $c$ in
   231 Next we have to ``inject'' the last character, that is $c$ in
   237 the running example, into this value $v_4$ in order to
   232 the running example, into this value $v_4$ in order to
   238 calculate how $r_3$ could have matched the string $c$.
   233 calculate how $r_3$ could have matched the string $c$.
   241 \begin{center}
   236 \begin{center}
   242 $v_3:\;Right(Seq(Empty, Char(c)))$
   237 $v_3:\;Right(Seq(Empty, Char(c)))$
   243 \end{center}
   238 \end{center}
   244 
   239 
   245 \noindent This is the correct result, because $r_3$ needs
   240 \noindent This is the correct result, because $r_3$ needs
   246 to use the right-hand alternative, and then $\epsilon$ needs
   241 to use the right-hand alternative, and then $\ONE$ needs
   247 to match the empty string and $c$ needs to match $c$.
   242 to match the empty string and $c$ needs to match $c$.
   248 Next we need to inject back the letter $b$ into $v_3$. This
   243 Next we need to inject back the letter $b$ into $v_3$. This
   249 gives
   244 gives
   250 
   245 
   251 \begin{center}
   246 \begin{center}
   270 
   265 
   271 \begin{center}
   266 \begin{center}
   272 \begin{tabular}{lcl}
   267 \begin{tabular}{lcl}
   273   $|Empty|$     & $\dn$ & $[]$\\
   268   $|Empty|$     & $\dn$ & $[]$\\
   274   $|Char(c)|$   & $\dn$ & $[c]$\\
   269   $|Char(c)|$   & $\dn$ & $[c]$\\
   275   $|Left(v)|$   & $\dn$ & $|v|$\\
   270   $|\Left(v)|$   & $\dn$ & $|v|$\\
   276   $|Right(v)|$  & $\dn$ & $|v|$\\
   271   $|Right(v)|$  & $\dn$ & $|v|$\\
   277   $|Seq(v_1,v_2)|$& $\dn$ & $|v_1| \,@\, |v_2|$\\
   272   $|Seq(v_1,v_2)|$& $\dn$ & $|v_1| \,@\, |v_2|$\\
   278   $|[v_1,\ldots ,v_n]|$ & $\dn$ & $|v_1| \,@\ldots @\, |v_n|$\\
   273   $|[v_1,\ldots ,v_n]|$ & $\dn$ & $|v_1| \,@\ldots @\, |v_n|$\\
   279 \end{tabular}
   274 \end{tabular}
   280 \end{center}
   275 \end{center}
   313 first example consider the last derivation step in our earlier
   308 first example consider the last derivation step in our earlier
   314 example:
   309 example:
   315 
   310 
   316 \begin{center}
   311 \begin{center}
   317 $r_4 = der\,c\,r_3 = 
   312 $r_4 = der\,c\,r_3 = 
   318 (\varnothing \cdot (b \cdot c)) + ((\varnothing \cdot c) + \epsilon)$
   313 (\ZERO \cdot (b \cdot c)) + ((\ZERO \cdot c) + \ONE)$
   319 \end{center}
   314 \end{center}
   320 
   315 
   321 \noindent Simplifying this regular expression would just give
   316 \noindent Simplifying this regular expression would just give
   322 us $\epsilon$. Running $mkeps$ with this regular expression as
   317 us $\ONE$. Running $mkeps$ with this regular expression as
   323 input, however, would then provide us with $Empty$ instead of
   318 input, however, would then provide us with $Empty$ instead of
   324 $Right(Right(Empty))$ that was obtained without the
   319 $Right(Right(Empty))$ that was obtained without the
   325 simplification. The problem is we need to recreate this more
   320 simplification. The problem is we need to recreate this more
   326 complicated value, rather than just return $Empty$.
   321 complicated value, rather than just return $Empty$.
   327 
   322 
   331 and return a (rectified) value. Let us first take a look again
   326 and return a (rectified) value. Let us first take a look again
   332 at our simplification rules:
   327 at our simplification rules:
   333 
   328 
   334 \begin{center}
   329 \begin{center}
   335 \begin{tabular}{l@{\hspace{2mm}}c@{\hspace{2mm}}l}
   330 \begin{tabular}{l@{\hspace{2mm}}c@{\hspace{2mm}}l}
   336 $r \cdot \varnothing$ & $\mapsto$ & $\varnothing$\\ 
   331 $r \cdot \ZERO$ & $\mapsto$ & $\ZERO$\\ 
   337 $\varnothing \cdot r$ & $\mapsto$ & $\varnothing$\\ 
   332 $\ZERO \cdot r$ & $\mapsto$ & $\ZERO$\\ 
   338 $r \cdot \epsilon$ & $\mapsto$ & $r$\\ 
   333 $r \cdot \ONE$ & $\mapsto$ & $r$\\ 
   339 $\epsilon \cdot r$ & $\mapsto$ & $r$\\ 
   334 $\ONE \cdot r$ & $\mapsto$ & $r$\\ 
   340 $r + \varnothing$ & $\mapsto$ & $r$\\ 
   335 $r + \ZERO$ & $\mapsto$ & $r$\\ 
   341 $\varnothing + r$ & $\mapsto$ & $r$\\ 
   336 $\ZERO + r$ & $\mapsto$ & $r$\\ 
   342 $r + r$ & $\mapsto$ & $r$\\ 
   337 $r + r$ & $\mapsto$ & $r$\\ 
   343 \end{tabular}
   338 \end{tabular}
   344 \end{center}
   339 \end{center}
   345 
   340 
   346 \noindent Applying them to $r_4$ will require several nested
   341 \noindent Applying them to $r_4$ will require several nested
   347 simplifications in order end up with just $\epsilon$. However,
   342 simplifications in order end up with just $\ONE$. However,
   348 it is possible to apply them in a depth-first, or inside-out,
   343 it is possible to apply them in a depth-first, or inside-out,
   349 manner in order to calculate this simplified regular
   344 manner in order to calculate this simplified regular
   350 expression.
   345 expression.
   351 
   346 
   352 The rectification we can implement by letting simp return not
   347 The rectification we can implement by letting simp return not
   356 the component regular expressions $r_1$ and $r_2.$ This will
   351 the component regular expressions $r_1$ and $r_2.$ This will
   357 return simplified versions, say $r_{1s}$ and $r_{2s}$, of the
   352 return simplified versions, say $r_{1s}$ and $r_{2s}$, of the
   358 component regular expressions (if they can be simplified) but
   353 component regular expressions (if they can be simplified) but
   359 also two rectification functions $f_{1s}$ and $f_{2s}$. We
   354 also two rectification functions $f_{1s}$ and $f_{2s}$. We
   360 need to assemble them in order to obtain a rectified value for
   355 need to assemble them in order to obtain a rectified value for
   361 $r_1 + r_2$. In case $r_{1s}$ simplified to $\varnothing$, we
   356 $r_1 + r_2$. In case $r_{1s}$ simplified to $\ZERO$, we
   362 continue the derivative calculation with $r_{2s}$. The
   357 continue the derivative calculation with $r_{2s}$. The
   363 Sulzmann \& Lu algorithm would return a corresponding value,
   358 Sulzmann \& Lu algorithm would return a corresponding value,
   364 say $v_{2s}$. But now this value needs to be ``rectified'' to
   359 say $v_{2s}$. But now this value needs to be ``rectified'' to
   365 the value 
   360 the value 
   366 
   361 
   406 \begin{tabular}{l}
   401 \begin{tabular}{l}
   407 $simp(r)$:\\
   402 $simp(r)$:\\
   408 \quad case $r = r_1 + r_2$\\
   403 \quad case $r = r_1 + r_2$\\
   409 \qquad let $(r_{1s}, f_{1s}) = simp(r_1)$\\
   404 \qquad let $(r_{1s}, f_{1s}) = simp(r_1)$\\
   410 \qquad \phantom{let} $(r_{2s}, f_{2s}) = simp(r_2)$\smallskip\\
   405 \qquad \phantom{let} $(r_{2s}, f_{2s}) = simp(r_2)$\smallskip\\
   411 \qquad case $r_{1s} = \varnothing$: 
   406 \qquad case $r_{1s} = \ZERO$: 
   412        return $(r_{2s}, \lambda v. \,Right(f_{2s}(v)))$\\
   407        return $(r_{2s}, \lambda v. \,Right(f_{2s}(v)))$\\
   413 \qquad case $r_{2s} = \varnothing$: 
   408 \qquad case $r_{2s} = \ZERO$: 
   414        return $(r_{1s}, \lambda v. \,Left(f_{1s}(v)))$\\
   409        return $(r_{1s}, \lambda v. \,\Left(f_{1s}(v)))$\\
   415 \qquad case $r_{1s} = r_{2s}$:
   410 \qquad case $r_{1s} = r_{2s}$:
   416        return $(r_{1s}, \lambda v. \,Left(f_{1s}(v)))$\\
   411        return $(r_{1s}, \lambda v. \,\Left(f_{1s}(v)))$\\
   417 \qquad otherwise: 
   412 \qquad otherwise: 
   418        return $(r_{1s} + r_{2s}, f_{alt}(f_{1s}, f_{2s}))$\\
   413        return $(r_{1s} + r_{2s}, f_{alt}(f_{1s}, f_{2s}))$\\
   419 \end{tabular}
   414 \end{tabular}
   420 \end{center}
   415 \end{center}
   421 
   416 
   422 \noindent We first recursively call the simplification with
   417 \noindent We first recursively call the simplification with
   423 $r_1$ and $r_2$. This gives simplified regular expressions,
   418 $r_1$ and $r_2$. This gives simplified regular expressions,
   424 $r_{1s}$ and $r_{2s}$, as well as two rectification functions
   419 $r_{1s}$ and $r_{2s}$, as well as two rectification functions
   425 $f_{1s}$ and $f_{2s}$. We next need to test whether the
   420 $f_{1s}$ and $f_{2s}$. We next need to test whether the
   426 simplified regular expressions are $\varnothing$ so as to make
   421 simplified regular expressions are $\ZERO$ so as to make
   427 further simplifications. In case $r_{1s}$ is $\varnothing$,
   422 further simplifications. In case $r_{1s}$ is $\ZERO$,
   428 then we can return $r_{2s}$ (the other alternative). However
   423 then we can return $r_{2s}$ (the other alternative). However
   429 for this we need to build a corresponding rectification 
   424 for this we need to build a corresponding rectification 
   430 function, which as said above is
   425 function, which as said above is
   431 
   426 
   432 \begin{center}
   427 \begin{center}
   433 $\lambda v.\,Right(f_{2s}(v))$
   428 $\lambda v.\,Right(f_{2s}(v))$
   434 \end{center}
   429 \end{center}
   435 
   430 
   436 \noindent The case where $r_{2s} = \varnothing$ is similar:
   431 \noindent The case where $r_{2s} = \ZERO$ is similar:
   437 We return $r_{1s}$ and rectify with $Left(\_)$ and the
   432 We return $r_{1s}$ and rectify with $\Left(\_)$ and the
   438 other calculated rectification function $f_{1s}$. This gives
   433 other calculated rectification function $f_{1s}$. This gives
   439 
   434 
   440 \begin{center}
   435 \begin{center}
   441 $\lambda v.\,Left(f_{1s}(v))$
   436 $\lambda v.\,\Left(f_{1s}(v))$
   442 \end{center}
   437 \end{center}
   443 
   438 
   444 \noindent The next case where $r_{1s} = r_{2s}$ can be treated
   439 \noindent The next case where $r_{1s} = r_{2s}$ can be treated
   445 like the one where $r_{2s} = \varnothing$. We return $r_{1s}$
   440 like the one where $r_{2s} = \ZERO$. We return $r_{1s}$
   446 and rectify with $Left(\_)$ and so on.
   441 and rectify with $\Left(\_)$ and so on.
   447 
   442 
   448 
   443 
   449 The otherwise-case is slightly more complicated. In this case
   444 The otherwise-case is slightly more complicated. In this case
   450 neither $r_{1s}$ nor $r_{2s}$ are $\varnothing$ and also
   445 neither $r_{1s}$ nor $r_{2s}$ are $\ZERO$ and also
   451 $r_{1s} \not= r_{2s}$, which means no further simplification
   446 $r_{1s} \not= r_{2s}$, which means no further simplification
   452 can be applied. Accordingly, we return $r_{1s} + r_{2s}$ as
   447 can be applied. Accordingly, we return $r_{1s} + r_{2s}$ as
   453 the simplified regular expression. In principle we also do not
   448 the simplified regular expression. In principle we also do not
   454 have to do any rectification, because no simplification was
   449 have to do any rectification, because no simplification was
   455 done in this case. But this is actually not true: There might
   450 done in this case. But this is actually not true: There might
   456 have been simplifications inside $r_{1s}$ and $r_{2s}$. We
   451 have been simplifications inside $r_{1s}$ and $r_{2s}$. We
   457 therefore need to take into account the calculated
   452 therefore need to take into account the calculated
   458 rectification functions $f_{1s}$ and $f_{2s}$. We can do this
   453 rectification functions $f_{1s}$ and $f_{2s}$. We can do this
   459 by defining a rectification function $f_{alt}$ which takes two
   454 by defining a rectification function $f_{alt}$ which takes two
   460 rectification functions as arguments and applies them
   455 rectification functions as arguments and applies them
   461 according to whether the value is of the form $Left(\_)$ or
   456 according to whether the value is of the form $\Left(\_)$ or
   462 $Right(\_)$:
   457 $Right(\_)$:
   463 
   458 
   464 \begin{center}
   459 \begin{center}
   465 \begin{tabular}{l@{\hspace{1mm}}l}
   460 \begin{tabular}{l@{\hspace{1mm}}l}
   466 $f_{alt}(f_1, f_2) \dn$\\
   461 $f_{alt}(f_1, f_2) \dn$\\
   467 \qquad $\lambda v.\,$ case $v = Left(v')$: 
   462 \qquad $\lambda v.\,$ case $v = \Left(v')$: 
   468       & return $Left(f_1(v'))$\\
   463       & return $\Left(f_1(v'))$\\
   469 \qquad \phantom{$\lambda v.\,$} case $v = Right(v')$: 
   464 \qquad \phantom{$\lambda v.\,$} case $v = Right(v')$: 
   470       & return $Right(f_2(v'))$\\      
   465       & return $Right(f_2(v'))$\\      
   471 \end{tabular}
   466 \end{tabular}
   472 \end{center}
   467 \end{center}
   473 
   468 
   479 \begin{tabular}{l}
   474 \begin{tabular}{l}
   480 $simp(r)$:\qquad\qquad (continued)\\
   475 $simp(r)$:\qquad\qquad (continued)\\
   481 \quad case $r = r_1 \cdot r_2$\\
   476 \quad case $r = r_1 \cdot r_2$\\
   482 \qquad let $(r_{1s}, f_{1s}) = simp(r_1)$\\
   477 \qquad let $(r_{1s}, f_{1s}) = simp(r_1)$\\
   483 \qquad \phantom{let} $(r_{2s}, f_{2s}) = simp(r_2)$\smallskip\\
   478 \qquad \phantom{let} $(r_{2s}, f_{2s}) = simp(r_2)$\smallskip\\
   484 \qquad case $r_{1s} = \varnothing$: 
   479 \qquad case $r_{1s} = \ZERO$: 
   485        return $(\varnothing, f_{error})$\\
   480        return $(\ZERO, f_{error})$\\
   486 \qquad case $r_{2s} = \varnothing$: 
   481 \qquad case $r_{2s} = \ZERO$: 
   487        return $(\varnothing, f_{error})$\\
   482        return $(\ZERO, f_{error})$\\
   488 \qquad case $r_{1s} = \epsilon$: 
   483 \qquad case $r_{1s} = \ONE$: 
   489 return $(r_{2s}, \lambda v. \,Seq(f_{1s}(Empty), f_{2s}(v)))$\\
   484 return $(r_{2s}, \lambda v. \,Seq(f_{1s}(Empty), f_{2s}(v)))$\\
   490 \qquad case $r_{2s} = \epsilon$: 
   485 \qquad case $r_{2s} = \ONE$: 
   491 return $(r_{1s}, \lambda v. \,Seq(f_{1s}(v), f_{2s}(Empty)))$\\
   486 return $(r_{1s}, \lambda v. \,Seq(f_{1s}(v), f_{2s}(Empty)))$\\
   492 \qquad otherwise: 
   487 \qquad otherwise: 
   493        return $(r_{1s} \cdot r_{2s}, f_{seq}(f_{1s}, f_{2s}))$\\
   488        return $(r_{1s} \cdot r_{2s}, f_{seq}(f_{1s}, f_{2s}))$\\
   494 \end{tabular}
   489 \end{tabular}
   495 \end{center}
   490 \end{center}
   504 \qquad $\lambda v.\,$ case $v = Seq(v_1, v_2)$: 
   499 \qquad $\lambda v.\,$ case $v = Seq(v_1, v_2)$: 
   505       & return $Seq(f_1(v_1), f_2(v_2))$\\
   500       & return $Seq(f_1(v_1), f_2(v_2))$\\
   506 \end{tabular}
   501 \end{tabular}
   507 \end{center}
   502 \end{center}
   508 
   503 
   509 \noindent Note that in the case of $r_{1s} = \varnothing$
   504 \noindent Note that in the case of $r_{1s} = \ZERO$
   510 (similarly $r_{2s}$) we use the function $f_{error}$ for
   505 (similarly $r_{2s}$) we use the function $f_{error}$ for
   511 rectification. If you think carefully, then you will realise
   506 rectification. If you think carefully, then you will realise
   512 that this function will actually never been called. This is
   507 that this function will actually never been called. This is
   513 because a sequence with $\varnothing$ will never recognise any
   508 because a sequence with $\ZERO$ will never recognise any
   514 string and therefore the second phase of the algorithm would
   509 string and therefore the second phase of the algorithm would
   515 never been called. The simplification function still expects
   510 never been called. The simplification function still expects
   516 us to give a function. So in my own implementation I just
   511 us to give a function. So in my own implementation I just
   517 returned a function that raises an error. In the case
   512 returned a function that raises an error. In the case
   518 where $r_{1s} = \epsilon$ (similarly $r_{2s}$) we have
   513 where $r_{1s} = \ONE$ (similarly $r_{2s}$) we have
   519 to create a sequence where the first component is a rectified
   514 to create a sequence where the first component is a rectified
   520 version of $Empty$. Therefore we call $f_{1s}$ with $Empty$.
   515 version of $Empty$. Therefore we call $f_{1s}$ with $Empty$.
   521 
   516 
   522 Since we only simplify regular expressions of the form $r_1 +
   517 Since we only simplify regular expressions of the form $r_1 +
   523 r_2$ and $r_1 \cdot r_2$ we do not have to do anything else
   518 r_2$ and $r_1 \cdot r_2$ we do not have to do anything else
   539 \begin{tabular}{l}
   534 \begin{tabular}{l}
   540 $simp(r)$:\\
   535 $simp(r)$:\\
   541 \quad case $r = r_1 + r_2$\\
   536 \quad case $r = r_1 + r_2$\\
   542 \qquad let $(r_{1s}, f_{1s}) = simp(r_1)$\\
   537 \qquad let $(r_{1s}, f_{1s}) = simp(r_1)$\\
   543 \qquad \phantom{let} $(r_{2s}, f_{2s}) = simp(r_2)$\smallskip\\
   538 \qquad \phantom{let} $(r_{2s}, f_{2s}) = simp(r_2)$\smallskip\\
   544 \qquad case $r_{1s} = \varnothing$: 
   539 \qquad case $r_{1s} = \ZERO$: 
   545        return $(r_{2s}, \lambda v. \,Right(f_{2s}(v)))$\\
   540        return $(r_{2s}, \lambda v. \,Right(f_{2s}(v)))$\\
   546 \qquad case $r_{2s} = \varnothing$: 
   541 \qquad case $r_{2s} = \ZERO$: 
   547        return $(r_{1s}, \lambda v. \,Left(f_{1s}(v)))$\\
   542        return $(r_{1s}, \lambda v. \,\Left(f_{1s}(v)))$\\
   548 \qquad case $r_{1s} = r_{2s}$:
   543 \qquad case $r_{1s} = r_{2s}$:
   549        return $(r_{1s}, \lambda v. \,Left(f_{1s}(v)))$\\
   544        return $(r_{1s}, \lambda v. \,\Left(f_{1s}(v)))$\\
   550 \qquad otherwise: 
   545 \qquad otherwise: 
   551        return $(r_{1s} + r_{2s}, f_{alt}(f_{1s}, f_{2s}))$
   546        return $(r_{1s} + r_{2s}, f_{alt}(f_{1s}, f_{2s}))$
   552        \medskip\\
   547        \medskip\\
   553 
   548 
   554 \quad case $r = r_1 \cdot r_2$\\
   549 \quad case $r = r_1 \cdot r_2$\\
   555 \qquad let $(r_{1s}, f_{1s}) = simp(r_1)$\\
   550 \qquad let $(r_{1s}, f_{1s}) = simp(r_1)$\\
   556 \qquad \phantom{let} $(r_{2s}, f_{2s}) = simp(r_2)$\smallskip\\
   551 \qquad \phantom{let} $(r_{2s}, f_{2s}) = simp(r_2)$\smallskip\\
   557 \qquad case $r_{1s} = \varnothing$: 
   552 \qquad case $r_{1s} = \ZERO$: 
   558        return $(\varnothing, f_{error})$\\
   553        return $(\ZERO, f_{error})$\\
   559 \qquad case $r_{2s} = \varnothing$: 
   554 \qquad case $r_{2s} = \ZERO$: 
   560        return $(\varnothing, f_{error})$\\
   555        return $(\ZERO, f_{error})$\\
   561 \qquad case $r_{1s} = \epsilon$: 
   556 \qquad case $r_{1s} = \ONE$: 
   562 return $(r_{2s}, \lambda v. \,Seq(f_{1s}(Empty), f_{2s}(v)))$\\
   557 return $(r_{2s}, \lambda v. \,Seq(f_{1s}(Empty), f_{2s}(v)))$\\
   563 \qquad case $r_{2s} = \epsilon$: 
   558 \qquad case $r_{2s} = \ONE$: 
   564 return $(r_{1s}, \lambda v. \,Seq(f_{1s}(v), f_{2s}(Empty)))$\\
   559 return $(r_{1s}, \lambda v. \,Seq(f_{1s}(v), f_{2s}(Empty)))$\\
   565 \qquad otherwise: 
   560 \qquad otherwise: 
   566        return $(r_{1s} \cdot r_{2s}, f_{seq}(f_{1s}, f_{2s}))$
   561        return $(r_{1s} \cdot r_{2s}, f_{seq}(f_{1s}, f_{2s}))$
   567        \medskip\\
   562        \medskip\\
   568 
   563 
   653 
   648 
   654 \begin{center}
   649 \begin{center}
   655 \begin{tabular}{lcl}
   650 \begin{tabular}{lcl}
   656   $env(Empty)$     & $\dn$ & $[]$\\
   651   $env(Empty)$     & $\dn$ & $[]$\\
   657   $env(Char(c))$   & $\dn$ & $[]$\\
   652   $env(Char(c))$   & $\dn$ & $[]$\\
   658   $env(Left(v))$   & $\dn$ & $env(v)$\\
   653   $env(\Left(v))$   & $\dn$ & $env(v)$\\
   659   $env(Right(v))$  & $\dn$ & $env(v)$\\
   654   $env(Right(v))$  & $\dn$ & $env(v)$\\
   660   $env(Seq(v_1,v_2))$& $\dn$ & $env(v_1) \,@\, env(v_2)$\\
   655   $env(Seq(v_1,v_2))$& $\dn$ & $env(v_1) \,@\, env(v_2)$\\
   661   $env([v_1,\ldots ,v_n])$ & $\dn$ & 
   656   $env([v_1,\ldots ,v_n])$ & $\dn$ & 
   662      $env(v_1) \,@\ldots @\, env(v_n)$\\
   657      $env(v_1) \,@\ldots @\, env(v_n)$\\
   663   $env(Rec(x:v))$ & $\dn$ & $(x:|v|) :: env(v)$\\
   658   $env(Rec(x:v))$ & $\dn$ & $(x:|v|) :: env(v)$\\