55 \multicolumn{3}{c}{values}\medskip\\ |
56 \multicolumn{3}{c}{values}\medskip\\ |
56 $v$ & $::=$ & \\ |
57 $v$ & $::=$ & \\ |
57 & & $Empty$ \\ |
58 & & $Empty$ \\ |
58 & $\mid$ & $Char(c)$ \\ |
59 & $\mid$ & $Char(c)$ \\ |
59 & $\mid$ & $Seq(v_1,v_2)$\\ |
60 & $\mid$ & $Seq(v_1,v_2)$\\ |
60 & $\mid$ & $Left(v)$ \\ |
61 & $\mid$ & $\Left(v)$ \\ |
61 & $\mid$ & $Right(v)$ \\ |
62 & $\mid$ & $Right(v)$ \\ |
62 & $\mid$ & $[v_1,\ldots\,v_n]$ \\ |
63 & $\mid$ & $[v_1,\ldots\,v_n]$ \\ |
63 \end{tabular} |
64 \end{tabular} |
64 \end{tabular} |
65 \end{tabular} |
65 \end{center} |
66 \end{center} |
66 |
67 |
67 \noindent The reason is that there is a very strong |
68 \noindent The reason is that there is a very strong |
68 correspondence between them. There is no value for the |
69 correspondence between them. There is no value for the |
69 $\varnothing$ regular expression, since it does not match any |
70 $\ZERO$ regular expression, since it does not match any |
70 string. Otherwise there is exactly one value corresponding to |
71 string. Otherwise there is exactly one value corresponding to |
71 each regular expression with the exception of $r_1 + r_2$ |
72 each regular expression with the exception of $r_1 + r_2$ |
72 where there are two values, namely $Left(v)$ and $Right(v)$ |
73 where there are two values, namely $\Left(v)$ and $Right(v)$ |
73 corresponding to the two alternatives. Note that $r^*$ is |
74 corresponding to the two alternatives. Note that $r^*$ is |
74 associated with a list of values, one for each copy of $r$ |
75 associated with a list of values, one for each copy of $r$ |
75 that was needed to match the string. This means we might also |
76 that was needed to match the string. This means we might also |
76 return the empty list $[]$, if no copy was needed in case |
77 return the empty list $[]$, if no copy was needed in case |
77 of $r^*$. For sequence, there is exactly one value, composed |
78 of $r^*$. For sequence, there is exactly one value, composed |
78 of two component values ($v_1$ and $v_2$). |
79 of two component values ($v_1$ and $v_2$). |
79 |
80 |
80 To emphasise the connection between regular expressions and |
81 My definition of regular expressions and values in Scala is |
81 values, I have in my implementation the convention that |
82 shown below. I have in my implementation the convention that |
82 regular expressions and values have the same name, except that |
|
83 regular expressions are written entirely with upper-case |
83 regular expressions are written entirely with upper-case |
84 letters, while values just start with a single upper-case |
84 letters, while values just start with a single upper-case |
85 character and the rest are lower-case letters. My definition |
85 character and the rest are lower-case letters. |
86 of regular expressions and values in Scala is shown below. I |
|
87 use this in the REPL of Scala; when I use the Scala compiler I |
|
88 unfortunately need to rename some constructors, because Scala |
|
89 on Macs does not like classes that are called \pcode{EMPTY} |
|
90 and \pcode{Empty}. |
|
91 |
86 |
92 {\small\lstinputlisting[language=Scala,numbers=none] |
87 {\small\lstinputlisting[language=Scala,numbers=none] |
93 {../progs/app01.scala}} |
88 {../progs/app01.scala}} |
94 |
89 |
95 |
90 |
96 {\small\lstinputlisting[language=Scala,numbers=none] |
91 {\small\lstinputlisting[language=Scala,numbers=none] |
97 {../progs/app02.scala}} |
92 {../progs/app02.scala}} |
139 expression has matched the empty string. Its definition |
134 expression has matched the empty string. Its definition |
140 is as follows: |
135 is as follows: |
141 |
136 |
142 \begin{center} |
137 \begin{center} |
143 \begin{tabular}{lcl} |
138 \begin{tabular}{lcl} |
144 $mkeps(\epsilon)$ & $\dn$ & $Empty$\\ |
139 $mkeps(\ONE)$ & $\dn$ & $Empty$\\ |
145 $mkeps(r_1 + r_2)$ & $\dn$ & if $nullable(r_1)$ \\ |
140 $mkeps(r_1 + r_2)$ & $\dn$ & if $nullable(r_1)$ \\ |
146 & & then $Left(mkeps(r_1))$\\ |
141 & & then $\Left(mkeps(r_1))$\\ |
147 & & else $Right(mkeps(r_2))$\\ |
142 & & else $Right(mkeps(r_2))$\\ |
148 $mkeps(r_1 \cdot r_2)$ & $\dn$ & $Seq(mkeps(r_1),mkeps(r_2))$\\ |
143 $mkeps(r_1 \cdot r_2)$ & $\dn$ & $Seq(mkeps(r_1),mkeps(r_2))$\\ |
149 $mkeps(r^*)$ & $\dn$ & $[]$ \\ |
144 $mkeps(r^*)$ & $\dn$ & $[]$ \\ |
150 \end{tabular} |
145 \end{tabular} |
151 \end{center} |
146 \end{center} |
152 |
147 |
153 \noindent There are no cases for $\varnothing$ and $c$, since |
148 \noindent There are no cases for $\ZERO$ and $c$, since |
154 these regular expression cannot match the empty string. Note |
149 these regular expression cannot match the empty string. Note |
155 also that in case of alternatives we give preference to the |
150 also that in case of alternatives we give preference to the |
156 regular expression on the left-hand side. This will become |
151 regular expression on the left-hand side. This will become |
157 important later on. |
152 important later on. |
158 |
153 |
168 will inject the character into. The result of this function is a |
163 will inject the character into. The result of this function is a |
169 new value. The definition of $inj$ is as follows: |
164 new value. The definition of $inj$ is as follows: |
170 |
165 |
171 \begin{center} |
166 \begin{center} |
172 \begin{tabular}{l@{\hspace{1mm}}c@{\hspace{1mm}}l} |
167 \begin{tabular}{l@{\hspace{1mm}}c@{\hspace{1mm}}l} |
173 $inj\,(c)\,c\,Empty$ & $\dn$ & $Char\,c$\\ |
168 $inj\,(c)\,c\,Empty$ & $\dn$ & $Char\,c$\\ |
174 $inj\,(r_1 + r_2)\,c\,Left(v)$ & $\dn$ & $Left(inj\,r_1\,c\,v)$\\ |
169 $inj\,(r_1 + r_2)\,c\,\Left(v)$ & $\dn$ & $\Left(inj\,r_1\,c\,v)$\\ |
175 $inj\,(r_1 + r_2)\,c\,Right(v)$ & $\dn$ & $Right(inj\,r_2\,c\,v)$\\ |
170 $inj\,(r_1 + r_2)\,c\,Right(v)$ & $\dn$ & $Right(inj\,r_2\,c\,v)$\\ |
176 $inj\,(r_1 \cdot r_2)\,c\,Seq(v_1,v_2)$ & $\dn$ & $Seq(inj\,r_1\,c\,v_1,v_2)$\\ |
171 $inj\,(r_1 \cdot r_2)\,c\,Seq(v_1,v_2)$ & $\dn$ & $Seq(inj\,r_1\,c\,v_1,v_2)$\\ |
177 $inj\,(r_1 \cdot r_2)\,c\,Left(Seq(v_1,v_2))$ & $\dn$ & $Seq(inj\,r_1\,c\,v_1,v_2)$\\ |
172 $inj\,(r_1 \cdot r_2)\,c\,\Left(Seq(v_1,v_2))$ & $\dn$ & $Seq(inj\,r_1\,c\,v_1,v_2)$\\ |
178 $inj\,(r_1 \cdot r_2)\,c\,Right(v)$ & $\dn$ & $Seq(mkeps(r_1),inj\,r_2\,c\,v)$\\ |
173 $inj\,(r_1 \cdot r_2)\,c\,Right(v)$ & $\dn$ & $Seq(mkeps(r_1),inj\,r_2\,c\,v)$\\ |
179 $inj\,(r^*)\,c\,Seq(v,vs)$ & $\dn$ & $inj\,r\,c\,v\,::\,vs$\\ |
174 $inj\,(r^*)\,c\,Seq(v,vs)$ & $\dn$ & $inj\,r\,c\,v\,::\,vs$\\ |
180 \end{tabular} |
175 \end{tabular} |
181 \end{center} |
176 \end{center} |
182 |
177 |
197 the first phase are as follows: |
192 the first phase are as follows: |
198 |
193 |
199 \begin{center} |
194 \begin{center} |
200 \begin{tabular}{ll} |
195 \begin{tabular}{ll} |
201 $r_1$: & $a \cdot (b \cdot c)$\\ |
196 $r_1$: & $a \cdot (b \cdot c)$\\ |
202 $r_2$: & $\epsilon \cdot (b \cdot c)$\\ |
197 $r_2$: & $\ONE \cdot (b \cdot c)$\\ |
203 $r_3$: & $(\varnothing \cdot (b \cdot c)) + (\epsilon \cdot c)$\\ |
198 $r_3$: & $(\ZERO \cdot (b \cdot c)) + (\ONE \cdot c)$\\ |
204 $r_4$: & $(\varnothing \cdot (b \cdot c)) + ((\varnothing \cdot c) + \epsilon)$\\ |
199 $r_4$: & $(\ZERO \cdot (b \cdot c)) + ((\ZERO \cdot c) + \ONE)$\\ |
205 \end{tabular} |
200 \end{tabular} |
206 \end{center} |
201 \end{center} |
207 |
202 |
208 \noindent According to the simple algorithm, we would test |
203 \noindent According to the simple algorithm, we would test |
209 whether $r_4$ is nullable, which in this case it indeed is. |
204 whether $r_4$ is nullable, which in this case it indeed is. |
210 This means we can use the function $mkeps$ to calculate a |
205 This means we can use the function $mkeps$ to calculate a |
211 value for how $r_4$ was able to match the empty string. |
206 value for how $r_4$ was able to match the empty string. |
212 Remember that this function gives preference for alternatives |
207 Remember that this function gives preference for alternatives |
213 on the left-hand side. However there is only $\epsilon$ on the |
208 on the left-hand side. However there is only $\ONE$ on the |
214 very right-hand side of $r_4$ that matches the empty string. |
209 very right-hand side of $r_4$ that matches the empty string. |
215 Therefore $mkeps$ returns the value |
210 Therefore $mkeps$ returns the value |
216 |
211 |
217 \begin{center} |
212 \begin{center} |
218 $v_4:\;Right(Right(Empty))$ |
213 $v_4:\;Right(Right(Empty))$ |
219 \end{center} |
214 \end{center} |
220 |
215 |
221 \noindent If there had been a $\epsilon$ on the left, then |
216 \noindent If there had been a $\ONE$ on the left, then |
222 $mkeps$ would have returned something of the form |
217 $mkeps$ would have returned something of the form |
223 $Left(\ldots)$. The point is that from this value we can |
218 $\Left(\ldots)$. The point is that from this value we can |
224 directly read off which part of $r_4$ matched the empty |
219 directly read off which part of $r_4$ matched the empty |
225 string: take the right-alternative first, and then the |
220 string: take the right-alternative first, and then the |
226 right-alternative again. Remember $r_4$ is of the form |
221 right-alternative again. Remember $r_4$ is of the form |
227 |
222 |
228 \begin{center} |
223 \begin{center} |
229 $r_4$:\;$(\varnothing \cdot (b \cdot c)) + |
224 $r_4$:\;$(\ZERO \cdot (b \cdot c)) + |
230 ((\varnothing \cdot c) + \underline{\epsilon})$\\ |
225 ((\ZERO \cdot c) + \underline{\ONE})$\\ |
231 \end{center} |
226 \end{center} |
232 |
227 |
233 \noindent the value tells us that the underlined $\epsilon$ |
228 \noindent the value tells us that the underlined $\ONE$ |
234 is responsible for matching the empty string. |
229 is responsible for matching the empty string. |
235 |
230 |
236 Next we have to ``inject'' the last character, that is $c$ in |
231 Next we have to ``inject'' the last character, that is $c$ in |
237 the running example, into this value $v_4$ in order to |
232 the running example, into this value $v_4$ in order to |
238 calculate how $r_3$ could have matched the string $c$. |
233 calculate how $r_3$ could have matched the string $c$. |
331 and return a (rectified) value. Let us first take a look again |
326 and return a (rectified) value. Let us first take a look again |
332 at our simplification rules: |
327 at our simplification rules: |
333 |
328 |
334 \begin{center} |
329 \begin{center} |
335 \begin{tabular}{l@{\hspace{2mm}}c@{\hspace{2mm}}l} |
330 \begin{tabular}{l@{\hspace{2mm}}c@{\hspace{2mm}}l} |
336 $r \cdot \varnothing$ & $\mapsto$ & $\varnothing$\\ |
331 $r \cdot \ZERO$ & $\mapsto$ & $\ZERO$\\ |
337 $\varnothing \cdot r$ & $\mapsto$ & $\varnothing$\\ |
332 $\ZERO \cdot r$ & $\mapsto$ & $\ZERO$\\ |
338 $r \cdot \epsilon$ & $\mapsto$ & $r$\\ |
333 $r \cdot \ONE$ & $\mapsto$ & $r$\\ |
339 $\epsilon \cdot r$ & $\mapsto$ & $r$\\ |
334 $\ONE \cdot r$ & $\mapsto$ & $r$\\ |
340 $r + \varnothing$ & $\mapsto$ & $r$\\ |
335 $r + \ZERO$ & $\mapsto$ & $r$\\ |
341 $\varnothing + r$ & $\mapsto$ & $r$\\ |
336 $\ZERO + r$ & $\mapsto$ & $r$\\ |
342 $r + r$ & $\mapsto$ & $r$\\ |
337 $r + r$ & $\mapsto$ & $r$\\ |
343 \end{tabular} |
338 \end{tabular} |
344 \end{center} |
339 \end{center} |
345 |
340 |
346 \noindent Applying them to $r_4$ will require several nested |
341 \noindent Applying them to $r_4$ will require several nested |
347 simplifications in order end up with just $\epsilon$. However, |
342 simplifications in order end up with just $\ONE$. However, |
348 it is possible to apply them in a depth-first, or inside-out, |
343 it is possible to apply them in a depth-first, or inside-out, |
349 manner in order to calculate this simplified regular |
344 manner in order to calculate this simplified regular |
350 expression. |
345 expression. |
351 |
346 |
352 The rectification we can implement by letting simp return not |
347 The rectification we can implement by letting simp return not |
356 the component regular expressions $r_1$ and $r_2.$ This will |
351 the component regular expressions $r_1$ and $r_2.$ This will |
357 return simplified versions, say $r_{1s}$ and $r_{2s}$, of the |
352 return simplified versions, say $r_{1s}$ and $r_{2s}$, of the |
358 component regular expressions (if they can be simplified) but |
353 component regular expressions (if they can be simplified) but |
359 also two rectification functions $f_{1s}$ and $f_{2s}$. We |
354 also two rectification functions $f_{1s}$ and $f_{2s}$. We |
360 need to assemble them in order to obtain a rectified value for |
355 need to assemble them in order to obtain a rectified value for |
361 $r_1 + r_2$. In case $r_{1s}$ simplified to $\varnothing$, we |
356 $r_1 + r_2$. In case $r_{1s}$ simplified to $\ZERO$, we |
362 continue the derivative calculation with $r_{2s}$. The |
357 continue the derivative calculation with $r_{2s}$. The |
363 Sulzmann \& Lu algorithm would return a corresponding value, |
358 Sulzmann \& Lu algorithm would return a corresponding value, |
364 say $v_{2s}$. But now this value needs to be ``rectified'' to |
359 say $v_{2s}$. But now this value needs to be ``rectified'' to |
365 the value |
360 the value |
366 |
361 |
406 \begin{tabular}{l} |
401 \begin{tabular}{l} |
407 $simp(r)$:\\ |
402 $simp(r)$:\\ |
408 \quad case $r = r_1 + r_2$\\ |
403 \quad case $r = r_1 + r_2$\\ |
409 \qquad let $(r_{1s}, f_{1s}) = simp(r_1)$\\ |
404 \qquad let $(r_{1s}, f_{1s}) = simp(r_1)$\\ |
410 \qquad \phantom{let} $(r_{2s}, f_{2s}) = simp(r_2)$\smallskip\\ |
405 \qquad \phantom{let} $(r_{2s}, f_{2s}) = simp(r_2)$\smallskip\\ |
411 \qquad case $r_{1s} = \varnothing$: |
406 \qquad case $r_{1s} = \ZERO$: |
412 return $(r_{2s}, \lambda v. \,Right(f_{2s}(v)))$\\ |
407 return $(r_{2s}, \lambda v. \,Right(f_{2s}(v)))$\\ |
413 \qquad case $r_{2s} = \varnothing$: |
408 \qquad case $r_{2s} = \ZERO$: |
414 return $(r_{1s}, \lambda v. \,Left(f_{1s}(v)))$\\ |
409 return $(r_{1s}, \lambda v. \,\Left(f_{1s}(v)))$\\ |
415 \qquad case $r_{1s} = r_{2s}$: |
410 \qquad case $r_{1s} = r_{2s}$: |
416 return $(r_{1s}, \lambda v. \,Left(f_{1s}(v)))$\\ |
411 return $(r_{1s}, \lambda v. \,\Left(f_{1s}(v)))$\\ |
417 \qquad otherwise: |
412 \qquad otherwise: |
418 return $(r_{1s} + r_{2s}, f_{alt}(f_{1s}, f_{2s}))$\\ |
413 return $(r_{1s} + r_{2s}, f_{alt}(f_{1s}, f_{2s}))$\\ |
419 \end{tabular} |
414 \end{tabular} |
420 \end{center} |
415 \end{center} |
421 |
416 |
422 \noindent We first recursively call the simplification with |
417 \noindent We first recursively call the simplification with |
423 $r_1$ and $r_2$. This gives simplified regular expressions, |
418 $r_1$ and $r_2$. This gives simplified regular expressions, |
424 $r_{1s}$ and $r_{2s}$, as well as two rectification functions |
419 $r_{1s}$ and $r_{2s}$, as well as two rectification functions |
425 $f_{1s}$ and $f_{2s}$. We next need to test whether the |
420 $f_{1s}$ and $f_{2s}$. We next need to test whether the |
426 simplified regular expressions are $\varnothing$ so as to make |
421 simplified regular expressions are $\ZERO$ so as to make |
427 further simplifications. In case $r_{1s}$ is $\varnothing$, |
422 further simplifications. In case $r_{1s}$ is $\ZERO$, |
428 then we can return $r_{2s}$ (the other alternative). However |
423 then we can return $r_{2s}$ (the other alternative). However |
429 for this we need to build a corresponding rectification |
424 for this we need to build a corresponding rectification |
430 function, which as said above is |
425 function, which as said above is |
431 |
426 |
432 \begin{center} |
427 \begin{center} |
433 $\lambda v.\,Right(f_{2s}(v))$ |
428 $\lambda v.\,Right(f_{2s}(v))$ |
434 \end{center} |
429 \end{center} |
435 |
430 |
436 \noindent The case where $r_{2s} = \varnothing$ is similar: |
431 \noindent The case where $r_{2s} = \ZERO$ is similar: |
437 We return $r_{1s}$ and rectify with $Left(\_)$ and the |
432 We return $r_{1s}$ and rectify with $\Left(\_)$ and the |
438 other calculated rectification function $f_{1s}$. This gives |
433 other calculated rectification function $f_{1s}$. This gives |
439 |
434 |
440 \begin{center} |
435 \begin{center} |
441 $\lambda v.\,Left(f_{1s}(v))$ |
436 $\lambda v.\,\Left(f_{1s}(v))$ |
442 \end{center} |
437 \end{center} |
443 |
438 |
444 \noindent The next case where $r_{1s} = r_{2s}$ can be treated |
439 \noindent The next case where $r_{1s} = r_{2s}$ can be treated |
445 like the one where $r_{2s} = \varnothing$. We return $r_{1s}$ |
440 like the one where $r_{2s} = \ZERO$. We return $r_{1s}$ |
446 and rectify with $Left(\_)$ and so on. |
441 and rectify with $\Left(\_)$ and so on. |
447 |
442 |
448 |
443 |
449 The otherwise-case is slightly more complicated. In this case |
444 The otherwise-case is slightly more complicated. In this case |
450 neither $r_{1s}$ nor $r_{2s}$ are $\varnothing$ and also |
445 neither $r_{1s}$ nor $r_{2s}$ are $\ZERO$ and also |
451 $r_{1s} \not= r_{2s}$, which means no further simplification |
446 $r_{1s} \not= r_{2s}$, which means no further simplification |
452 can be applied. Accordingly, we return $r_{1s} + r_{2s}$ as |
447 can be applied. Accordingly, we return $r_{1s} + r_{2s}$ as |
453 the simplified regular expression. In principle we also do not |
448 the simplified regular expression. In principle we also do not |
454 have to do any rectification, because no simplification was |
449 have to do any rectification, because no simplification was |
455 done in this case. But this is actually not true: There might |
450 done in this case. But this is actually not true: There might |
456 have been simplifications inside $r_{1s}$ and $r_{2s}$. We |
451 have been simplifications inside $r_{1s}$ and $r_{2s}$. We |
457 therefore need to take into account the calculated |
452 therefore need to take into account the calculated |
458 rectification functions $f_{1s}$ and $f_{2s}$. We can do this |
453 rectification functions $f_{1s}$ and $f_{2s}$. We can do this |
459 by defining a rectification function $f_{alt}$ which takes two |
454 by defining a rectification function $f_{alt}$ which takes two |
460 rectification functions as arguments and applies them |
455 rectification functions as arguments and applies them |
461 according to whether the value is of the form $Left(\_)$ or |
456 according to whether the value is of the form $\Left(\_)$ or |
462 $Right(\_)$: |
457 $Right(\_)$: |
463 |
458 |
464 \begin{center} |
459 \begin{center} |
465 \begin{tabular}{l@{\hspace{1mm}}l} |
460 \begin{tabular}{l@{\hspace{1mm}}l} |
466 $f_{alt}(f_1, f_2) \dn$\\ |
461 $f_{alt}(f_1, f_2) \dn$\\ |
467 \qquad $\lambda v.\,$ case $v = Left(v')$: |
462 \qquad $\lambda v.\,$ case $v = \Left(v')$: |
468 & return $Left(f_1(v'))$\\ |
463 & return $\Left(f_1(v'))$\\ |
469 \qquad \phantom{$\lambda v.\,$} case $v = Right(v')$: |
464 \qquad \phantom{$\lambda v.\,$} case $v = Right(v')$: |
470 & return $Right(f_2(v'))$\\ |
465 & return $Right(f_2(v'))$\\ |
471 \end{tabular} |
466 \end{tabular} |
472 \end{center} |
467 \end{center} |
473 |
468 |
479 \begin{tabular}{l} |
474 \begin{tabular}{l} |
480 $simp(r)$:\qquad\qquad (continued)\\ |
475 $simp(r)$:\qquad\qquad (continued)\\ |
481 \quad case $r = r_1 \cdot r_2$\\ |
476 \quad case $r = r_1 \cdot r_2$\\ |
482 \qquad let $(r_{1s}, f_{1s}) = simp(r_1)$\\ |
477 \qquad let $(r_{1s}, f_{1s}) = simp(r_1)$\\ |
483 \qquad \phantom{let} $(r_{2s}, f_{2s}) = simp(r_2)$\smallskip\\ |
478 \qquad \phantom{let} $(r_{2s}, f_{2s}) = simp(r_2)$\smallskip\\ |
484 \qquad case $r_{1s} = \varnothing$: |
479 \qquad case $r_{1s} = \ZERO$: |
485 return $(\varnothing, f_{error})$\\ |
480 return $(\ZERO, f_{error})$\\ |
486 \qquad case $r_{2s} = \varnothing$: |
481 \qquad case $r_{2s} = \ZERO$: |
487 return $(\varnothing, f_{error})$\\ |
482 return $(\ZERO, f_{error})$\\ |
488 \qquad case $r_{1s} = \epsilon$: |
483 \qquad case $r_{1s} = \ONE$: |
489 return $(r_{2s}, \lambda v. \,Seq(f_{1s}(Empty), f_{2s}(v)))$\\ |
484 return $(r_{2s}, \lambda v. \,Seq(f_{1s}(Empty), f_{2s}(v)))$\\ |
490 \qquad case $r_{2s} = \epsilon$: |
485 \qquad case $r_{2s} = \ONE$: |
491 return $(r_{1s}, \lambda v. \,Seq(f_{1s}(v), f_{2s}(Empty)))$\\ |
486 return $(r_{1s}, \lambda v. \,Seq(f_{1s}(v), f_{2s}(Empty)))$\\ |
492 \qquad otherwise: |
487 \qquad otherwise: |
493 return $(r_{1s} \cdot r_{2s}, f_{seq}(f_{1s}, f_{2s}))$\\ |
488 return $(r_{1s} \cdot r_{2s}, f_{seq}(f_{1s}, f_{2s}))$\\ |
494 \end{tabular} |
489 \end{tabular} |
495 \end{center} |
490 \end{center} |
504 \qquad $\lambda v.\,$ case $v = Seq(v_1, v_2)$: |
499 \qquad $\lambda v.\,$ case $v = Seq(v_1, v_2)$: |
505 & return $Seq(f_1(v_1), f_2(v_2))$\\ |
500 & return $Seq(f_1(v_1), f_2(v_2))$\\ |
506 \end{tabular} |
501 \end{tabular} |
507 \end{center} |
502 \end{center} |
508 |
503 |
509 \noindent Note that in the case of $r_{1s} = \varnothing$ |
504 \noindent Note that in the case of $r_{1s} = \ZERO$ |
510 (similarly $r_{2s}$) we use the function $f_{error}$ for |
505 (similarly $r_{2s}$) we use the function $f_{error}$ for |
511 rectification. If you think carefully, then you will realise |
506 rectification. If you think carefully, then you will realise |
512 that this function will actually never been called. This is |
507 that this function will actually never been called. This is |
513 because a sequence with $\varnothing$ will never recognise any |
508 because a sequence with $\ZERO$ will never recognise any |
514 string and therefore the second phase of the algorithm would |
509 string and therefore the second phase of the algorithm would |
515 never been called. The simplification function still expects |
510 never been called. The simplification function still expects |
516 us to give a function. So in my own implementation I just |
511 us to give a function. So in my own implementation I just |
517 returned a function that raises an error. In the case |
512 returned a function that raises an error. In the case |
518 where $r_{1s} = \epsilon$ (similarly $r_{2s}$) we have |
513 where $r_{1s} = \ONE$ (similarly $r_{2s}$) we have |
519 to create a sequence where the first component is a rectified |
514 to create a sequence where the first component is a rectified |
520 version of $Empty$. Therefore we call $f_{1s}$ with $Empty$. |
515 version of $Empty$. Therefore we call $f_{1s}$ with $Empty$. |
521 |
516 |
522 Since we only simplify regular expressions of the form $r_1 + |
517 Since we only simplify regular expressions of the form $r_1 + |
523 r_2$ and $r_1 \cdot r_2$ we do not have to do anything else |
518 r_2$ and $r_1 \cdot r_2$ we do not have to do anything else |
539 \begin{tabular}{l} |
534 \begin{tabular}{l} |
540 $simp(r)$:\\ |
535 $simp(r)$:\\ |
541 \quad case $r = r_1 + r_2$\\ |
536 \quad case $r = r_1 + r_2$\\ |
542 \qquad let $(r_{1s}, f_{1s}) = simp(r_1)$\\ |
537 \qquad let $(r_{1s}, f_{1s}) = simp(r_1)$\\ |
543 \qquad \phantom{let} $(r_{2s}, f_{2s}) = simp(r_2)$\smallskip\\ |
538 \qquad \phantom{let} $(r_{2s}, f_{2s}) = simp(r_2)$\smallskip\\ |
544 \qquad case $r_{1s} = \varnothing$: |
539 \qquad case $r_{1s} = \ZERO$: |
545 return $(r_{2s}, \lambda v. \,Right(f_{2s}(v)))$\\ |
540 return $(r_{2s}, \lambda v. \,Right(f_{2s}(v)))$\\ |
546 \qquad case $r_{2s} = \varnothing$: |
541 \qquad case $r_{2s} = \ZERO$: |
547 return $(r_{1s}, \lambda v. \,Left(f_{1s}(v)))$\\ |
542 return $(r_{1s}, \lambda v. \,\Left(f_{1s}(v)))$\\ |
548 \qquad case $r_{1s} = r_{2s}$: |
543 \qquad case $r_{1s} = r_{2s}$: |
549 return $(r_{1s}, \lambda v. \,Left(f_{1s}(v)))$\\ |
544 return $(r_{1s}, \lambda v. \,\Left(f_{1s}(v)))$\\ |
550 \qquad otherwise: |
545 \qquad otherwise: |
551 return $(r_{1s} + r_{2s}, f_{alt}(f_{1s}, f_{2s}))$ |
546 return $(r_{1s} + r_{2s}, f_{alt}(f_{1s}, f_{2s}))$ |
552 \medskip\\ |
547 \medskip\\ |
553 |
548 |
554 \quad case $r = r_1 \cdot r_2$\\ |
549 \quad case $r = r_1 \cdot r_2$\\ |
555 \qquad let $(r_{1s}, f_{1s}) = simp(r_1)$\\ |
550 \qquad let $(r_{1s}, f_{1s}) = simp(r_1)$\\ |
556 \qquad \phantom{let} $(r_{2s}, f_{2s}) = simp(r_2)$\smallskip\\ |
551 \qquad \phantom{let} $(r_{2s}, f_{2s}) = simp(r_2)$\smallskip\\ |
557 \qquad case $r_{1s} = \varnothing$: |
552 \qquad case $r_{1s} = \ZERO$: |
558 return $(\varnothing, f_{error})$\\ |
553 return $(\ZERO, f_{error})$\\ |
559 \qquad case $r_{2s} = \varnothing$: |
554 \qquad case $r_{2s} = \ZERO$: |
560 return $(\varnothing, f_{error})$\\ |
555 return $(\ZERO, f_{error})$\\ |
561 \qquad case $r_{1s} = \epsilon$: |
556 \qquad case $r_{1s} = \ONE$: |
562 return $(r_{2s}, \lambda v. \,Seq(f_{1s}(Empty), f_{2s}(v)))$\\ |
557 return $(r_{2s}, \lambda v. \,Seq(f_{1s}(Empty), f_{2s}(v)))$\\ |
563 \qquad case $r_{2s} = \epsilon$: |
558 \qquad case $r_{2s} = \ONE$: |
564 return $(r_{1s}, \lambda v. \,Seq(f_{1s}(v), f_{2s}(Empty)))$\\ |
559 return $(r_{1s}, \lambda v. \,Seq(f_{1s}(v), f_{2s}(Empty)))$\\ |
565 \qquad otherwise: |
560 \qquad otherwise: |
566 return $(r_{1s} \cdot r_{2s}, f_{seq}(f_{1s}, f_{2s}))$ |
561 return $(r_{1s} \cdot r_{2s}, f_{seq}(f_{1s}, f_{2s}))$ |
567 \medskip\\ |
562 \medskip\\ |
568 |
563 |