handouts/ho06.tex
changeset 585 6ee22f196884
parent 584 529460073b24
child 586 451a95e1bc25
equal deleted inserted replaced
584:529460073b24 585:6ee22f196884
   421 \emph{not} a simple pair \texttt{(ab, cde)} as one might errorneously
   421 \emph{not} a simple pair \texttt{(ab, cde)} as one might errorneously
   422 expects.  The parser returns the ampty set in the other examples,
   422 expects.  The parser returns the ampty set in the other examples,
   423 because they do not fit with what the parser is supposed to parse.
   423 because they do not fit with what the parser is supposed to parse.
   424 
   424 
   425 
   425 
   426 A slightly more complicated parser is \pcode{("a" || "b") ~ "c"}.
   426 A slightly more complicated parser is \pcode{("a" || "b") ~ "c"}
   427 which parses as first character either an \texttt{a} or \texttt{b}
   427 which parses as first character either an \texttt{a} or \texttt{b}
   428 followed by a \texttt{c}. This parser produces the following outputs.
   428 followed by a \texttt{c}. This parser produces the following outputs.
   429 
   429 
   430 \begin{center}
   430 \begin{center}
   431 \begin{tabular}{rcl}
   431 \begin{tabular}{rcl}
   432 input strings & & output\medskip\\
   432 input strings & & output\medskip\\
   433 \texttt{\Grid{abc}} & $\rightarrow$ & $\left\{((\texttt{\Grid{a}}, \texttt{\Grid{b}}), \texttt{\Grid{c}})\right\}$\\
   433 \texttt{\Grid{acde}} & $\rightarrow$ & $\left\{((\texttt{\Grid{a}}, \texttt{\Grid{c}}), \texttt{\Grid{de}})\right\}$\\
   434 \texttt{\Grid{bbc}} & $\rightarrow$ & $\left\{((\texttt{\Grid{b}}, \texttt{\Grid{b}}), \texttt{\Grid{c}})\right\}$\\
   434 \texttt{\Grid{bcde}} & $\rightarrow$ & $\left\{((\texttt{\Grid{b}}, \texttt{\Grid{c}}), \texttt{\Grid{de}})\right\}$\\
   435 \texttt{\Grid{aac}} & $\rightarrow$ & $\{\}$
   435 \texttt{\Grid{abde}} & $\rightarrow$ & $\{\}$
   436 \end{tabular}
   436 \end{tabular}
   437 \end{center}
   437 \end{center}
   438 
   438 
   439 \noindent Two more examples: first consider the parser \pcode{('a' ~
   439 \noindent
   440 'a') ~ 'a'} and the input \pcode{aaaa}:
   440 Now consider the parser \pcode{("a" ~ "b") ~ "c"} which parses
       
   441 \texttt{a}, \texttt{b}, \texttt{c} in sequence. This parser produces
       
   442 the following outputs.
       
   443 
       
   444 \begin{center}
       
   445 \begin{tabular}{rcl}
       
   446 input strings & & output\medskip\\
       
   447 \texttt{\Grid{abcde}} & $\rightarrow$ & $\left\{(((\texttt{\Grid{a}},\texttt{\Grid{b}}), \texttt{\Grid{c}}), \texttt{\Grid{de}})\right\}$\\
       
   448 \texttt{\Grid{abde}} & $\rightarrow$ & $\{\}$\\
       
   449 \texttt{\Grid{bcde}} & $\rightarrow$ & $\{\}$
       
   450 \end{tabular}
       
   451 \end{center}
       
   452 
       
   453 
       
   454 \noindent The second and third example fail, because something is
       
   455 ``missing'' in the sequence we are looking for. Also notice how the
       
   456 results nest with sequences: the parsed part is a nested pair of the
       
   457 form \pcode{((a, b), c)}. Two more examples: first consider the parser
       
   458 \pcode{("a" ~ "a") ~ "a"} and the input \pcode{aaaa}:
   441 
   459 
   442 \begin{center}
   460 \begin{center}
   443 \begin{tabular}{rcl}
   461 \begin{tabular}{rcl}
   444 input string & & output\medskip\\
   462 input string & & output\medskip\\
   445 \texttt{\Grid{aaaa}} & $\rightarrow$ & 
   463 \texttt{\Grid{aaaa}} & $\rightarrow$ & 
   446 $\left\{(((\texttt{\Grid{a}}, \texttt{\Grid{a}}), \texttt{\Grid{a}}), \texttt{\Grid{a}})\right\}$\\
   464 $\left\{(((\texttt{\Grid{a}}, \texttt{\Grid{a}}), \texttt{\Grid{a}}), \texttt{\Grid{a}})\right\}$\\
   447 \end{tabular}
   465 \end{tabular}
   448 \end{center}
   466 \end{center}
   449 
   467 
   450 \noindent Notice how the results nest deeper and deeper as
   468 \noindent Notice how the results nests deeper and deeper as pairs (the
   451 pairs (the last \pcode{a} is in the unprocessed part). To
   469 last \pcode{a} is in the unprocessed part). To consume everything of
   452 consume everything of this string we can use the parser
   470 this string we can use the parser \pcode{(("a" ~ "a") ~ "a") ~
   453 \pcode{(('a' ~'a') ~ 'a') ~ 'a'}. Then the output is as
   471   "a"}. Then the output is as follows:
   454 follows:
       
   455 
   472 
   456 \begin{center}
   473 \begin{center}
   457 \begin{tabular}{rcl}
   474 \begin{tabular}{rcl}
   458 input string & & output\medskip\\
   475 input string & & output\medskip\\
   459 \texttt{\Grid{aaaa}} & $\rightarrow$ & 
   476 \texttt{\Grid{aaaa}} & $\rightarrow$ & 
   461 \end{tabular}
   478 \end{tabular}
   462 \end{center}
   479 \end{center}
   463 
   480 
   464 \noindent This is an instance where the parser consumed
   481 \noindent This is an instance where the parser consumed
   465 completely the input, meaning the unprocessed part is just the
   482 completely the input, meaning the unprocessed part is just the
   466 empty string.
   483 empty string. So if we called \pcode{parse_all} instead of \pcode{parse}
       
   484 we would get back the result
       
   485 
       
   486 \[
       
   487 \left\{(((\texttt{\Grid{a}}, \texttt{\Grid{a}}), \texttt{\Grid{a}}), \texttt{\Grid{a}})\right\}
       
   488 \]
       
   489 
       
   490 \noindent where the unprocessed (empty) parts have been stripped away
       
   491 from the pairs; everything where the second part was not empty has
       
   492 been thrown away as ultimately-unsuccessful-parses.
   467 
   493 
   468 
   494 
   469 Note carefully that constructing a parser such \pcode{'a' ||
   495 Note carefully that constructing a parser such \pcode{'a' ||
   470 ('a' ~ 'b')} will result in a typing error. The first
   496 ('a' ~ 'b')} will result in a typing error. The first
   471 parser has as output type a single character (recall the type
   497 parser has as output type a single character (recall the type