# HG changeset patch # User Christian Urban # Date 1540382833 -3600 # Node ID 6ee22f196884badf748a158f889315bd6bb9c8ef # Parent 529460073b2424ad82d4435d0331e7e06fa606b4 updated diff -r 529460073b24 -r 6ee22f196884 handouts/ho06.pdf Binary file handouts/ho06.pdf has changed diff -r 529460073b24 -r 6ee22f196884 handouts/ho06.tex --- a/handouts/ho06.tex Wed Oct 24 12:49:23 2018 +0100 +++ b/handouts/ho06.tex Wed Oct 24 13:07:13 2018 +0100 @@ -423,21 +423,39 @@ because they do not fit with what the parser is supposed to parse. -A slightly more complicated parser is \pcode{("a" || "b") ~ "c"}. +A slightly more complicated parser is \pcode{("a" || "b") ~ "c"} which parses as first character either an \texttt{a} or \texttt{b} followed by a \texttt{c}. This parser produces the following outputs. \begin{center} \begin{tabular}{rcl} input strings & & output\medskip\\ -\texttt{\Grid{abc}} & $\rightarrow$ & $\left\{((\texttt{\Grid{a}}, \texttt{\Grid{b}}), \texttt{\Grid{c}})\right\}$\\ -\texttt{\Grid{bbc}} & $\rightarrow$ & $\left\{((\texttt{\Grid{b}}, \texttt{\Grid{b}}), \texttt{\Grid{c}})\right\}$\\ -\texttt{\Grid{aac}} & $\rightarrow$ & $\{\}$ +\texttt{\Grid{acde}} & $\rightarrow$ & $\left\{((\texttt{\Grid{a}}, \texttt{\Grid{c}}), \texttt{\Grid{de}})\right\}$\\ +\texttt{\Grid{bcde}} & $\rightarrow$ & $\left\{((\texttt{\Grid{b}}, \texttt{\Grid{c}}), \texttt{\Grid{de}})\right\}$\\ +\texttt{\Grid{abde}} & $\rightarrow$ & $\{\}$ \end{tabular} \end{center} -\noindent Two more examples: first consider the parser \pcode{('a' ~ -'a') ~ 'a'} and the input \pcode{aaaa}: +\noindent +Now consider the parser \pcode{("a" ~ "b") ~ "c"} which parses +\texttt{a}, \texttt{b}, \texttt{c} in sequence. This parser produces +the following outputs. + +\begin{center} +\begin{tabular}{rcl} +input strings & & output\medskip\\ +\texttt{\Grid{abcde}} & $\rightarrow$ & $\left\{(((\texttt{\Grid{a}},\texttt{\Grid{b}}), \texttt{\Grid{c}}), \texttt{\Grid{de}})\right\}$\\ +\texttt{\Grid{abde}} & $\rightarrow$ & $\{\}$\\ +\texttt{\Grid{bcde}} & $\rightarrow$ & $\{\}$ +\end{tabular} +\end{center} + + +\noindent The second and third example fail, because something is +``missing'' in the sequence we are looking for. Also notice how the +results nest with sequences: the parsed part is a nested pair of the +form \pcode{((a, b), c)}. Two more examples: first consider the parser +\pcode{("a" ~ "a") ~ "a"} and the input \pcode{aaaa}: \begin{center} \begin{tabular}{rcl} @@ -447,11 +465,10 @@ \end{tabular} \end{center} -\noindent Notice how the results nest deeper and deeper as -pairs (the last \pcode{a} is in the unprocessed part). To -consume everything of this string we can use the parser -\pcode{(('a' ~'a') ~ 'a') ~ 'a'}. Then the output is as -follows: +\noindent Notice how the results nests deeper and deeper as pairs (the +last \pcode{a} is in the unprocessed part). To consume everything of +this string we can use the parser \pcode{(("a" ~ "a") ~ "a") ~ + "a"}. Then the output is as follows: \begin{center} \begin{tabular}{rcl} @@ -463,7 +480,16 @@ \noindent This is an instance where the parser consumed completely the input, meaning the unprocessed part is just the -empty string. +empty string. So if we called \pcode{parse_all} instead of \pcode{parse} +we would get back the result + +\[ +\left\{(((\texttt{\Grid{a}}, \texttt{\Grid{a}}), \texttt{\Grid{a}}), \texttt{\Grid{a}})\right\} +\] + +\noindent where the unprocessed (empty) parts have been stripped away +from the pairs; everything where the second part was not empty has +been thrown away as ultimately-unsuccessful-parses. Note carefully that constructing a parser such \pcode{'a' ||