35 \texttt{I} and return a set of pairs. The first component of |
35 \texttt{I} and return a set of pairs. The first component of |
36 these pairs corresponds to what the parser combinator was able |
36 these pairs corresponds to what the parser combinator was able |
37 to process from the input and the second is the unprocessed |
37 to process from the input and the second is the unprocessed |
38 part of the input. As we shall see shortly, a parser |
38 part of the input. As we shall see shortly, a parser |
39 combinator might return more than one such pair, the idea |
39 combinator might return more than one such pair, the idea |
40 being that there are potentially several ways how to interpret |
40 being that there are potentially several ways of how to |
41 the input. To simplify matters we will first look at the case |
41 interpret the input. To simplify matters we will first look at |
42 where the input to the parser combinators is just strings. As |
42 the case where the input to the parser combinators is just |
43 a concrete example, consider the string |
43 strings. As a concrete example, consider the string |
44 |
44 |
45 \begin{center} |
45 \begin{center} |
46 \tt\Grid{iffoo\VS testbar} |
46 \tt\Grid{iffoo\VS testbar} |
47 \end{center} |
47 \end{center} |
48 |
48 |
63 recognise anything from the input, then parser combinators just |
63 recognise anything from the input, then parser combinators just |
64 return the empty set $\{\}$. This will indicate something |
64 return the empty set $\{\}$. This will indicate something |
65 ``went wrong''\ldots or more precisely, nothing could be |
65 ``went wrong''\ldots or more precisely, nothing could be |
66 parsed. |
66 parsed. |
67 |
67 |
68 The main attraction of parser combinators is that we can |
68 The idea of parser combinators is that we can easily build |
69 easily build parser combinators out of smaller components |
69 parser combinators out of smaller components following very |
70 following very closely the structure of a grammar. In order to |
70 closely the structure of a grammar. In order to implement this |
71 implement this in an object-oriented programming language, |
71 in an object-oriented programming language, like Scala, we |
72 like Scala, we need to specify an abstract class for parser |
72 need to specify an abstract class for parser combinators. This |
73 combinators. This abstract class requires the implementation |
73 abstract class requires the implementation of the function |
74 of the function \texttt{parse} taking an argument of type |
74 \texttt{parse} taking an argument of type \texttt{I} and |
75 \texttt{I} and returns a set of type \mbox{\texttt{Set[(T, |
75 returns a set of type \mbox{\texttt{Set[(T, I)]}}. |
76 I)]}}. |
|
77 |
76 |
78 \begin{center} |
77 \begin{center} |
79 \begin{lstlisting}[language=Scala] |
78 \begin{lstlisting}[language=Scala] |
80 abstract class Parser[I, T] { |
79 abstract class Parser[I, T] { |
81 def parse(ts: I): Set[(T, I)] |
80 def parse(ts: I): Set[(T, I)] |
130 |
129 |
131 |
130 |
132 More interesting are the parser combinators that build larger |
131 More interesting are the parser combinators that build larger |
133 parsers out of smaller component parsers. For example the |
132 parsers out of smaller component parsers. For example the |
134 alternative parser combinator is as follows: given two |
133 alternative parser combinator is as follows: given two |
135 parsers, say, $p$ and $q$, then we apply both parsers to the |
134 parsers, say, $p$ and $q$, we apply both parsers to the |
136 input (remember parsers are functions) and combine the output |
135 input (remember parsers are functions) and combine the output |
137 (remember they are sets) |
136 (remember they are sets) |
138 |
137 |
139 \begin{center} |
138 \begin{center} |
140 $p(\text{input}) \cup q(\text{input})$ |
139 $p(\text{input}) \cup q(\text{input})$ |
192 \end{tabular} |
191 \end{tabular} |
193 \end{center} |
192 \end{center} |
194 |
193 |
195 \noindent We receive in the first two cases a successful |
194 \noindent We receive in the first two cases a successful |
196 output (that is a non-empty set). In each case, either |
195 output (that is a non-empty set). In each case, either |
197 \pcode{a} or \pcode{b} are in the processed parts, and |
196 \pcode{a} or \pcode{b} is in the processed part, and |
198 \pcode{c} in the unprocessed part. Clearly this parser cannot |
197 \pcode{c} in the unprocessed part. Clearly this parser cannot |
199 parse anything in the string \pcode{cc}, therefore the empty |
198 parse anything in the string \pcode{cc}, therefore the empty |
200 set. |
199 set. |
201 |
200 |
202 A bit more interesting is the \emph{sequence parser |
201 A bit more interesting is the \emph{sequence parser |