298 Let us now turn to the problem of generating a parse-tree for a grammar and string. |
298 Let us now turn to the problem of generating a parse-tree for a grammar and string. |
299 In what follows we explain \emph{parser combinators}, because they are easy |
299 In what follows we explain \emph{parser combinators}, because they are easy |
300 to implement and closely resemble grammar rules. Imagine that a grammar |
300 to implement and closely resemble grammar rules. Imagine that a grammar |
301 describes the strings of natural numbers, such as the grammar $N$ shown above. |
301 describes the strings of natural numbers, such as the grammar $N$ shown above. |
302 For all such strings we want to generate the parse-trees or later on we actually |
302 For all such strings we want to generate the parse-trees or later on we actually |
303 want to extract the meaning of these strings, that is the integers ``behind'' these |
303 want to extract the meaning of these strings, that is the concrete integers ``behind'' |
304 strings. The parser will therefore be functions of type |
304 these strings. The parser combinators will be functions of type |
305 |
305 |
306 \begin{center} |
306 \begin{center} |
307 \texttt{I $\Rightarrow$ (T, I)} |
307 \texttt{I $\Rightarrow$ Set[(T, I)]} |
308 \end{center} |
308 \end{center} |
309 |
309 |
310 \noindent |
310 \noindent |
311 that is they take as input something of type \texttt{I}, typically a list of token or a string, |
311 that is they take as input something of type \texttt{I}, typically a list of tokens or a string, |
312 and return a pair. |
312 and return a set of pairs. The first component of these pairs corresponds to what the |
313 |
313 parser combinator was able to process from the input and the second is the unprocessed |
314 |
314 part of the input. As we shall see shortly, a parser combinator might return more than one such pair, |
|
315 with the idea that there are potentially several ways how to interpret the input. |
|
316 |
|
317 The abstract class for parser combinators requires the implementation of the function |
|
318 \texttt{parse} taking an argument of type \texttt{I} and returns a set of type |
|
319 \mbox{\texttt{Set[(T, I)]}}. |
|
320 |
|
321 \begin{center} |
|
322 \begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none] |
|
323 abstract class Parser[I, T] { |
|
324 def parse(ts: I): Set[(T, I)] |
|
325 |
|
326 def parse_all(ts: I): Set[T] = |
|
327 for ((head, tail) <- parse(ts); if (tail.isEmpty)) |
|
328 yield head |
|
329 } |
|
330 \end{lstlisting} |
|
331 \end{center} |
|
332 |
|
333 \noindent |
|
334 One of the simplest parser combinators recognises just a character, say $c$, |
|
335 from the beginning of strings. Its behaviour is as follows: |
|
336 |
|
337 \begin{itemize} |
|
338 \item if the head of the input string starts with a $c$, it returns |
|
339 the set $\{(c, \textit{tail of}\; s)\}$ |
|
340 \item otherwise it returns the empty set $\varnothing$ |
|
341 \end{itemize} |
|
342 |
|
343 \noindent |
|
344 The input type of this simple parser combinator for characters is |
|
345 \texttt{String} and the output type \mbox{\texttt{Set[(Char, String)]}}. |
|
346 The code in Scala is as follows: |
|
347 |
|
348 \begin{center} |
|
349 \begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none] |
|
350 case class CharParser(c: Char) extends Parser[String, Char] { |
|
351 def parse(sb: String) = |
|
352 if (sb.head == c) Set((c, sb.tail)) else Set() |
|
353 } |
|
354 \end{lstlisting} |
|
355 \end{center} |
315 |
356 |
316 |
357 |
317 \end{document} |
358 \end{document} |
318 |
359 |
319 %%% Local Variables: |
360 %%% Local Variables: |