handouts/ho06.tex
changeset 392 2d0a59127694
parent 386 31295bb945c6
child 584 529460073b24
equal deleted inserted replaced
391:f352cb238c32 392:2d0a59127694
    35 \texttt{I} and return a set of pairs. The first component of
    35 \texttt{I} and return a set of pairs. The first component of
    36 these pairs corresponds to what the parser combinator was able
    36 these pairs corresponds to what the parser combinator was able
    37 to process from the input and the second is the unprocessed
    37 to process from the input and the second is the unprocessed
    38 part of the input. As we shall see shortly, a parser
    38 part of the input. As we shall see shortly, a parser
    39 combinator might return more than one such pair, the idea
    39 combinator might return more than one such pair, the idea
    40 being that there are potentially several ways how to interpret
    40 being that there are potentially several ways of how to
    41 the input. To simplify matters we will first look at the case
    41 interpret the input. To simplify matters we will first look at
    42 where the input to the parser combinators is just strings. As
    42 the case where the input to the parser combinators is just
    43 a concrete example, consider the string
    43 strings. As a concrete example, consider the string
    44 
    44 
    45 \begin{center}
    45 \begin{center}
    46 \tt\Grid{iffoo\VS testbar}
    46 \tt\Grid{iffoo\VS testbar}
    47 \end{center}
    47 \end{center}
    48 
    48 
    63 recognise anything from the input, then parser combinators just
    63 recognise anything from the input, then parser combinators just
    64 return the empty set $\{\}$. This will indicate something
    64 return the empty set $\{\}$. This will indicate something
    65 ``went wrong''\ldots or more precisely, nothing could be
    65 ``went wrong''\ldots or more precisely, nothing could be
    66 parsed.
    66 parsed.
    67 
    67 
    68 The main attraction of parser combinators is that we can
    68 The idea of parser combinators is that we can easily build
    69 easily build parser combinators out of smaller components
    69 parser combinators out of smaller components following very
    70 following very closely the structure of a grammar. In order to
    70 closely the structure of a grammar. In order to implement this
    71 implement this in an object-oriented programming language,
    71 in an object-oriented programming language, like Scala, we
    72 like Scala, we need to specify an abstract class for parser
    72 need to specify an abstract class for parser combinators. This
    73 combinators. This abstract class requires the implementation
    73 abstract class requires the implementation of the function
    74 of the function \texttt{parse} taking an argument of type
    74 \texttt{parse} taking an argument of type \texttt{I} and
    75 \texttt{I} and returns a set of type \mbox{\texttt{Set[(T,
    75 returns a set of type \mbox{\texttt{Set[(T, I)]}}.
    76 I)]}}.
       
    77 
    76 
    78 \begin{center}
    77 \begin{center}
    79 \begin{lstlisting}[language=Scala]
    78 \begin{lstlisting}[language=Scala]
    80 abstract class Parser[I, T] {
    79 abstract class Parser[I, T] {
    81   def parse(ts: I): Set[(T, I)]
    80   def parse(ts: I): Set[(T, I)]
   130 
   129 
   131 
   130 
   132 More interesting are the parser combinators that build larger
   131 More interesting are the parser combinators that build larger
   133 parsers out of smaller component parsers. For example the
   132 parsers out of smaller component parsers. For example the
   134 alternative parser combinator is as follows: given two
   133 alternative parser combinator is as follows: given two
   135 parsers, say, $p$ and $q$, then we apply both parsers to the
   134 parsers, say, $p$ and $q$, we apply both parsers to the
   136 input (remember parsers are functions) and combine the output
   135 input (remember parsers are functions) and combine the output
   137 (remember they are sets)
   136 (remember they are sets)
   138  
   137  
   139 \begin{center}
   138 \begin{center}
   140 $p(\text{input}) \cup q(\text{input})$
   139 $p(\text{input}) \cup q(\text{input})$
   192 \end{tabular}
   191 \end{tabular}
   193 \end{center}
   192 \end{center}
   194 
   193 
   195 \noindent We receive in the first two cases a successful
   194 \noindent We receive in the first two cases a successful
   196 output (that is a non-empty set). In each case, either
   195 output (that is a non-empty set). In each case, either
   197 \pcode{a} or \pcode{b} are in the processed parts, and
   196 \pcode{a} or \pcode{b} is in the processed part, and
   198 \pcode{c} in the unprocessed part. Clearly this parser cannot
   197 \pcode{c} in the unprocessed part. Clearly this parser cannot
   199 parse anything in the string \pcode{cc}, therefore the empty
   198 parse anything in the string \pcode{cc}, therefore the empty
   200 set.
   199 set.
   201 
   200 
   202 A bit more interesting is the \emph{sequence parser
   201 A bit more interesting is the \emph{sequence parser