\documentclass{article}\usepackage{../style}\usepackage{../langs}\usepackage{../grammar}\begin{document}\section*{Handout 6 (Parser Combinators)}This handout explains how \emph{parser combinators} work and how theycan be implemented in Scala. Their most distinguishing feature is thatthey are very easy to implement (admittedly it is only easy in afunctional programming language). Another good point of parsercombinators is that they can deal with any kind of input as long asthis input is of ``sequence-kind'', for example a string or a list oftokens. The only two properties of the input we need is to be able totest when it is empty and ``sequentially'' take it apart. Strings andlists fit this bill. However, parser combinators also have theirdrawbacks. For example they require that the grammar to be parsed is\emph{not} left-recursive and they are efficient only when the grammaris unambiguous. It is the responsibility of the grammar designer toensure these two properties.The general idea behind parser combinators is to transform the inputinto sets of pairs, like so\begin{center}$\underbrace{\text{list of tokens}}_{\text{input}}$ $\Rightarrow$$\underbrace{\text{set of (parsed input, unparsed input)}}_{\text{output}}$\end{center} \noindentGiven the extended effort we have spent in order to implement a lexerin order to generate list of tokens, it might be surprising that inwhat follows we shall often use strings as input. This is for makingthe explanation more lucid. It does not make our previous work onlexers obsolete (remember they transform a string into a list oftokens). Lexers will still be needed for building a somewhat realisticcompiler.But as said, parser combinators are relatively agnostic about whatkind of input they process. In my Scala code I use the followingpolymorphic types for parser combinators:\begin{center}input:\;\; \texttt{I} \qquad output:\;\; \texttt{T}\end{center}\noindent That is they take as input something of type \texttt{I} andreturn a set of pairs of type \texttt{Set[(T, I)]}. Since the input needsto be of ``sequence-kind'', I actually have to often write \texttt{I <\% Seq[\_]} for the input type. This ensures the input is asubtype of Scala sequences. The first component of the generated pairscorresponds to what the parser combinator was able to process from theinput and the second is the unprocessed part of the input (thereforethe type of this unprocessed part is the same as the input). As weshall see shortly, a parser combinator might return more than one suchpair; the idea is that there are potentially several ways of how toparse the input. As a concrete example, consider the string\begin{center}\tt\Grid{iffoo\VS testbar}\end{center}\noindent We might have a parser combinator which tries tointerpret this string as a keyword (\texttt{if}) or as anidentifier (\texttt{iffoo}). Then the output will be the set\begin{center}$\left\{ \left(\texttt{\Grid{if}}\;,\; \texttt{\Grid{foo\VS testbar}}\right), \left(\texttt{\Grid{iffoo}}\;,\; \texttt{\Grid{\VS testbar}}\right) \right\}$\end{center}\noindent where the first pair means the parser could recognise\texttt{if} from the input and leaves the rest as `unprocessed' as thesecond component of the pair; in the other case it could recognise\texttt{iffoo} and leaves \texttt{\VS testbar} as unprocessed. If theparser cannot recognise anything from the input at all, then parsercombinators just return the empty set $\{\}$. This will indicatesomething ``went wrong''\ldots or more precisely, nothing could beparsed.Also important to note is that the type \texttt{T} for the processedpart is different from the input type. In the example above is justhappens to be the same. The reason for the difference is that ingeneral we are interested in transforming our input into something``different''\ldots for example into a tree; or if we implement thegrammar for arithmetic expressions, we might be interested in theactual integer number the arithmetic expression, say \texttt{1 + 2 * 3}, stands for. In this way we can use parser combinators toimplement relatively easily a calculator, for instance.The main idea of parser combinators is that we can easily build parsercombinators out of smaller components following very closely thestructure of a grammar. In order to implement this in anobject-oriented programming language, like Scala, we need to specifyan abstract class for parser combinators. This abstract class statesthat the function \texttt{parse} takes an argument of type \texttt{I}and returns a set of type \mbox{\texttt{Set[(T, I)]}}.\begin{center}\begin{lstlisting}[language=Scala]abstract class Parser[I, T] { def parse(ts: I): Set[(T, I)] def parse_all(ts: I): Set[T] = for ((head, tail) <- parse(ts); if (tail.isEmpty)) yield head}\end{lstlisting}\end{center}\noindent It is the obligation in each instance (parser combinator) tosupply an implementation for \texttt{parse}. From this function wecan then ``centrally'' derive the function \texttt{parse\_all}, whichjust filters out all pairs whose second component is not empty (thatis has still some unprocessed part). The reason is that at the end ofthe parsing we are only interested in the results where all the inputhas been consumed and no unprocessed part is left over.One of the simplest parser combinators recognises just asingle character, say $c$, from the beginning of strings. Itsbehaviour can be described as follows:\begin{itemize}\item If the head of the input string starts with a $c$, then return the set \[\{(c, \textit{tail of}\; s)\}\] where \textit{tail of} $s$ is the unprocessed part of the input string.\item Otherwise return the empty set $\{\}$. \end{itemize}\noindentThe input type of this simple parser combinator for characters is\texttt{String} and the output type is \texttt{Char}. This means\texttt{parse} returns \mbox{\texttt{Set[(Char, String)]}}. The code in Scala is as follows:\begin{center}\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]case class CharParser(c: Char) extends Parser[String, Char] { def parse(in: String) = if (in.head == c) Set((c, in.tail)) else Set()}\end{lstlisting}\end{center}\noindent You can see the \texttt{parse} tests whether thefirst character of the input string \texttt{in} is equal to\texttt{c}. If yes, then it splits the string into the recognised part\texttt{c} and the unprocessed part \texttt{in.tail}. In case\texttt{in} does not start with \texttt{c} then the parser returns theempty set (in Scala \texttt{Set()}). Since this parser recognisescharacters and just returns characters as the processed part, theoutput type of the parser is \texttt{Char}.If we want to parse a list of tokens and interested in recognising anumber token, we could write something like this\begin{center}\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily,numbers=none]case object NumParser extends Parser[List[Token], Int] { def parse(ts: List[Token]) = ts match { case Num_token(s)::ts => Set((s.toInt, ts)) case _ => Set () }}\end{lstlisting}\end{center}\noindentIn this parser the input is of type \texttt{List[Token]}. The functionparse looks at the input \texttt{ts} and checks whether the firsttoken is a \texttt{Num\_token}. Let us assume our lexer generatedthese tokens for numbers. But this parser does not just return thistoken (and the rest of the list), like the \texttt{CharParser} above,rather extracts the string \texttt{s} from the token and converts itinto an integer. The hope is that the lexer did its work well and thisconversion always succeeds. The consequence of this is that the outputtype for this parser is \texttt{Int}, not \texttt{Token}. Such aconversion would be needed if we want to implement a simple calculatorprogram.These simple parsers that just look at the input and do a simpletransformation are often called \emph{atomic} parser combinators.More interesting are the parser combinators that build larger parsersout of smaller component parsers. There are three such parsercombinators that can be implemented generically. The \emph{alternative parser combinator} is as follows: given two parsers, say, $p$ and$q$, we apply both parsers to the input (remember parsers arefunctions) and combine the output (remember they are sets of pairs):\begin{center}$p(\text{input}) \cup q(\text{input})$\end{center}\noindent In Scala we can implement alternative parsercombinator as follows\begin{center}\begin{lstlisting}[language=Scala, numbers=none]class AltParser[I, T] (p: => Parser[I, T], q: => Parser[I, T]) extends Parser[I, T] { def parse(in: I) = p.parse(in) ++ q.parse(in)}\end{lstlisting}\end{center}\noindent The types of this parser combinator are again generic (wehave \texttt{I} for the input type, and \texttt{T} for the outputtype). The alternative parser builds a new parser out of two existingparsers \texttt{p} and \texttt{q} given as arguments. Both parsersneed to be able to process input of type \texttt{I} and return in\texttt{parse} the same output type \texttt{Set[(T, I)]}.\footnote{There is an interesting detail of Scala, namely the \texttt{=>} in front of the types of \texttt{p} and \texttt{q}. They will prevent the evaluation of the arguments before they are used. This is often called \emph{lazy evaluation} of the arguments. We will explain this later.} The alternative parsershould run the input with the first parser \texttt{p} (producing a setof pairs) and then run the same input with \texttt{q} (producinganother set of pairs). The result should be then just the union ofboth sets, which is the operation \texttt{++} in Scala.The alternative parser combinator allows us to construct a parser thatparses either a character \texttt{a} or \texttt{b} using the\texttt{CharParser} shown above. For this we can write\begin{center}\begin{lstlisting}[language=Scala, numbers=none]new AltParser(CharParser('a'), CharParser('b'))\end{lstlisting}\end{center}\noindent Later on we will use Scala mechanism for introducing somemore readable shorthand notation for this, like \texttt{"a" || "b"}. Let us look in detail at what this parser combinator produceswith some sample strings\begin{center}\begin{tabular}{rcl}input strings & & output\medskip\\\texttt{\Grid{acde}} & $\rightarrow$ & $\left\{(\texttt{\Grid{a}}, \texttt{\Grid{cde}})\right\}$\\\texttt{\Grid{bcde}} & $\rightarrow$ & $\left\{(\texttt{\Grid{b}}, \texttt{\Grid{cde}})\right\}$\\\texttt{\Grid{ccde}} & $\rightarrow$ & $\{\}$\end{tabular}\end{center}\noindent We receive in the first two cases a successfuloutput (that is a non-empty set). In each case, either\pcode{a} or \pcode{b} is in the processed part, and\pcode{cde} in the unprocessed part. Clearly this parser cannotparse anything with \pcode{ccde}, therefore the emptyset is returned.A bit more interesting is the \emph{sequence parser combinator}. Giventwo parsers, say again, $p$ and $q$, we want to apply first the inputto $p$ producing a set of pairs; then apply $q$ to all the unparsedparts in the pairs; and then combine the results. Mathematically we wouldwrite something like this for the expected set of pairs:\begin{center}\begin{tabular}{lcl}$\{((\textit{output}_1, \textit{output}_2), u_2)$ & $\,|\,$ & $(\textit{output}_1, u_1) \in p(\text{input}) \;\wedge\;$\\&& $(\textit{output}_2, u_2) \in q(u_1)\}$\end{tabular}\end{center}\noindent Notice that the $p$ wil first be run on the input, producingpairs of the form $(\textit{output}_1, u_1)$ where the $u_1$ standsfor the unprocessed, or left-over, parts. We want that $q$ runs on allthese unprocessed parts $u_1$. This again will produce someprocessed part , $p$ and$q$, we apply both parsers to the input (remember parsers arefunctions) and combine the output (remember they are sets of pairs)\begin{center}$p(\text{input}) \cup q(\text{input})$\end{center}\noindent In Scala we would implement alternative parsercombinator as follows\begin{center}\begin{lstlisting}[language=Scala, numbers=none]class AltParser[I, T] (p: => Parser[I, T], q: => Parser[I, T]) extends Parser[I, T] { def parse(in: I) = p.parse(in) ++ q.parse(in)}\end{lstlisting}\end{center}\noindent The types of this parser combinator are again generic (wejust have \texttt{I} for the input type, and \texttt{T} for the outputtype). The alternative parser builds a new parser out of two existingparsers \texttt{p} and \texttt{q}. Both need to be able to processinput of type \texttt{I} and return the same output type\texttt{Set[(T, I)]}.\footnote{There is an interesting detail of Scala, namely the \texttt{=>} in front of the types of \texttt{p} and \texttt{q}. They will prevent the evaluation of the arguments before they are used. This is often called \emph{lazy evaluation} of the arguments. We will explain this later.} Therefore the outputtype of this parser is \texttt{T}. The alternative parser should runthe input with the first parser \texttt{p} (producing a set of pairs)and then run the same input with \texttt{q} (producing another set ofpairs). The result should be then just the union of both sets, whichis the operation \texttt{++} in Scala.The alternative parser combinator already allows us toconstruct a parser that parses either a character \texttt{a}or \texttt{b}, as\begin{center}\begin{lstlisting}[language=Scala, numbers=none]new AltParser(CharParser('a'), CharParser('b'))\end{lstlisting}\end{center}\noindent Later on we will use again Scala mechanism for introducing somemore readable shorthand notation for this, like \texttt{"a" || "b"}. Let us look in detail at what this parser combinator produces with some sample strings \begin{center}\begin{tabular}{rcl}input strings & & output\medskip\\\texttt{\Grid{acde}} & $\rightarrow$ & $\left\{(\texttt{\Grid{a}}, \texttt{\Grid{cde}})\right\}$\\\texttt{\Grid{bcde}} & $\rightarrow$ & $\left\{(\texttt{\Grid{b}}, \texttt{\Grid{cde}})\right\}$\\\texttt{\Grid{ccde}} & $\rightarrow$ & $\{\}$\end{tabular}\end{center}\noindent We receive in the first two cases a successfuloutput (that is a non-empty set). In each case, either\pcode{a} or \pcode{b} is in the processed part, and\pcode{cde} in the unprocessed part. Clearly this parser cannotparse anything in the string \pcode{ccde}, therefore the emptyset.A bit more interesting is the \emph{sequence parser combinator}. Giventwo parsers, say again, $p$ and $q$, we want to apply first the inputto $p$ producing a set of pairs; then apply $q$ to all the unparsedparts in the pairs; and then combine the results like\begin{center}\begin{tabular}{lcl}$\{((\textit{output}_1, \textit{output}_2), u_2)$ & $\;|\;$ & $(\textit{output}_1, u_1) \in p(\text{input}) \;\wedge\;$\\&& $(\textit{output}_2, u_2) \in q(u_1)\}$\end{tabular}\end{center}\noindent Notice that the $p$ will first be run on the input,producing pairs of the form $\textit{output}_1$ and unprocessed part$u_1$. This unprocessed part is fed into the second parser $q$. Theoverall result of the sequence parser combinator is pairs of the form$((\textit{output}_1, \textit{output}_2), u_2)$. This means theunprocessed part of both parsers is the unprocessed part the secondparser $q$ produces leaves as left-over. The processed parts of bothparsers is a pair consisting of the outputs of $p$ and $q$, namely$(\textit{output}_1, \textit{output}_2)$. This behaviour can beimplemented in Scala as follows:\begin{center}\begin{lstlisting}[language=Scala,numbers=none]class SeqParser[I, T, S] (p: => Parser[I, T], q: => Parser[I, S]) extends Parser[I, (T, S)] { def parse(in: I) = for ((output1, u1) <- p.parse(in); (output2, u2) <- q.parse(u1)) yield ((output1, output2), u2)}\end{lstlisting}\end{center}\noindent This parser takes again as arguments two parsers, \texttt{p}and \texttt{q}. It implements \texttt{parse} as follows: let first runthe parser \texttt{p} on the input producing a set of pairs(\texttt{output1}, \texttt{u1}). The \texttt{u1} stands for theunprocessed parts left over by \texttt{p}. Let then \texttt{q} run on theseunprocessed parts producing again a set of pairs. The output of thesequence parser combinator is then a set containing pairs where thefirst components are again pairs, namely what the first parser couldparse together with what the second parser could parse; the secondcomponent is the unprocessed part left over after running the secondparser \texttt{q}. Therefore the input type of the sequence parsercombinator is as usual \texttt{I}, but the output type is\begin{center}\texttt{Set[((T, S), I)]}\end{center}\noindentIf any of the runs of \textit{p} and \textit{q} fail, that is producethe empty set, then \texttt{parse} will also produce the empty set.Notice that we have now two output types for the sequence parsercombinator, because in general \textit{p} and \textit{q} might producedifferent things (for example first we recognise a number and then astring corresponding to an operator).With the shorthand notation we shall introduce later for the sequenceparser combinator, we can write for example \pcode{"a" ~ "b"}, whichis the parser combinator that first recognises the character\texttt{a} from a string and then \texttt{b}. Let us look again atthree examples of how this parser combinator processes some strings:\begin{center}\begin{tabular}{rcl}input strings & & output\medskip\\\texttt{\Grid{abcde}} & $\rightarrow$ & $\left\{((\texttt{\Grid{a}}, \texttt{\Grid{b}}), \texttt{\Grid{cde}})\right\}$\\\texttt{\Grid{bacde}} & $\rightarrow$ & $\{\}$\\\texttt{\Grid{cccde}} & $\rightarrow$ & $\{\}$\end{tabular}\end{center}\noindent In the first line we have a successful parse, because thestring starts with \texttt{ab}, which is the prefix we are lookingfor. But since the parsing combinator is constructed as sequence ofthe two simple (atomic) parsers for \texttt{a} and \texttt{b}, theresult is a nested pair of the form \texttt{((a, b), cde)}. It is\emph{not} a simple pair \texttt{(ab, cde)} as one might erroneouslyexpect. The parser returns the empty set in the other examples,because they do not fit with what the parser is supposed to parse.A slightly more complicated parser is \pcode{("a" || "b") ~ "c"} whichparses as first character either an \texttt{a} or \texttt{b}, followedby a \texttt{c}. This parser produces the following outputs.\begin{center}\begin{tabular}{rcl}input strings & & output\medskip\\\texttt{\Grid{acde}} & $\rightarrow$ & $\left\{((\texttt{\Grid{a}}, \texttt{\Grid{c}}), \texttt{\Grid{de}})\right\}$\\\texttt{\Grid{bcde}} & $\rightarrow$ & $\left\{((\texttt{\Grid{b}}, \texttt{\Grid{c}}), \texttt{\Grid{de}})\right\}$\\\texttt{\Grid{abde}} & $\rightarrow$ & $\{\}$\end{tabular}\end{center}\noindentNow consider the parser \pcode{("a" ~ "b") ~ "c"} which parses\texttt{a}, \texttt{b}, \texttt{c} in sequence. This parser producesthe following outputs.\begin{center}\begin{tabular}{rcl}input strings & & output\medskip\\\texttt{\Grid{abcde}} & $\rightarrow$ & $\left\{(((\texttt{\Grid{a}},\texttt{\Grid{b}}), \texttt{\Grid{c}}), \texttt{\Grid{de}})\right\}$\\\texttt{\Grid{abde}} & $\rightarrow$ & $\{\}$\\\texttt{\Grid{bcde}} & $\rightarrow$ & $\{\}$\end{tabular}\end{center}\noindent The second and third example fail, because something is``missing'' in the sequence we are looking for. Also notice how theresults nest with sequences: the parsed part is a nested pair of theform \pcode{((a, b), c)}. Two more examples: first consider the parser\pcode{("a" ~ "a") ~ "a"} and the input \pcode{aaaa}:\begin{center}\begin{tabular}{rcl}input string & & output\medskip\\\texttt{\Grid{aaaa}} & $\rightarrow$ & $\left\{(((\texttt{\Grid{a}}, \texttt{\Grid{a}}), \texttt{\Grid{a}}), \texttt{\Grid{a}})\right\}$\\\end{tabular}\end{center}\noindent Notice how again the results nest deeper and deeper as pairs (thelast \pcode{a} is in the unprocessed part). To consume everything ofthis string we can use the parser \pcode{(("a" ~ "a") ~ "a") ~ "a"}. Then the output is as follows:\begin{center}\begin{tabular}{rcl}input string & & output\medskip\\\texttt{\Grid{aaaa}} & $\rightarrow$ & $\left\{((((\texttt{\Grid{a}}, \texttt{\Grid{a}}), \texttt{\Grid{a}}), \texttt{\Grid{a}}), \texttt{""})\right\}$\\\end{tabular}\end{center}\noindent This is an instance where the parser consumedcompletely the input, meaning the unprocessed part is just theempty string. So if we called \pcode{parse_all}, instead of \pcode{parse},we would get back the result\[\left\{(((\texttt{\Grid{a}}, \texttt{\Grid{a}}), \texttt{\Grid{a}}), \texttt{\Grid{a}})\right\}\]\noindent where the unprocessed (empty) parts have been stripped awayfrom the pairs; everything where the second part was not empty hasbeen thrown away as well, because they representultimately-unsuccessful-parses.Note carefully that constructing a parser such \pcode{"a" || ("a" ~ "b")} will result in a typing error. The intention is that we wantto parse an \texttt{a}, or an \texttt{a} followed by a\texttt{b}. However, the first parser has as output type a singlecharacter (recall the type of \texttt{CharParser}), but the secondparser produces a pair of characters as output. The alternative parseris required to have both component parsers to have the same type---weneed to be able to build the union of two sets, which means in Scalathey need to be of the same type. We will see later how we can buildthis parser without the typing error.The next parser combinator, called \emph{semantic action}, does notactually combine smaller parsers, but applies a function to the resultof a parser. It is implemented in Scala as follows\begin{center}\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]class FunParser[I, T, S] (p: => Parser[I, T], f: T => S) extends Parser[I, S] { def parse(in: I) = for ((head, tail) <- p.parse(in)) yield (f(head), tail)}\end{lstlisting}\end{center}\noindent This parser combinator takes a parser \texttt{p}with output type \texttt{T} as one argument as well as afunction \texttt{f} with type \texttt{T => S}. The parser\texttt{p} produces sets of type \texttt{(T, I)}. Thesemantic action combinastor then applies the function\texttt{f} to all the parser outputs. Since this function is oftype \texttt{T => S}, we obtain a parser with output type\texttt{S}. Again Scala lets us introduce some shorthandnotation for this parser combinator. Therefore we will write\texttt{p ==> f} for it.What are semantic actions good for? Well, they allow is to transformthe parsed input into a datastructure we can use for furtherprocessing. A simple example would be to transform parsed charactersinto ASCII numbers. Suppose we define a function \texttt{f} (fromcharacters to ints) and use a \texttt{CharParser} for the character \texttt{c}.\begin{center}\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]val f = (c: Char) => c.toIntval c = new CharParser('c')\end{lstlisting}\end{center}\noindentThen we can run the following parsers on the input \texttt{cbd}:\begin{center}\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]c.parse("cbd")(c ==> f).parse("cbd")\end{lstlisting}\end{center}\noindentThe first line we obtain the result \texttt{Set(('c', "bd"))}, whereas the secondproduces \texttt{Set((99, "bd"))}---the character has been transformed intoan ASCII number.A slightly less contrived example is about parsing numbers (recall\texttt{NumParser} above). However, we want to do this here for strings.For this assume we have the following \texttt{RegexParser}.\begin{center} \begin{lstlisting}[language=Scala,xleftmargin=0mm, basicstyle=\small\ttfamily, numbers=none]import scala.util.matching.Regexcase class RegexParser(reg: Regex) extends Parser[String, String] { def parse(in: String) = reg.findPrefixMatchOf(in) match { case None => Set() case Some(m) => Set((m.matched, m.after.toString)) }}\end{lstlisting}\end{center}\noindentThis parser takes a regex as argument and splits up a string into aprefix and the rest according to this regex(\texttt{reg.findPrefixMatchOf} generates a match---in the successfulcase---and the corresponding strings can be extracted with\texttt{matched} and \texttt{after}). We can now define a\texttt{NumParser} for strings as follows:\begin{center}\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]val NumParser = RegexParser("[0-9]+".r)\end{lstlisting}\end{center}\noindentThis parser will recognise a number at the beginning of a string, forexample\begin{center}\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]NumParser.parse("123abc")\end{lstlisting}\end{center} \noindentproduces \texttt{Set((123,abc))}. The problem is that \texttt{123} isstill a string. We need to convert it into the corresponding\texttt{Int}. We can do this as follows\begin{center}\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none](NumParser ==> (s => s.toInt)).parse("123abc")\end{lstlisting}\end{center} \noindentThe semantic action in form of a function converts a string into an\texttt{Int}. Let us come back to semantic actions when we are goingto implement a simple calculator.\subsubsection*{Shorthand notation for parser combinators}Before we proceed, let us just explain the shorthand notation forparser combinators. Like for regular expressions, the shorthand notationwill make our life much easier when writing actual parsers.\subsubsection*{How to build parsers using parser combinators?}The beauty of parser combinators is the ease with which they can beimplemented and how easy it is to translate context-free grammars intocode (though the grammars need to be non-left-recursive). Todemonstrate this recall our context-free grammar for palindromes:\begin{plstx}[margin=3cm]: \meta{P} ::= a\cdot \meta{P}\cdot a | b\cdot \meta{P}\cdot b | a | b | \epsilon\\\end{plstx}\noindentGiven the parser combinators for alternatives and sequeneces, this grammar should bestraightforward to implement. The first idea would be\begin{center}\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]lazy val Pal : Parser[String, String] = (("a" ~ Pal ~ "a") | ("b" ~ Pal ~ "b") | "a" | "b" | "")\end{lstlisting}\end{center}\noindentUnfortunately, this does not work as it produces a typing error. Thereason is that the parsers \texttt{"a"}, \texttt{"b"} and \texttt{""}all produce strings and therefore can be put into an alternative\texttt{...| "a" | "b" | ""}. But both \pcode{"a" ~ Pal ~ "a"} and\pcode{"b" ~ Pal ~ "b"} produce pairs of the form$(((\_, \_), \_), \_)$---that is how the sequence parser combinatornests results when \pcode{\~} is used between two components. Thesolution is to use a semantic action that ``flattens'' these pairs andappends the corresponding strings, like\begin{center}\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]lazy val Pal : Parser[String, String] = (("a" ~ Pal ~ "a") ==> { case ((x, y), z) => x + y + z } | ("b" ~ Pal ~ "b") ==> { case ((x, y), z) => x + y + z } | "a" | "b" | "")\end{lstlisting}\end{center}\subsubsection*{Implementing an Interpreter}%\bigskip%takes advantage of the full generality---have a look%what it produces if we call it with the string \texttt{abc}%%\begin{center}%\begin{tabular}{rcl}%input string & & output\medskip\\%\texttt{\Grid{abc}} & $\rightarrow$ & $\left\{((\texttt{\Grid{a}}, \texttt{\Grid{b}}), \texttt{\Grid{c}})\right\}$\\%\texttt{\Grid{bbc}} & $\rightarrow$ & $\left\{((\texttt{\Grid{b}}, \texttt{\Grid{b}}), \texttt{\Grid{c}})\right\}$\\%\texttt{\Grid{aac}} & $\rightarrow$ & $\varnothing$%\end{tabular}%\end{center}\end{document}%%% Local Variables: %%% mode: latex %%% TeX-master: t%%% End: