afl-material: comparison handouts/ho06.tex

equal deleted inserted replaced

-:c6a1e19e9801
+:863e502f6a5c
 test when it is empty and ``sequentially'' take it apart. Strings and
 lists fit this bill. However, parser combinators also have their
 drawbacks. For example they require that the grammar to be parsed is
 \emph{not} left-recursive and they are efficient only when the grammar
 is unambiguous. It is the responsibility of the grammar designer to
-ensure these two properties.
+ensure these two properties hold.
 The general idea behind parser combinators is to transform the input
 into sets of pairs, like so
 \begin{center}
 $\underbrace{\text{list of tokens}}_{\text{input}}$
 $\Rightarrow$
-$\underbrace{\text{set of (parsed input, unparsed input)}}_{\text{output}}$
+$\underbrace{\text{set of (parsed part, unprocessed part)}}_{\text{output}}$
 \end{center}
 \noindent
 Given the extended effort we have spent implementing a lexer in order
-to generate list of tokens, it might be surprising that in what
+to generate lists of tokens, it might be surprising that in what
-follows we shall often use strings as input, rather than list of
+follows we shall often use strings as input, rather than lists of
-tokens. This is for making the explanation more lucid. It does not
+tokens. This is for making the explanation more lucid and for quick
-make our previous work on lexers obsolete (remember they transform a
+examples. It does not make our previous work on lexers obsolete
-string into a list of tokens). Lexers will still be needed for
+(remember they transform a string into a list of tokens). Lexers will
-building a somewhat realistic compiler.
+still be needed for building a somewhat realistic compiler.
 As mentioned above, parser combinators are relatively agnostic about what
 kind of input they process. In my Scala code I use the following
 polymorphic types for parser combinators:
 \end{center}
 \noindent That is they take as input something of type \texttt{I} and
 return a set of pairs of type \texttt{Set[(T, I)]}. Since the input
 needs to be of ``sequence-kind'', I actually have to often write
-\texttt{I <\% Seq[\_]} for the input type. This ensures the input is a
+\texttt{I <\% Seq[\_]} for the input type. This ensures the
-subtype of Scala sequences. The first component of the generated pairs
+input is a subtype of Scala sequences. The first component of the
-corresponds to what the parser combinator was able to process from the
+generated pairs corresponds to what the parser combinator was able to
-input and the second is the unprocessed part, or leftover, of the
+parse from the input and the second is the unprocessed, or
-input (therefore the type of this unprocessed part is the same as the
+leftover, part of the input (therefore the type of this unprocessed part is
-input). A parser combinator might return more than one such pair; the
+the same as the input). A parser combinator might return more than one
-idea is that there are potentially several ways of how to parse the
+such pair; the idea is that there are potentially several ways of how
-input.  As a concrete example, consider the string
+to parse the input.  As a concrete example, consider the string
 \begin{center}
 \tt\Grid{iffoo\VS testbar}
 \end{center}
 \left(\texttt{\Grid{iffoo}}\;,\; \texttt{\Grid{\VS testbar}}\right) \right\}$
 \end{center}
 \noindent where the first pair means the parser could recognise
 \texttt{if} from the input and leaves the \texttt{foo\VS testbar} as
-`unprocessed' part; in the other case it could recognise
+unprocessed part; in the other case it could recognise
 \texttt{iffoo} and leaves \texttt{\VS testbar} as unprocessed. If the
 parser cannot recognise anything from the input at all, then parser
 combinators just return the empty set $\{\}$. This will indicate
 something ``went wrong''\ldots or more precisely, nothing could be
 parsed.
 Also important to note is that the type \texttt{T} for the processed
 part is different from the input type \texttt{I} in the parse. In the
 example above is just happens to be the same. The reason for the
-potential difference is that in general we are interested in
+difference is that in general we are interested in
 transforming our input into something ``different''\ldots for example
 into a tree; or if we implement the grammar for arithmetic
 expressions, we might be interested in the actual integer number the
 arithmetic expression, say \texttt{1 + 2 * 3}, stands for. In this way
 we can use parser combinators to implement relatively easily a
 calculator, for instance.
 The main idea of parser combinators is that we can easily build parser
 combinators out of smaller components following very closely the
-structure of a grammar. In order to implement this in an
+structure of a grammar. In order to implement this in a
-object-oriented programming language, like Scala, we need to specify
+functional/object-oriented programming language, like Scala, we need
-an abstract class for parser combinators. This abstract class states
+to specify an abstract class for parser combinators. In the abstract
-that the function \texttt{parse} takes an argument of type \texttt{I}
+class we specify that \texttt{I} is the \emph{input type} of the
-and returns a set of type \mbox{\texttt{Set[(T, I)]}}.
+parser combinator and that \texttt{T} is the \emph{ouput type}.  This
+implies that the function \texttt{parse} takes an argument of type
+\texttt{I} and returns a set of type \mbox{\texttt{Set[(T, I)]}}.
 \begin{center}
 \begin{lstlisting}[language=Scala]
 abstract class Parser[I, T] {
 def parse(in: I) : Set[(T, I)]
 yield head
 }
 \end{lstlisting}
 \end{center}
-\noindent It is the obligation in each instance (parser combinator) to
+\noindent It is the obligation in each instance of this class to
 supply an implementation for \texttt{parse}.  From this function we
 can then ``centrally'' derive the function \texttt{parse\_all}, which
 just filters out all pairs whose second component is not empty (that
 is has still some unprocessed part). The reason is that at the end of
 the parsing we are only interested in the results where all the input
 \end{tabular}
 \end{center}
 \noindent We receive in the first two cases a successful
 output (that is a non-empty set). In each case, either
-\pcode{a} or \pcode{b} is in the processed part, and
+\pcode{a} or \pcode{b} is in the parsed part, and
 \pcode{cde} in the unprocessed part. Clearly this parser cannot
 parse anything with \pcode{ccde}, therefore the empty
 set is returned.
 A bit more interesting is the \emph{sequence parser combinator}. Given
 two parsers, say again, $p$ and $q$, we want to apply first the input
 to $p$ producing a set of pairs; then apply $q$ to all the unparsed
 parts in the pairs; and then combine the results. Mathematically we would
-write something like this for the result set of pairs:
+write something like this for the set of pairs:
 \begin{center}
 \begin{tabular}{lcl}
 $\{((\textit{output}_1, \textit{output}_2), u_2)$ & $\,|\,$ &
 $(\textit{output}_1, u_1) \in p(\text{input})
 \end{tabular}
 \end{center}
 \noindent Notice that the $p$ will first be run on the input,
 producing pairs of the form $(\textit{output}_1, u_1)$ where the $u_1$
-stands for the unprocessed, or leftover, parts pf $p$. We want that
+stands for the unprocessed, or leftover, parts of $p$. We want that
 $q$ runs on all these unprocessed parts $u_1$. Therefore these
 unprocessed parts are fed into the second parser $q$. The overall
 result of the sequence parser combinator is pairs of the form
 $((\textit{output}_1, \textit{output}_2), u_2)$. This means the
-unprocessed part of the sequqnce p[arser combinator is the unprocessed
+unprocessed part of the sequqnce parser combinator is the unprocessed
-part the second parser $q$ leaves as leftover. The processed parts of
+part the second parser $q$ leaves as leftover. The parsed parts of the
-of the component parsers is a pair consisting of the outputs of $p$
+component parsers are combined in a pair, namely
-and $q$, namely $(\textit{output}_1, \textit{output}_2)$. This
+$(\textit{output}_1, \textit{output}_2)$. The reason is we want to
-behaviour can be implemented in Scala as follows:
+know what $p$ and $q$ were able to parse. This behaviour can be
+implemented in Scala as follows:
 \begin{center}
 \begin{lstlisting}[language=Scala,numbers=none]
 class SeqParser[I, T, S]
 (p: => Parser[I, T],
 }
 \end{lstlisting}
 \end{center}
 \noindent This parser takes again as arguments two parsers, \texttt{p}
-and \texttt{q}. It implements \texttt{parse} as follows: let first run
+and \texttt{q}. It implements \texttt{parse} as follows: first run the
-the parser \texttt{p} on the input producing a set of pairs
+parser \texttt{p} on the input producing a set of pairs
 (\texttt{output1}, \texttt{u1}). The \texttt{u1} stands for the
-unprocessed parts left over by \texttt{p}. Let then \texttt{q} run on these
+unprocessed parts left over by \texttt{p} (recall that there can be
-unprocessed parts producing again a set of pairs. The output of the
+several such pairs). Let then \texttt{q} run on these unprocessed
-sequence parser combinator is then a set containing pairs where the
+parts producing again a set of pairs. The output of the sequence
-first components are again pairs, namely what the first parser could
+parser combinator is then a set containing pairs where the first
-parse together with what the second parser could parse; the second
+components are again pairs, namely what the first parser could parse
-component is the unprocessed part left over after running the second
+together with what the second parser could parse; the second component
-parser \texttt{q}. Therefore the input type of the sequence parser
+is the unprocessed part left over after running the second parser
-combinator is as usual \texttt{I}, but the output type is
+\texttt{q}. Note that the input type of the sequence parser combinator
+is as usual \texttt{I}, but the output type is
 \begin{center}
 \texttt{(T, S)}
 \end{center}
 \noindent
-This means \texttt{parse} in the sequence parser combinator returns
+Consequently, the function \texttt{parse} in the sequence parser
-sets of type \texttt{Set[((T, S), I)]}.  Notice that we have
+combinator returns sets of type \texttt{Set[((T, S), I)]}.  That means
-essentially two output types for the sequence parser combinator
+we have essentially two output types for the sequence parser
-(packaged in a pair), because in general \textit{p} and \textit{q}
+combinator (packaged in a pair), because in general \textit{p} and
-might produce different things (for example first we recognise a
+\textit{q} might produce different things (for example we recognise a
-number and then a string corresponding to an operator).  If any of the
+number with \texttt{p} and then with \texttt{q} a string corresponding
-runs of \textit{p} and \textit{q} fail, that is produce the empty set,
+to an operator).  If any of the runs of \textit{p} and \textit{q}
-then \texttt{parse} will also produce the empty set.
+fail, that is produce the empty set, then \texttt{parse} will also
+produce the empty set.
 With the shorthand notation we shall introduce later for the sequence
 parser combinator, we can write for example \pcode{"a" ~ "b"}, which
 is the parser combinator that first recognises the character
 \texttt{a} from a string and then \texttt{b}. Let us look again at
-three examples of how this parser combinator processes some strings:
+some examples of how this parser combinator processes some strings:
 \begin{center}
 \begin{tabular}{rcl}
 input strings & & output\medskip\\
 \texttt{\Grid{abcde}} & $\rightarrow$ & $\left\{((\texttt{\Grid{a}}, \texttt{\Grid{b}}), \texttt{\Grid{cde}})\right\}$\\
 \noindent The second and third example fail, because something is
 ``missing'' in the sequence we are looking for. The first succeeds but
 notice how the results nest with sequences: the parsed part is a
 nested pair of the form \pcode{((a, b), c)}. If we nest the sequence
-parser differently, for example \pcode{"a" ~ ("b" ~ "c")}, then also
+parser differently, say \pcode{"a" ~ ("b" ~ "c")}, then also
 our output pairs nest differently
 \begin{center}
 \begin{tabular}{rcl}
 input strings & & output\medskip\\
 \texttt{\Grid{aaaa}} & $\rightarrow$ &
 $\left\{(((\texttt{\Grid{a}}, \texttt{\Grid{a}}), \texttt{\Grid{a}}), \texttt{\Grid{a}})\right\}$\\
 \end{tabular}
 \end{center}
-\noindent Notice how again the results nest deeper and deeper as pairs (the
+\noindent Notice again how the results nest deeper and deeper as pairs (the
 last \pcode{a} is in the unprocessed part). To consume everything of
 this string we can use the parser \pcode{(("a" ~ "a") ~ "a") ~
 "a"}. Then the output is as follows:
 \begin{center}
 nesting of the component parsers.
 Consider also carefully that constructing a parser such \pcode{"a" |
 ("a" ~ "b")} will result in a typing error. The intention with this
-parser is that we want to parse an \texttt{a}, or an \texttt{a}
+parser is that we want to parse either an \texttt{a}, or an \texttt{a}
 followed by a \texttt{b}. However, the first parser has as output type
 a single character (recall the type of \texttt{CharParser}), but the
 second parser produces a pair of characters as output. The alternative
 parser is required to have both component parsers to have the same
-type---we need to be able to build the union of two sets, which means
+type---the reason is that we need to be able to build the union of two
-in Scala they need to be of the same type.  Since ther are not, there
+sets, which requires in Scala that the sets have the same type.  Since
-is a typing error in this example.  We will see later how we can build
+they are not in this case, there is a typing error.  We will see later
-this parser without the typing error.
+how we can build this parser without the typing error.
 The next parser combinator, called \emph{semantic action}, does not
-actually combine smaller parsers, but applies a function to the result
+actually combine two smaller parsers, but applies a function to the result
 of a parser.  It is implemented in Scala as follows
 \begin{center}
 \begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
 class FunParser[I, T, S]
 produces sets of type \texttt{Set[(T, I)]}. The semantic action
 combinator then applies the function \texttt{f} to all the `processed'
 parser outputs. Since this function is of type \texttt{T => S}, we
 obtain a parser with output type \texttt{S}. Again Scala lets us
 introduce some shorthand notation for this parser
-combinator. Therefore we will write \texttt{p ==> f} for it.
+combinator. Therefore we will write short \texttt{p ==> f} for it.
 What are semantic actions good for? Well, they allow you to transform
 the parsed input into datastructures you can use for further
-processing. A simple example would be to transform parsed characters
+processing. A simple (contrived) example would be to transform parsed
-into ASCII numbers. Suppose we define a function \texttt{f} (from
+characters into ASCII numbers. Suppose we define a function \texttt{f}
-characters to ints) and use a \texttt{CharParser} for parsing
+(from characters to \texttt{Int}s) and use a \texttt{CharParser} for parsing
 the character \texttt{c}.
 \begin{center}
 \begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
 val f = (c: Char) => c.toInt
 val c = new CharParser('c')
 In the first line we obtain the expected result \texttt{Set(('c',
 "bd"))}, whereas the second produces \texttt{Set((99, "bd"))}---the
 character has been transformed into an ASCII number.
 A slightly less contrived example is about parsing numbers (recall
-\texttt{NumParser} above). However, we want to do this here for strings.
+\texttt{NumParser} above). However, we want to do this here for
-For this assume we have the following \texttt{RegexParser}.
+strings, not for tokens.  For this assume we have the following
+(atomic) \texttt{RegexParser}.
 \begin{center}
 \begin{lstlisting}[language=Scala,xleftmargin=0mm,
 basicstyle=\small\ttfamily, numbers=none]
 import scala.util.matching.Regex
 \noindent
 This parser takes a regex as argument and splits up a string into a
 prefix and the rest according to this regex
 (\texttt{reg.findPrefixMatchOf} generates a match---in the successful
 case---and the corresponding strings can be extracted with
-\texttt{matched} and \texttt{after}). Using this parser we can define a
+\texttt{matched} and \texttt{after}). The input and output type for
-\texttt{NumParser} for strings as follows:
+this parser is \texttt{String}. Using \texttt{RegexParser} we can
+define a \texttt{NumParser} for \texttt{Strings} to \texttt{Int} as
+follows:
 \begin{center}
 \begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
 val NumParser = RegexParser("[0-9]+".r)
 \end{lstlisting}
 \end{center}
 \noindent
-This parser will recognise a number at the beginning of a string, for
+This parser will recognise a number at the beginning of a string. For
 example
 \begin{center}
 \begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
 NumParser.parse("123abc")
 \end{lstlisting}
 \end{center}
 \noindent
 The function in the semantic action converts a string into an
-\texttt{Int}. Let us come back to semantic actions when we are going
+\texttt{Int}. Now \texttt{parse} generates \texttt{Set((123,abc))},
-to implement actual context-free gammars.
+but this time \texttt{123} is an \texttt{Int}. Let us come back to
+semantic actions when we are going to implement actual context-free
+gammars.
 \subsubsection*{Shorthand notation for parser combinators}
 Before we proceed, let us just explain the shorthand notation for
 parser combinators. Like for regular expressions, the shorthand notation
 will make our life much easier when writing actual parsers. We can define
-some implicits which allow us to write \pcode{p | q}, \pcode{p ~ q} and
+some implicits which allow us to write
-\pcode{p ==> f} as well as to use plain strings for specifying simple
-string parsers.
+\begin{center}
+\begin{tabular}{ll}
+\pcode{p | q} & alternative parser\\
+\pcode{p ~ q} & sequence parser\\
+\pcode{p ==> f} & semantic action parser
+\end{tabular}
+\end{center}
+\noindent
+as well as to use plain strings for specifying simple string parsers.
 The idea is that this shorthand notation allows us to easily translate
 context-free grammars into code. For example recall our context-free
 grammar for palindromes:
 \begin{plstx}[margin=3cm]
-: \meta{P} ::=  a\cdot \meta{P}\cdot a | b\cdot \meta{P}\cdot b | a | b | \epsilon\\
+: \meta{Pal} ::=  a\cdot \meta{Pal}\cdot a | b\cdot \meta{Pal}\cdot b | a | b | \epsilon\\
 \end{plstx}
 \noindent
 Each alternative in this grammar translates into an alternative parser
 combinator.  The $\cdot$ can be translated to a sequence parser
 \subsubsection*{How to build parsers using parser combinators?}
 The beauty of parser combinators is the ease with which they can be
 implemented and how easy it is to translate context-free grammars into
 code (though the grammars need to be non-left-recursive). To
-demonstrate this recall the grammar for palindromes from above.
+demonstrate this consider again the grammar for palindromes from above.
 The first idea would be to translate it into the following code
 \begin{center}
 \begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
 lazy val Pal : Parser[String, String] =
 \noindent
 Unfortunately, this does not quite work yet as it produces a typing
 error. The reason is that the parsers \texttt{"a"}, \texttt{"b"} and
 \texttt{""} all produce strings as output type and therefore can be
 put into an alternative \texttt{...| "a" | "b" | ""}. But both
-\pcode{"a" ~ Pal ~ "a"} and \pcode{"b" ~ Pal ~ "b"} produce pairs of
+sequence parsers \pcode{"a" ~ Pal ~ "a"} and \pcode{"b" ~ Pal ~ "b"}
-the form $(((\_, \_), \_), \_)$---that is how the sequence parser
+produce pairs of the form
-combinator nests results when \pcode{\~} is used between two
-components. The solution is to use a semantic action that ``flattens''
+\begin{center}
-these pairs and appends the corresponding strings, like
+(((\texttt{a}-part, \texttt{Pal}-part), \texttt{a}-part), unprocessed part)
+\end{center}
+\noindent That is how the
+sequence parser combinator nests results when \pcode{\~} is used
+between two components. The solution is to use a semantic action that
+``flattens'' these pairs and appends the corresponding strings, like
 \begin{center}
 \begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
 lazy val Pal : Parser[String, String] =
 (("a" ~ Pal ~ "a") ==> { case ((x, y), z) => x + y + z } |
 "a" | "b" | "")
 \end{lstlisting}
 \end{center}
 \noindent
-Now in all cases we have strings as output type for the parser variants.
+How does this work? Well, recall again what the pairs look like for
-The semantic action
+the parser \pcode{"a" ~ Pal ~ "a"}.  The pattern in the semantic
+action matches the nested pairs (the \texttt{x} with the
+\texttt{a}-part and so on).  Unfortunately when we have such nested
-\noindent
+pairs, Scala requires us to define the function using the
-Important to note is that we must define \texttt{Pal}-parser as a
+\pcode{case}-syntax
-\emph{lazy} value.
+\begin{center}
+\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
+{ case ((x, y), z) => ... }
+\end{lstlisting}
+\end{center}
+\noindent
+If we have more sequence parser combinators or have them differently nested,
+then the pattern in the semantic action needs to be adjusted accordingly.
+The action we implement above is to concatenate all three strings, which
+means after the semantic action is applied the output type of the parser
+is \texttt{String}, which means it fits with the alternative parsers
+\texttt{...| "a" | "b" | ""}.
+If we run the parser above with \pcode{Pal.parse_all("abaaaba")} we obtain
+as result the \pcode{Set(abaaaba)}, which indicates that the string is a palindrom
+(an empty set would mean something is wrong). But also notice what the
+intermediate results are generated by \pcode{Pal.parse("abaaaba")}
+\begin{center}
+\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
+Set((abaaaba,""),(aba,aaba), (a,baaaba), ("",abaaaba))
+\end{lstlisting}
+\end{center}
+\noindent
+That there are more than one output might be slightly unexpected, but
+can be explained as follows: the pairs represent all possible
+(partial) parses of the string \pcode{"abaaaba"}. The first pair above
+correesponds to a complete parse (all output is consumed) and this is
+what \pcode{Pal.parse_all} returns. The second pair is a small
+``sub-palindrome'' that can also be parsed, but the parse fails with
+the rest \pcode{aaba}, which is therefore left as unprocessed. The
+third one is an attempt to parse the whole string with the
+single-character parser \pcode{a}. That of course only partially
+succeeds, by leaving \pcode{"baaaba"} as the unprocessed
+part. Finally, since we allow the empty string to be a palindrom we
+also obtain the last pair, where actually nothing is consumed from the
+input string. While all this works as intended, we need to be careful
+with this (especially with including the \pcode{""} parser in our
+grammar): if during parsing the set of parsing attempts gets too big,
+then the parsing process can become very slow as the potential
+candidates for applying rules can snowball.
+Important is also to note is that we must define the
+\texttt{Pal}-parser as a \emph{lazy} value in Scala. Look again at the
+code: \texttt{Pal} occurs on the right-hand side of the definition. If we had
+just written
+\begin{center}
+\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
+val Pal : Parser[String, String] =  ...rhs...
+\end{lstlisting}
+\end{center}
+\noindent
+then Scala before making this assignemnt to \texttt{Pal} attempts to
+find out what the expression on the right-hand side evaluates to. This
+is straightforward in case of simple expressions \texttt{2 + 3}, but
+the expression above contains \texttt{Pal} in the right-hand
+side. Without \pcode{lazy} it would try to evaluate what \texttt{Pal}
+evaluates to and start a new recursion, which means it falls into an
+infinite loop. The definition of \texttt{Pal} is recursive and the
+\pcode{lazy} key-word prevents it from being fully evaluated. Therefore
+whenever we want to define a recursive parser we have to write
+\begin{center}
+\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
+lazy val SomeParser : Parser[...,...] =  ...rhs...
+\end{lstlisting}
+\end{center}
+\noindent That was not necessary for our atomic parsers, like
+\texttt{RegexParser} or \texttt{CharParser}, because they are not recursive.
+Note that this is also the reason why we had to write
+\begin{center}
+\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
+class AltParser[I, T]
+(p: => Parser[I, T],
+q: => Parser[I, T]) extends Parser[I, T] {...}
+class SeqParser[I, T, S]
+(p: => Parser[I, T],
+q: => Parser[I, S]) extends Parser[I, (T, S)] {...}
+\end{lstlisting}
+\end{center}
+\noindent where the \texttt{\textbf{\textcolor{codepurple}{=>}}} in front of
+the argument types for \texttt{p} and \texttt{q} prevent Scala from
+evaluating the arguments. Normally, Scala would first evaluate what
+kind of parsers \texttt{p} and \texttt{q} are, and only then generate
+the alternative parser combinator, repsectively sequence parser
+combinator. Since the argumants can be recursive parsers, such as
+\texttt{Pal}, this would lead again to an infinite loop.
+As a final example in this section, let us consider the grammar for
+well-nested parentheses:
+\begin{plstx}[margin=3cm]
+: \meta{P} ::=  (\cdot \meta{P}\cdot ) \cdot \meta{P} | \epsilon\\
+\end{plstx}
+\noindent
+Let us assume we want to not just recognise strings of
+well-nested parentheses but also transfrom round parentheses
+into curly braces. We can do this by using a semantic
+action:
+\begin{center}
+\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily,
+xleftmargin=0mm, numbers=none]
+lazy val P : Parser[String, String] =
+"(" ~ P ~ ")" ~ P ==> { case (((_,x),_),y) => "{" + x + "}" + y } | ""
+\end{lstlisting}
+\end{center}
+\noindent
+Here we define a function where which ignores the parentheses in the
+pairs, but replaces them in the right places with curly braces when
+assembling the new string in the right-hand side. If we run
+\pcode{P.parse_all("(((()()))())")} we obtain
+\texttt{Set(\{\{\{\{\}\{\}\}\}\{\}\})} as expected.
 \subsubsection*{Implementing an Interpreter}
 %\bigskip
 %takes advantage of the full generality---have a look

changeset 591	863e502f6a5c
parent 590	c6a1e19e9801
child 592	0d309fafa9f0