handouts/ho06.tex
changeset 587 5ddedcd92d84
parent 586 451a95e1bc25
child 588 a4646557016d
--- a/handouts/ho06.tex	Wed Oct 24 16:03:07 2018 +0100
+++ b/handouts/ho06.tex	Wed Oct 24 20:37:37 2018 +0100
@@ -9,17 +9,21 @@
 \section*{Handout 6 (Parser Combinators)}
 
 This handout explains how \emph{parser combinators} work and how they
-can be implemented in Scala. Their distinguishing feature is that they
-are very easy to implement (admittedly it is only easy in a functional
-programming language). However, parser combinators require that the
-grammar to be parsed is \emph{not} left-recursive and they are
-efficient only when the grammar is unambiguous. It is the
-responsibility of the grammar designer to ensure these two properties.
+can be implemented in Scala. Their most distinguishing feature is that
+they are very easy to implement (admittedly it is only easy in a
+functional programming language).  Another good point of parser
+combinators is that they can deal with any kind of input as long as
+this input is of ``sequence-kind'', for example a string or a list of
+tokens. The only two properties of the input we need is to be able to
+test when it is empty and ``sequentially'' take it apart. Strings and
+lists fit this bill. However, parser combinators also have their
+drawbacks. For example they require that the grammar to be parsed is
+\emph{not} left-recursive and they are efficient only when the grammar
+is unambiguous. It is the responsibility of the grammar designer to
+ensure these two properties.
 
-Another good point of parser combinators is that they can deal with any kind
-of input as long as this input of ``sequence-kind'', for example a
-string or a list of tokens. The general idea behind parser combinators
-is to transform the input into sets of pairs, like so
+The general idea behind parser combinators is to transform the input
+into sets of pairs, like so
 
 \begin{center}
 $\underbrace{\text{list of tokens}}_{\text{input}}$ 
@@ -27,32 +31,33 @@
 $\underbrace{\text{set of (parsed input, unparsed input)}}_{\text{output}}$
 \end{center} 
 
-\noindent As said, the input can be anything as long as it is a
-``sequence''. The only property of the input we need is to be able to
-test when it is empty. Obviously we can do this for strings and lists.
-For more lucidity we shall below often use strings as input in order
-to illustrate matters. However, this does not make our previous work
-on lexers obsolete (remember they transform a string into a list of
-tokens). Lexers will still be needed to build a somewhat realistic
+\noindent
+Given the extended effort we have spent in order to implement a lexer
+in order to generate list of tokens, it might be surprising that in
+what follows we shall often use strings as input. This is for making
+the explanation more lucid. It does not make our previous work on
+lexers obsolete (remember they transform a string into a list of
+tokens). Lexers will still be needed for building a somewhat realistic
 compiler.
 
-
-In my Scala code I use the following polymorphic types for parser combinators:
+But as said, parser combinators are relatively agnostic about what
+kind of input they process. In my Scala code I use the following
+polymorphic types for parser combinators:
 
 \begin{center}
 input:\;\; \texttt{I}  \qquad output:\;\; \texttt{T}
 \end{center}
 
-\noindent that is they take as input something of type \texttt{I} and
-return a set of pairs of \texttt{Set[(T, I)]}. Since the input needs
-to be of ``sequence-kind'' I actually have to often write \texttt{I
-  <\% Seq[\_]} for the input type in order to indicate it is a
+\noindent That is they take as input something of type \texttt{I} and
+return a set of pairs of type \texttt{Set[(T, I)]}. Since the input needs
+to be of ``sequence-kind'', I actually have to often write \texttt{I
+  <\% Seq[\_]} for the input type. This ensures the input is a
 subtype of Scala sequences. The first component of the generated pairs
 corresponds to what the parser combinator was able to process from the
 input and the second is the unprocessed part of the input (therefore
 the type of this unprocessed part is the same as the input). As we
 shall see shortly, a parser combinator might return more than one such
-pair; the idea being that there are potentially several ways of how to
+pair; the idea is that there are potentially several ways of how to
 parse the input.  As a concrete example, consider the string
 
 \begin{center}
@@ -68,24 +73,24 @@
            \left(\texttt{\Grid{iffoo}}\;,\; \texttt{\Grid{\VS testbar}}\right) \right\}$
 \end{center}
 
-\noindent where the first pair means the parser could
-recognise \texttt{if} from the input and leaves the rest as
-`unprocessed' as the second component of the pair; in the
-other case it could recognise \texttt{iffoo} and leaves
-\texttt{\VS testbar} as unprocessed. If the parser cannot
-recognise anything from the input, then parser combinators just
-return the empty set $\{\}$. This will indicate something
-``went wrong''\ldots or more precisely, nothing could be
+\noindent where the first pair means the parser could recognise
+\texttt{if} from the input and leaves the rest as `unprocessed' as the
+second component of the pair; in the other case it could recognise
+\texttt{iffoo} and leaves \texttt{\VS testbar} as unprocessed. If the
+parser cannot recognise anything from the input at all, then parser
+combinators just return the empty set $\{\}$. This will indicate
+something ``went wrong''\ldots or more precisely, nothing could be
 parsed.
 
 Also important to note is that the type \texttt{T} for the processed
-part is different from the input type. The reason is that in general
-we are interested in transform our input into something
-``different''\ldots for example into a tree, or if we implement the
-grammar for arithmetic expressions we might be interested in the
+part is different from the input type. In the example above is just
+happens to be the same. The reason for the difference is that in
+general we are interested in transforming our input into something
+``different''\ldots for example into a tree; or if we implement the
+grammar for arithmetic expressions, we might be interested in the
 actual integer number the arithmetic expression, say \texttt{1 + 2 *
   3}, stands for. In this way we can use parser combinators to
-implement relatively easily a calculator.
+implement relatively easily a calculator, for instance.
 
 The main idea of parser combinators is that we can easily build parser
 combinators out of smaller components following very closely the
@@ -130,23 +135,24 @@
 
 \noindent
 The input type of this simple parser combinator for characters is
-\texttt{String} and the output type \mbox{\texttt{Set[(Char, String)]}}. 
+\texttt{String} and the output type is \texttt{Char}. This means
+\texttt{parse} returns  \mbox{\texttt{Set[(Char, String)]}}. 
 The code in Scala is as follows:
 
 \begin{center}
 \begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
 case class CharParser(c: Char) extends Parser[String, Char] {
-  def parse(sb: String) = 
-    if (sb.head == c) Set((c, sb.tail)) else Set()
+  def parse(in: String) = 
+    if (in.head == c) Set((c, in.tail)) else Set()
 }
 \end{lstlisting}
 \end{center}
 
-\noindent You can see the \texttt{parse} function tests whether the
-first character of the input string \texttt{sb} is equal to
+\noindent You can see the \texttt{parse} tests whether the
+first character of the input string \texttt{in} is equal to
 \texttt{c}. If yes, then it splits the string into the recognised part
-\texttt{c} and the unprocessed part \texttt{sb.tail}. In case
-\texttt{sb} does not start with \texttt{c} then the parser returns the
+\texttt{c} and the unprocessed part \texttt{in.tail}. In case
+\texttt{in} does not start with \texttt{c} then the parser returns the
 empty set (in Scala \texttt{Set()}). Since this parser recognises
 characters and just returns characters as the processed part, the
 output type of the parser is \texttt{Char}.
@@ -174,16 +180,105 @@
 rather extracts the string \texttt{s} from the token and converts it
 into an integer. The hope is that the lexer did its work well and this
 conversion always succeeds. The consequence of this is that the output
-type for this parser is \texttt{Int}. Such a conversion would be
-needed if we want to implement a simple calculator program.
+type for this parser is \texttt{Int}, not \texttt{Token}. Such a
+conversion would be needed if we want to implement a simple calculator
+program.
 
 
 These simple parsers that just look at the input and do a simple
 transformation are often called \emph{atomic} parser combinators.
 More interesting are the parser combinators that build larger parsers
-out of smaller component parsers. For example the \emph{alternative
+out of smaller component parsers. There are three such parser
+combinators that can be implemented generically. The \emph{alternative
   parser combinator} is as follows: given two parsers, say, $p$ and
 $q$, we apply both parsers to the input (remember parsers are
+functions) and combine the output (remember they are sets of pairs):
+ 
+\begin{center}
+$p(\text{input}) \cup q(\text{input})$
+\end{center}
+
+\noindent In Scala we can implement alternative parser
+combinator as follows
+
+\begin{center}
+\begin{lstlisting}[language=Scala, numbers=none]
+class AltParser[I, T]
+       (p: => Parser[I, T], 
+        q: => Parser[I, T]) extends Parser[I, T] {
+  def parse(in: I) = p.parse(in) ++ q.parse(in)
+}
+\end{lstlisting}
+\end{center}
+
+\noindent The types of this parser combinator are again generic (we
+have \texttt{I} for the input type, and \texttt{T} for the output
+type). The alternative parser builds a new parser out of two existing
+parsers \texttt{p} and \texttt{q} given as arguments.  Both parsers
+need to be able to process input of type \texttt{I} and return in
+\texttt{parse} the same output type \texttt{Set[(T,
+  I)]}.\footnote{There is an interesting detail of Scala, namely the
+  \texttt{=>} in front of the types of \texttt{p} and \texttt{q}. They
+  will prevent the evaluation of the arguments before they are
+  used. This is often called \emph{lazy evaluation} of the
+  arguments. We will explain this later.}  The alternative parser
+should run the input with the first parser \texttt{p} (producing a set
+of pairs) and then run the same input with \texttt{q} (producing
+another set of pairs).  The result should be then just the union of
+both sets, which is the operation \texttt{++} in Scala.
+
+The alternative parser combinator allows us to construct a parser that
+parses either a character \texttt{a} or \texttt{b} using the
+\texttt{CharParser} shown above. For this we can write
+
+\begin{center}
+\begin{lstlisting}[language=Scala, numbers=none]
+new AltParser(CharParser('a'), CharParser('b'))
+\end{lstlisting}
+\end{center}
+
+\noindent Later on we will use Scala mechanism for introducing some
+more readable shorthand notation for this, like \texttt{"a" ||
+  "b"}. Let us look in detail at what this parser combinator produces
+with some sample strings
+
+\begin{center}
+\begin{tabular}{rcl}
+input strings & & output\medskip\\
+\texttt{\Grid{acde}} & $\rightarrow$ & $\left\{(\texttt{\Grid{a}}, \texttt{\Grid{cde}})\right\}$\\
+\texttt{\Grid{bcde}} & $\rightarrow$ & $\left\{(\texttt{\Grid{b}}, \texttt{\Grid{cde}})\right\}$\\
+\texttt{\Grid{ccde}} & $\rightarrow$ & $\{\}$
+\end{tabular}
+\end{center}
+
+\noindent We receive in the first two cases a successful
+output (that is a non-empty set). In each case, either
+\pcode{a} or \pcode{b} is in the processed part, and
+\pcode{cde} in the unprocessed part. Clearly this parser cannot
+parse anything with \pcode{ccde}, therefore the empty
+set is returned.
+
+A bit more interesting is the \emph{sequence parser combinator}. Given
+two parsers, say again, $p$ and $q$, we want to apply first the input
+to $p$ producing a set of pairs; then apply $q$ to all the unparsed
+parts in the pairs; and then combine the results. Mathematically we would
+write something like this for the expected set of pairs:
+
+\begin{center}
+\begin{tabular}{lcl}
+$\{((\textit{output}_1, \textit{output}_2), u_2)$ & $\,|\,$ & 
+$(\textit{output}_1, u_1) \in p(\text{input}) 
+\;\wedge\;$\\
+&& $(\textit{output}_2, u_2) \in q(u_1)\}$
+\end{tabular}
+\end{center}
+
+\noindent Notice that the $p$ wil first be run on the input, producing
+pairs of the form $(\textit{output}_1, u_1)$ where the $u_1$ stands
+for the unprocessed, or left-over, parts. We want that $q$ runs on all
+these unprocessed parts $u_1$. This again will produce some
+processed part , $p$ and
+$q$, we apply both parsers to the input (remember parsers are
 functions) and combine the output (remember they are sets of pairs)
  
 \begin{center}
@@ -198,7 +293,7 @@
 class AltParser[I, T]
        (p: => Parser[I, T], 
         q: => Parser[I, T]) extends Parser[I, T] {
-  def parse(sb: I) = p.parse(sb) ++ q.parse(sb)
+  def parse(in: I) = p.parse(in) ++ q.parse(in)
 }
 \end{lstlisting}
 \end{center}
@@ -262,97 +357,14 @@
 \end{tabular}
 \end{center}
 
-\noindent Notice that the $p$ wil first be run on the input, producing
-pairs of the form $(\textit{output}_1, u_1)$ where the $u_1$ stands
-for the unprocessed, or left-over, parts. We want that $q$ runs on all
-these unprocessed parts $u_1$. This again will produce some
-processed part , $p$ and
-$q$, we apply both parsers to the input (remember parsers are
-functions) and combine the output (remember they are sets of pairs)
- 
-\begin{center}
-$p(\text{input}) \cup q(\text{input})$
-\end{center}
-
-\noindent In Scala we would implement alternative parser
-combinator as follows
-
-\begin{center}
-\begin{lstlisting}[language=Scala, numbers=none]
-class AltParser[I, T]
-       (p: => Parser[I, T], 
-        q: => Parser[I, T]) extends Parser[I, T] {
-  def parse(sb: I) = p.parse(sb) ++ q.parse(sb)
-}
-\end{lstlisting}
-\end{center}
-
-\noindent The types of this parser combinator are again generic (we
-just have \texttt{I} for the input type, and \texttt{T} for the output
-type). The alternative parser builds a new parser out of two existing
-parsers \texttt{p} and \texttt{q}.  Both need to be able to process
-input of type \texttt{I} and return the same output type
-\texttt{Set[(T, I)]}.\footnote{There is an interesting detail of
-  Scala, namely the \texttt{=>} in front of the types of \texttt{p}
-  and \texttt{q}. They will prevent the evaluation of the arguments
-  before they are used. This is often called \emph{lazy evaluation} of
-  the arguments. We will explain this later.}  Therefore the output
-type of this parser is \texttt{T}. The alternative parser should run
-the input with the first parser \texttt{p} (producing a set of pairs)
-and then run the same input with \texttt{q} (producing another set of
-pairs).  The result should be then just the union of both sets, which
-is the operation \texttt{++} in Scala.
-
-The alternative parser combinator already allows us to
-construct a parser that parses either a character \texttt{a}
-or \texttt{b}, as
-
-\begin{center}
-\begin{lstlisting}[language=Scala, numbers=none]
-new AltParser(CharParser('a'), CharParser('b'))
-\end{lstlisting}
-\end{center}
-
-\noindent Later on we will use again Scala mechanism for introducing some
-more readable shorthand notation for this, like \texttt{"a" || "b"}. Let us look in detail at what this parser combinator produces with some sample strings 
-
-\begin{center}
-\begin{tabular}{rcl}
-input strings & & output\medskip\\
-\texttt{\Grid{acde}} & $\rightarrow$ & $\left\{(\texttt{\Grid{a}}, \texttt{\Grid{cde}})\right\}$\\
-\texttt{\Grid{bcde}} & $\rightarrow$ & $\left\{(\texttt{\Grid{b}}, \texttt{\Grid{cde}})\right\}$\\
-\texttt{\Grid{ccde}} & $\rightarrow$ & $\{\}$
-\end{tabular}
-\end{center}
-
-\noindent We receive in the first two cases a successful
-output (that is a non-empty set). In each case, either
-\pcode{a} or \pcode{b} is in the processed part, and
-\pcode{cde} in the unprocessed part. Clearly this parser cannot
-parse anything in the string \pcode{ccde}, therefore the empty
-set.
-
-A bit more interesting is the \emph{sequence parser combinator}. Given
-two parsers, say again, $p$ and $q$, we want to apply first the input
-to $p$ producing a set of pairs; then apply $q$ to all the unparsed
-parts in the pairs; and then combine the results like
-
-\begin{center}
-\begin{tabular}{lcl}
-$\{((\textit{output}_1, \textit{output}_2), u_2)$ & $\;|\;$ & 
-$(\textit{output}_1, u_1) \in p(\text{input}) 
-\;\wedge\;$\\
-&& $(\textit{output}_2, u_2) \in q(u_1)\}$
-\end{tabular}
-\end{center}
-
-\noindent Notice that the $p$ wil first be run on the input, producing
-pairs of the form $\textit{output}_1$ and unprocessed part $u_1$. The
+\noindent Notice that the $p$ will first be run on the input,
+producing pairs of the form $\textit{output}_1$ and unprocessed part
+$u_1$. This unprocessed part is fed into the second parser $q$. The
 overall result of the sequence parser combinator is pairs of the form
 $((\textit{output}_1, \textit{output}_2), u_2)$. This means the
-unprocessed parts of both parsers is the unprocessed parts the second
-parser $q$ produces as left-over. The processed parts of both parsers
-is just the pair of the outputs
+unprocessed part of both parsers is the unprocessed part the second
+parser $q$ produces leaves as left-over. The processed parts of both
+parsers is a pair consisting of the outputs of $p$ and $q$, namely
 $(\textit{output}_1, \textit{output}_2)$. This behaviour can be
 implemented in Scala as follows:
 
@@ -361,28 +373,26 @@
 class SeqParser[I, T, S]
        (p: => Parser[I, T], 
         q: => Parser[I, S]) extends Parser[I, (T, S)] {
-  def parse(sb: I) = 
-    for ((output1, u1) <- p.parse(sb); 
+  def parse(in: I) = 
+    for ((output1, u1) <- p.parse(in); 
          (output2, u2) <- q.parse(u1)) 
             yield ((output1, output2), u2)
 }
 \end{lstlisting}
 \end{center}
 
-\noindent This parser takes again as input two parsers, \texttt{p}
-and \texttt{q}. It implements \texttt{parse} as follows: let
-first run the parser \texttt{p} on the input producing a set
-of pairs (\texttt{output1}, \texttt{u1}). The \texttt{u1}
-stands for the unprocessed parts left over by \texttt{p}. Let
-\texttt{q} run on these unprocessed parts producing again a
-set of pairs. The output of the sequence parser combinator is
-then a set containing pairs where the first components are
-again pairs, namely what the first parser could parse together
-with what the second parser could parse; the second component
-is the unprocessed part left over after running the second
-parser \texttt{q}. Therefore the input type of the sequence
-parser combinator is as usual \texttt{I}, but the output type
-is
+\noindent This parser takes again as arguments two parsers, \texttt{p}
+and \texttt{q}. It implements \texttt{parse} as follows: let first run
+the parser \texttt{p} on the input producing a set of pairs
+(\texttt{output1}, \texttt{u1}). The \texttt{u1} stands for the
+unprocessed parts left over by \texttt{p}. Let then \texttt{q} run on these
+unprocessed parts producing again a set of pairs. The output of the
+sequence parser combinator is then a set containing pairs where the
+first components are again pairs, namely what the first parser could
+parse together with what the second parser could parse; the second
+component is the unprocessed part left over after running the second
+parser \texttt{q}. Therefore the input type of the sequence parser
+combinator is as usual \texttt{I}, but the output type is
 
 \begin{center}
 \texttt{Set[((T, S), I)]}
@@ -396,13 +406,11 @@
 different things (for example first we recognise a number and then a
 string corresponding to an operator).
 
-
-We have not yet looked at this in detail, but Scala allows us to
-provide some shorthand notation for the sequence parser combinator. We
-can write for example \pcode{"a" ~ "b"}, which is the parser
-combinator that first recognises the character \texttt{a} from a
-string and then \texttt{b}. Let us look again at three examples of
-how this parser combinator processes strings:
+With the shorthand notation we shall introduce later for the sequence
+parser combinator, we can write for example \pcode{"a" ~ "b"}, which
+is the parser combinator that first recognises the character
+\texttt{a} from a string and then \texttt{b}. Let us look again at
+three examples of how this parser combinator processes some strings:
 
 \begin{center}
 \begin{tabular}{rcl}
@@ -414,18 +422,18 @@
 \end{center}
 
 \noindent In the first line we have a successful parse, because the
-string started with \texttt{ab}, which is the prefix we are looking
+string starts with \texttt{ab}, which is the prefix we are looking
 for. But since the parsing combinator is constructed as sequence of
 the two simple (atomic) parsers for \texttt{a} and \texttt{b}, the
 result is a nested pair of the form \texttt{((a, b), cde)}. It is
 \emph{not} a simple pair \texttt{(ab, cde)} as one might erroneously
-expects.  The parser returns the empty set in the other examples,
+expect.  The parser returns the empty set in the other examples,
 because they do not fit with what the parser is supposed to parse.
 
 
-A slightly more complicated parser is \pcode{("a" || "b") ~ "c"}
-which parses as first character either an \texttt{a} or \texttt{b}
-followed by a \texttt{c}. This parser produces the following outputs.
+A slightly more complicated parser is \pcode{("a" || "b") ~ "c"} which
+parses as first character either an \texttt{a} or \texttt{b}, followed
+by a \texttt{c}. This parser produces the following outputs.
 
 \begin{center}
 \begin{tabular}{rcl}
@@ -465,7 +473,7 @@
 \end{tabular}
 \end{center}
 
-\noindent Notice how the results nests deeper and deeper as pairs (the
+\noindent Notice how again the results nest deeper and deeper as pairs (the
 last \pcode{a} is in the unprocessed part). To consume everything of
 this string we can use the parser \pcode{(("a" ~ "a") ~ "a") ~
   "a"}. Then the output is as follows:
@@ -480,7 +488,7 @@
 
 \noindent This is an instance where the parser consumed
 completely the input, meaning the unprocessed part is just the
-empty string. So if we called \pcode{parse_all} instead of \pcode{parse}
+empty string. So if we called \pcode{parse_all}, instead of \pcode{parse},
 we would get back the result
 
 \[
@@ -489,29 +497,32 @@
 
 \noindent where the unprocessed (empty) parts have been stripped away
 from the pairs; everything where the second part was not empty has
-been thrown away as ultimately-unsuccessful-parses.
+been thrown away as well, because they represent
+ultimately-unsuccessful-parses.
 
 
-Note carefully that constructing a parser such \pcode{'a' ||
-('a' ~ 'b')} will result in a typing error. The first
-parser has as output type a single character (recall the type
-of \texttt{CharParser}), but the second parser produces a pair
-of characters as output. The alternative parser is however
-required to have both component parsers to have the same type.
-We will see later how we can build this parser without the
-typing error.
+Note carefully that constructing a parser such \pcode{"a" || ("a" ~
+  "b")} will result in a typing error. The intention is that we want
+to parse an \texttt{a}, or an \texttt{a} followed by a
+\texttt{b}. However, the first parser has as output type a single
+character (recall the type of \texttt{CharParser}), but the second
+parser produces a pair of characters as output. The alternative parser
+is required to have both component parsers to have the same type---we
+need to be able to build the union of two sets, which means in Scala
+they need to be of the same type.  We will see later how we can build
+this parser without the typing error.
 
-The next parser combinator does not actually combine smaller
-parsers, but applies a function to the result of a parser.
-It is implemented in Scala as follows
+The next parser combinator, called \emph{semantic action}, does not
+actually combine smaller parsers, but applies a function to the result
+of a parser.  It is implemented in Scala as follows
 
 \begin{center}
 \begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
 class FunParser[I, T, S]
          (p: => Parser[I, T], 
           f: T => S) extends Parser[I, S] {
-  def parse(sb: I) = 
-    for ((head, tail) <- p.parse(sb)) yield (f(head), tail)
+  def parse(in: I) = 
+    for ((head, tail) <- p.parse(in)) yield (f(head), tail)
 }
 \end{lstlisting}
 \end{center}
@@ -521,13 +532,46 @@
 with output type \texttt{T} as one argument as well as a
 function \texttt{f} with type \texttt{T => S}. The parser
 \texttt{p} produces sets of type \texttt{(T, I)}. The
-\texttt{FunParser} combinator then applies the function
+semantic action combinastor then applies the function
 \texttt{f} to all the parser outputs. Since this function is of
 type \texttt{T => S}, we obtain a parser with output type
 \texttt{S}. Again Scala lets us introduce some shorthand
 notation for this parser combinator. Therefore we will write
 \texttt{p ==> f} for it.
 
+What are semantic actions good for? Ultimately, they allow is to
+transform the parsed input into a datastructure we can use for further
+processing. A simple example would be to transform parsed characters
+into ASCII numbers. Suppose we define a function \texttt{f} (from
+characters to ints) and a \texttt{CharParser}.
+
+\begin{center}
+\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
+val f = (c: Char) => c.toInt
+val c = new CharParser('c')
+\end{lstlisting}
+\end{center}
+
+\noindent
+Then
+
+\begin{center}
+\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
+c.parse("cbd")
+(c ==> f).parse("cbd")
+\end{lstlisting}
+\end{center}
+
+\noindent
+the first line produces \texttt{Set(('c', "bd"))}, whereas the second
+produces \texttt{Set((99, "bd"))}.
+
+\subsubsection*{Shorthand notation for parser combinators}
+
+Before we proceed, let us just explain the shorthand notation for
+parser combinators. Like for regular expressions, the shorthand notation
+will make our life much easier when writing actual parsers.
+
 \subsubsection*{How to build parsers using parser combinators?}
 
 \subsubsection*{Implementing an Interpreter}