--- a/handouts/ho06.tex Wed Oct 24 20:37:37 2018 +0100
+++ b/handouts/ho06.tex Thu Oct 25 00:50:58 2018 +0100
@@ -2,7 +2,7 @@
\documentclass{article}
\usepackage{../style}
\usepackage{../langs}
-
+\usepackage{../grammar}
\begin{document}
@@ -539,11 +539,11 @@
notation for this parser combinator. Therefore we will write
\texttt{p ==> f} for it.
-What are semantic actions good for? Ultimately, they allow is to
-transform the parsed input into a datastructure we can use for further
+What are semantic actions good for? Well, they allow is to transform
+the parsed input into a datastructure we can use for further
processing. A simple example would be to transform parsed characters
into ASCII numbers. Suppose we define a function \texttt{f} (from
-characters to ints) and a \texttt{CharParser}.
+characters to ints) and use a \texttt{CharParser} for the character \texttt{c}.
\begin{center}
\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
@@ -553,7 +553,7 @@
\end{center}
\noindent
-Then
+Then we can run the following parsers on the input \texttt{cbd}:
\begin{center}
\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
@@ -563,8 +563,67 @@
\end{center}
\noindent
-the first line produces \texttt{Set(('c', "bd"))}, whereas the second
-produces \texttt{Set((99, "bd"))}.
+The first line we obtain the result \texttt{Set(('c', "bd"))}, whereas the second
+produces \texttt{Set((99, "bd"))}---the character has been transformed into
+an ASCII number.
+
+A slightly less contrived example is about parsing numbers (recall
+\texttt{NumParser} above). However, we want to do this here for strings.
+For this assume we have the following \texttt{RegexParser}.
+
+\begin{center}
+ \begin{lstlisting}[language=Scala,xleftmargin=0mm,
+ basicstyle=\small\ttfamily, numbers=none]
+import scala.util.matching.Regex
+
+case class RegexParser(reg: Regex) extends Parser[String, String] {
+ def parse(in: String) = reg.findPrefixMatchOf(in) match {
+ case None => Set()
+ case Some(m) => Set((m.matched, m.after.toString))
+ }
+}
+\end{lstlisting}
+\end{center}
+
+\noindent
+This parser takes a regex as argument and splits up a string into a
+prefix and the rest according to this regex
+(\texttt{reg.findPrefixMatchOf} generates a match---in the successful
+case---and the corresponding strings can be extracted with
+\texttt{matched} and \texttt{after}). We can now define a
+\texttt{NumParser} for strings as follows:
+
+\begin{center}
+\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
+val NumParser = RegexParser("[0-9]+".r)
+\end{lstlisting}
+\end{center}
+
+\noindent
+This parser will recognise a number at the beginning of a string, for
+example
+
+\begin{center}
+\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
+NumParser.parse("123abc")
+\end{lstlisting}
+\end{center}
+
+\noindent
+produces \texttt{Set((123,abc))}. The problem is that \texttt{123} is
+still a string. We need to convert it into the corresponding
+\texttt{Int}. We can do this as follows
+
+\begin{center}
+\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
+(NumParser ==> (s => s.toInt)).parse("123abc")
+\end{lstlisting}
+\end{center}
+
+\noindent
+The semantic action in form of a function converts a string into an
+\texttt{Int}. Let us come back to semantic actions when we are going
+to implement a simple calculator.
\subsubsection*{Shorthand notation for parser combinators}
@@ -574,6 +633,47 @@
\subsubsection*{How to build parsers using parser combinators?}
+The beauty of parser combinators is the ease with which they can be
+implemented and how easy it is to translate context-free grammars into
+code (though the grammars need to be non-left-recursive). To
+demonstrate this recall our context-free grammar for palindromes:
+
+\begin{plstx}[margin=3cm]
+: \meta{P} ::= a\cdot \meta{P}\cdot a | b\cdot \meta{P}\cdot b | a | b | \epsilon\\
+\end{plstx}
+
+\noindent
+Given the parser combinators for alternatives and sequeneces, this grammar should be
+straightforward to implement. The first idea would be
+
+\begin{center}
+\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
+lazy val Pal : Parser[String, String] =
+ (("a" ~ Pal ~ "a") | ("b" ~ Pal ~ "b") | "a" | "b" | "")
+\end{lstlisting}
+\end{center}
+
+\noindent
+Unfortunately, this does not work as it produces a typing error. The
+reason is that the parsers \texttt{"a"}, \texttt{"b"} and \texttt{""}
+all produce strings and therefore can be put into an alternative
+\texttt{...| "a" | "b" | ""}. But both \pcode{"a" ~ Pal ~ "a"} and
+\pcode{"b" ~ Pal ~ "b"} produce pairs of the form
+$(((\_, \_), \_), \_)$---that is how the sequence parser combinator
+nests results when \pcode{\~} is used between two components. The
+solution is to use a semantic action that ``flattens'' these pairs and
+appends the corresponding strings, like
+
+\begin{center}
+\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none]
+lazy val Pal : Parser[String, String] =
+ (("a" ~ Pal ~ "a") ==> { case ((x, y), z) => x + y + z } |
+ ("b" ~ Pal ~ "b") ==> { case ((x, y), z) => x + y + z } |
+ "a" | "b" | "")
+\end{lstlisting}
+\end{center}
+
+
\subsubsection*{Implementing an Interpreter}
%\bigskip