--- a/handouts/ho05.tex Sat Oct 22 13:11:33 2016 +0100
+++ b/handouts/ho05.tex Sat Oct 22 15:18:11 2016 +0100
@@ -1,7 +1,7 @@
\documentclass{article}
\usepackage{../style}
\usepackage{../langs}
-
+\usepackage{../grammar}
\begin{document}
@@ -48,25 +48,27 @@
Context-free languages are usually specified by grammars. For
example a grammar for well-parenthesised expressions is
-\begin{center}
-$P \;\;\rightarrow\;\; ( \cdot P \cdot ) \cdot P \;|\; \epsilon$
-\end{center}
-
+\begin{plstx}[margin=3cm]
+: \meta{P} ::= ( \cdot \meta{P} \cdot ) \cdot \meta{P}
+ | \epsilon\\
+\end{plstx}
+
\noindent
or a grammar for recognising strings consisting of ones is
-\begin{center}
-$O \;\;\rightarrow\;\; 1 \cdot O \;|\; 1$
-\end{center}
+\begin{plstx}[margin=3cm]
+: \meta{O} ::= 1 \cdot \meta{O}
+ | 1\\
+\end{plstx}
In general grammars consist of finitely many rules built up
from \emph{terminal symbols} (usually lower-case letters) and
-\emph{non-terminal symbols} (upper-case letters). Rules have
+\emph{non-terminal symbols} (upper-case letters inside \meta{\mbox{}}). Rules have
the shape
-\begin{center}
-$NT \;\;\rightarrow\;\; \textit{rhs}$
-\end{center}
+\begin{plstx}[margin=3cm]
+: \meta{NT} ::= rhs\\
+\end{plstx}
\noindent where on the left-hand side is a single non-terminal
and on the right a string consisting of both terminals and
@@ -77,29 +79,29 @@
convention to use $|$ as a shorthand notation for several
rules. For example
-\begin{center}
-$NT \;\;\rightarrow\;\; \textit{rhs}_1 \;|\; \textit{rhs}_2$
-\end{center}
+\begin{plstx}[margin=3cm]
+: \meta{NT} ::= rhs_1
+ | rhs_2\\
+\end{plstx}
-\noindent means that the non-terminal $NT$ can be replaced by
+\noindent means that the non-terminal \meta{NT} can be replaced by
either $\textit{rhs}_1$ or $\textit{rhs}_2$. If there are more
than one non-terminal on the left-hand side of the rules, then
we need to indicate what is the \emph{starting} symbol of the
grammar. For example the grammar for arithmetic expressions
can be given as follows
-\begin{center}
-\begin{tabular}{lcl@{\hspace{2cm}}l}
-$E$ & $\rightarrow$ & $N$ & (1)\\
-$E$ & $\rightarrow$ & $E \cdot + \cdot E$ & (2)\\
-$E$ & $\rightarrow$ & $E \cdot - \cdot E$ & (3)\\
-$E$ & $\rightarrow$ & $E \cdot * \cdot E$ & (4)\\
-$E$ & $\rightarrow$ & $( \cdot E \cdot )$ & (5)\\
-$N$ & $\rightarrow$ & $N \cdot N \;|\; 0 \;|\; 1 \;|\: \ldots \;|\; 9$ & (6\ldots)
-\end{tabular}
-\end{center}
+\begin{plstx}[margin=3cm,one per line]
+\mbox{\rm (1)}: \meta{E} ::= \meta{N}\\
+\mbox{\rm (2)}: \meta{E} ::= \meta{E} \cdot + \cdot \meta{E}\\
+\mbox{\rm (3)}: \meta{E} ::= \meta{E} \cdot - \cdot \meta{E}\\
+\mbox{\rm (4)}: \meta{E} ::= \meta{E} \cdot * \cdot \meta{E}\\
+\mbox{\rm (5)}: \meta{E} ::= ( \cdot \meta{E} \cdot )\\
+\mbox{\rm (6\ldots)}: \meta{N} ::= \meta{N} \cdot \meta{N}
+ \mid 0 \mid 1 \mid \ldots \mid 9\\
+\end{plstx}
-\noindent where $E$ is the starting symbol. A
+\noindent where \meta{E} is the starting symbol. A
\emph{derivation} for a grammar starts with the starting
symbol of the grammar and in each step replaces one
non-terminal by a right-hand side of a rule. A derivation ends
@@ -109,12 +111,12 @@
\begin{center}
\begin{tabular}{lll@{\hspace{2cm}}l}
-$E$ & $\rightarrow$ & $E+E$ & by (2)\\
- & $\rightarrow$ & $(E)+E$ & by (5)\\
- & $\rightarrow$ & $(E+E)+E$ & by (2)\\
- & $\rightarrow$ & $(E+E)+N$ & by (1)\\
- & $\rightarrow$ & $(E+E)+3$ & by (6\dots)\\
- & $\rightarrow$ & $(N+E)+3$ & by (1)\\
+\meta{E} & $\rightarrow$ & $\meta{E}+\meta{E}$ & by (2)\\
+ & $\rightarrow$ & $(\meta{E})+\meta{E}$ & by (5)\\
+ & $\rightarrow$ & $(\meta{E}+\meta{E})+\meta{E}$ & by (2)\\
+ & $\rightarrow$ & $(\meta{E}+\meta{E})+\meta{N}$ & by (1)\\
+ & $\rightarrow$ & $(\meta{E}+\meta{E})+3$ & by (6\dots)\\
+ & $\rightarrow$ & $(\meta{N}+\meta{E})+3$ & by (1)\\
& $\rightarrow^+$ & $(1+2)+3$ & by (1, 6\ldots)\\
\end{tabular}
\end{center}
@@ -239,6 +241,9 @@
undecidable. But in simple instance (the ones we deal in this
module) one can usually see when a grammar is ambiguous.
+
+\subsection*{Parser Combinators}
+
Let us now turn to the problem of generating a parse-tree for
a grammar and string. In what follows we explain \emph{parser
combinators}, because they are easy to implement and closely