# HG changeset patch # User Christian Urban # Date 1540566850 -3600 # Node ID 0d309fafa9f004a5f62c55bae355c41175161f69 # Parent 863e502f6a5cb97ea8852f5f5924dbd8c68634df updated diff -r 863e502f6a5c -r 0d309fafa9f0 handouts/ho06.pdf Binary file handouts/ho06.pdf has changed diff -r 863e502f6a5c -r 0d309fafa9f0 handouts/ho06.tex --- a/handouts/ho06.tex Fri Oct 26 13:28:33 2018 +0100 +++ b/handouts/ho06.tex Fri Oct 26 16:14:10 2018 +0100 @@ -785,18 +785,72 @@ \subsubsection*{Implementing an Interpreter} -%\bigskip -%takes advantage of the full generality---have a look -%what it produces if we call it with the string \texttt{abc} -% -%\begin{center} -%\begin{tabular}{rcl} -%input string & & output\medskip\\ -%\texttt{\Grid{abc}} & $\rightarrow$ & $\left\{((\texttt{\Grid{a}}, \texttt{\Grid{b}}), \texttt{\Grid{c}})\right\}$\\ -%\texttt{\Grid{bbc}} & $\rightarrow$ & $\left\{((\texttt{\Grid{b}}, \texttt{\Grid{b}}), \texttt{\Grid{c}})\right\}$\\ -%\texttt{\Grid{aac}} & $\rightarrow$ & $\varnothing$ -%\end{tabular} -%\end{center} +The first step before implementing an interpreter for fullblown +language is to implement a simple calculator for arithmetic +expressions. Suppose our arithmetic expressions are given by the +grammar: + +\begin{plstx}[margin=3cm,one per line] +: \meta{E} ::= + | \meta{E} \cdot + \cdot \meta{E} + | \meta{E} \cdot - \cdot \meta{E} + | \meta{E} \cdot * \cdot \meta{E} + | ( \cdot \meta{E} \cdot ) + | Number \\ +\end{plstx} + +\noindent +Naturally we want to implement the grammar in such a way that we can +calculate what the result of \texttt{4*2+3} is---we are interested in +an \texttt{Int} rather than a string. This means every component +parser needs to have as output type \texttt{Int} and when we assemble +the intermediate results, strings like \texttt{"+"}, \texttt{"*"} and +so on, need to be translated into the appropriate Scala operation. +Being inspired by the parser for well-nested parentheses and ignoring +the fact that we want $*$ to take precedence, we might write something +like + +\begin{center} +\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none] +lazy val E: Parser[String, Int] = + (E ~ "+" ~ E ==> { case ((x, y), z) => x + z} | + E ~ "-" ~ E ==> { case ((x, y), z) => x - z} | + E ~ "*" ~ E ==> { case ((x, y), z) => x * z} | + "(" ~ E ~ ")" ==> { case ((x, y), z) => y} | + NumParserInt) +\end{lstlisting} +\end{center} + +\noindent +Consider again carfully how the semantic actions pick out the correct +arguments. In case of plus, we need \texttt{x} and \texttt{z}, because +they correspond to the results of the parser \texttt{E}. We can just +add \texttt{x + z} in order to obtain \texttt{Int} because the output +type of \texttt{E} is \texttt{Int}. Similarly with subtraction and +multiplication. In contrast in the fourth clause we need to return +\texttt{y}, because it is the result enclosed inside the parentheses. + +So far so good. The problem arises when we try to call \pcode{parse_all} with the +expression \texttt{"1+2+3"}. Lets try it + +\begin{center} +\begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none] +E.parse_all("1+2+3") +\end{lstlisting} +\end{center} + +\noindent +\ldots and we wait and wait and \ldots wait. What is the problem? Actually, +the parser just fell into an infinite loop. The reason is that the above +grammar is left-recursive and recall that parser combinator cannot deal with +such grammars. Luckily every left-recursive context-free grammar can be +transformed into a non-left-recursive grammars that still recognise the +same strings. This allows us to design the following grammar + + + + +