# HG changeset patch # User Christian Urban # Date 1540639023 -3600 # Node ID d40d7d7b85bc575639a2effca6a5c951d9d4c609 # Parent bb24d4e207b6b03a5afcda181a615ddaf7acf85d updated diff -r bb24d4e207b6 -r d40d7d7b85bc handouts/ho06.pdf Binary file handouts/ho06.pdf has changed diff -r bb24d4e207b6 -r d40d7d7b85bc handouts/ho06.tex --- a/handouts/ho06.tex Fri Oct 26 17:13:41 2018 +0100 +++ b/handouts/ho06.tex Sat Oct 27 12:17:03 2018 +0100 @@ -27,7 +27,7 @@ \begin{center} $\underbrace{\text{list of tokens}}_{\text{input}}$ -$\Rightarrow$ +$\quad\Rightarrow\quad$ $\underbrace{\text{set of (parsed part, unprocessed part)}}_{\text{output}}$ \end{center} @@ -82,20 +82,21 @@ something ``went wrong''\ldots or more precisely, nothing could be parsed. -Also important to note is that the type \texttt{T} for the processed -part is different from the input type \texttt{I} in the parse. In the -example above is just happens to be the same. The reason for the -difference is that in general we are interested in -transforming our input into something ``different''\ldots for example -into a tree; or if we implement the grammar for arithmetic -expressions, we might be interested in the actual integer number the -arithmetic expression, say \texttt{1 + 2 * 3}, stands for. In this way -we can use parser combinators to implement relatively easily a -calculator, for instance. +Also important to note is that the output type \texttt{T} for the +processed part can potentially be different from the input type +\texttt{I} in the parser. In the example above is just happens to be +the same. The reason for the difference is that in general we are +interested in transforming our input into something +``different''\ldots for example into a tree; or if we implement the +grammar for arithmetic expressions, we might be interested in the +actual integer number the arithmetic expression, say \texttt{1 + 2 * + 3}, stands for. In this way we can use parser combinators to +implement relatively easily a calculator, for instance (we shall do +this later on). -The main idea of parser combinators is that we can easily build parser -combinators out of smaller components following very closely the -structure of a grammar. In order to implement this in a +The main driving force behind parser combinators is that we can easily +build parser combinators out of smaller components following very +closely the structure of a grammar. In order to implement this in a functional/object-oriented programming language, like Scala, we need to specify an abstract class for parser combinators. In the abstract class we specify that \texttt{I} is the \emph{input type} of the @@ -878,7 +879,7 @@ \end{center} \noindent -Let us try out on some examples: +Let us try out some examples: \begin{center} \begin{tabular}{rcl} @@ -886,18 +887,24 @@ \texttt{\Grid{1+2+3}} & $\rightarrow$ & \texttt{Set(6)}\\ \texttt{\Grid{4*2+3}} & $\rightarrow$ & \texttt{Set(11)}\\ \texttt{\Grid{4*(2+3)}} & $\rightarrow$ & \texttt{Set(20)}\\ + \texttt{\Grid{(4)*((2+3))}} & $\rightarrow$ & \texttt{Set(20)}\\ \texttt{\Grid{4/2+3}} & $\rightarrow$ & \texttt{Set()}\\ \texttt{\Grid{1\VS +\VS 2\VS +\VS 3}} & $\rightarrow$ & \texttt{Set()}\\ \end{tabular} \end{center} \noindent -All examples should be quite self-explanatory: the last two do not -produce any result because our parser did not define what to do in -case of division (could be easily added) but also has no idea what to -do with whitescpaces. To deal with them is the task of the lexer. We -can deal with them inside the grammar, but that would render many -grammars becoming unintelligible. +Note that we call \pcode{parse_all}, not \pcode{parse}. The examples +should be quite self-explanatory. The last two example do not produce +any integer result because our parser does not define what to do in +case of division (could be easily added), but also has no idea what to +do with whitescpaces. To deal with them is the task of the lexer! Yes, +we can deal with them inside the grammar, but that would render many +grammars becoming unintelligible, including this one.\footnote{If you + think an easy solution is to extend the notion of what a number + should be, then think again---you still would have to deal with + cases like \texttt{\Grid{(\VS (\VS 2+3)\VS )}}. Jusat think you have + a grammar for a full-blown language where there are numerous such cases.} \end{document} diff -r bb24d4e207b6 -r d40d7d7b85bc progs/comb1.scala --- a/progs/comb1.scala Fri Oct 26 17:13:41 2018 +0100 +++ b/progs/comb1.scala Sat Oct 27 12:17:03 2018 +0100 @@ -102,6 +102,7 @@ println(E.parse_all("4*2+3")) println(E.parse_all("4*(2+3)")) +println(E.parse_all("(4)*((2+3))")) println(E.parse_all("4/2+3")) println(E.parse("1 + 2 * 3")) println(E.parse_all("(1+2)+3")) @@ -119,18 +120,6 @@ //println(EL.parse_all("1+2+3")) -// a repetition parser - -def RepParser[I <% Seq[_], T](p: => Parser[I, T]): Parser[I, List[T]] = { - p ==> { case x => x :: Nil } | - p ~ RepParser(p) ==> { case (x, y) => x :: y } -} - - -// a repetition parser -lazy val R : Parser[String, List[Char]] = RepParser('a') -println(R.parse_all("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa")) - // non-ambiguous vs ambiguous grammars