# HG changeset patch # User Christian Urban # Date 1604511292 0 # Node ID 85267be9a5ed85115f38508efdd3db9d324bb500 # Parent aaf0bd0a211d9f6d57d307fe749dafeb701f4f50 updated diff -r aaf0bd0a211d -r 85267be9a5ed handouts/ho06.pdf Binary file handouts/ho06.pdf has changed diff -r aaf0bd0a211d -r 85267be9a5ed handouts/ho06.tex --- a/handouts/ho06.tex Fri Oct 30 01:45:03 2020 +0000 +++ b/handouts/ho06.tex Wed Nov 04 17:34:52 2020 +0000 @@ -4,6 +4,7 @@ \usepackage{../style} \usepackage{../langs} \usepackage{../grammar} +\usepackage{../graphics} \begin{document} @@ -244,8 +245,8 @@ \end{center} \noindent Later on we will use Scala mechanism for introducing some -more readable shorthand notation for this, like \texttt{"a" | - "b"}. Let us look in detail at what this parser combinator produces +more readable shorthand notation for this, like \texttt{p"a" || + p"b"}. Let us look in detail at what this parser combinator produces with some sample strings. \begin{center} @@ -336,7 +337,7 @@ produce the empty set. With the shorthand notation we shall introduce later for the sequence -parser combinator, we can write for example \pcode{"a" ~ "b"}, which +parser combinator, we can write for example \pcode{p"a" ~ p"b"}, which is the parser combinator that first recognises the character \texttt{a} from a string and then \texttt{b}. Let us look again at some examples of how this parser combinator processes some strings: @@ -360,7 +361,7 @@ because they do not fit with what the parser is supposed to parse. -A slightly more complicated parser is \pcode{("a" | "b") ~ "c"} which +A slightly more complicated parser is \pcode{(p"a" || p"b") ~ p"c"} which parses as first character either an \texttt{a} or \texttt{b}, followed by a \texttt{c}. This parser produces the following outputs. @@ -374,7 +375,7 @@ \end{center} \noindent -Now consider the parser \pcode{("a" ~ "b") ~ "c"} which parses +Now consider the parser \pcode{(p"a" ~ p"b") ~ p"c"} which parses \texttt{a}, \texttt{b}, \texttt{c} in sequence. This parser produces the following outputs. @@ -392,7 +393,7 @@ ``missing'' in the sequence we are looking for. The first succeeds but notice how the results nest with sequences: the parsed part is a nested pair of the form \pcode{((a, b), c)}. If we nest the sequence -parser differently, say \pcode{"a" ~ ("b" ~ "c")}, then also +parser differently, say \pcode{p"a" ~ (p"b" ~ p"c")}, then also our output pairs nest differently \begin{center} @@ -404,7 +405,7 @@ \noindent Two more examples: first consider the parser -\pcode{("a" ~ "a") ~ "a"} and the input \pcode{aaaa}: +\pcode{(p"a" ~ p"a") ~ p"a"} and the input \pcode{aaaa}: \begin{center} \begin{tabular}{rcl} @@ -416,8 +417,8 @@ \noindent Notice again how the results nest deeper and deeper as pairs (the last \pcode{a} is in the unprocessed part). To consume everything of -this string we can use the parser \pcode{(("a" ~ "a") ~ "a") ~ - "a"}. Then the output is as follows: +this string we can use the parser \pcode{((p"a" ~ p"a") ~ p"a") ~ + p"a"}. Then the output is as follows: \begin{center} \begin{tabular}{rcl} @@ -444,8 +445,8 @@ nesting of the component parsers. -Consider also carefully that constructing a parser such \pcode{"a" | - ("a" ~ "b")} will result in a typing error. The intention with this +Consider also carefully that constructing a parser such \pcode{p"a" || + (p"a" ~ p"b")} will result in a typing error. The intention with this parser is that we want to parse either an \texttt{a}, or an \texttt{a} followed by a \texttt{b}. However, the first parser has as output type a single character (recall the type of \texttt{CharParser}), but the @@ -462,7 +463,7 @@ \begin{center} \begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none] -class FunParser[I, T, S] +class MapParser[I, T, S] (p: => Parser[I, T], f: T => S) extends Parser[I, S] { def parse(in: I) = @@ -480,7 +481,7 @@ parser outputs. Since this function is of type \texttt{T => S}, we obtain a parser with output type \texttt{S}. Again Scala lets us introduce some shorthand notation for this parser -combinator. Therefore we will write short \texttt{p ==> f} for it. +combinator. Therefore we will write short \texttt{p.map(f)} for it. What are semantic actions good for? Well, they allow you to transform the parsed input into datastructures you can use for further @@ -503,7 +504,7 @@ \begin{center} \begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none] c.parse("cbd") -(c ==> f).parse("cbd") +c.map(f).parse("cbd") \end{lstlisting} \end{center} @@ -565,7 +566,7 @@ \begin{center} \begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none] -(NumParser ==> (s => s.toInt)).parse("123abc") +NumParser.map{s => s.toInt}.parse("123abc") \end{lstlisting} \end{center} @@ -585,14 +586,14 @@ \begin{center} \begin{tabular}{ll} - \pcode{p | q} & alternative parser\\ + \pcode{p || q} & alternative parser\\ \pcode{p ~ q} & sequence parser\\ - \pcode{p ==> f} & semantic action parser + \pcode{p.map(f)} & semantic action parser \end{tabular} \end{center} \noindent -as well as to use plain strings for specifying simple string parsers. +as well as to use string interpolations for specifying simple string parsers. The idea is that this shorthand notation allows us to easily translate context-free grammars into code. For example recall our context-free @@ -606,7 +607,7 @@ Each alternative in this grammar translates into an alternative parser combinator. The $\cdot$ can be translated to a sequence parser combinator. The parsers for $a$, $b$ and $\epsilon$ can be simply -written as \texttt{"a"}, \texttt{"b"} and \texttt{""}. +written as \texttt{p"a"}, \texttt{p"b"} and \texttt{p""}. \subsubsection*{How to build parsers using parser combinators?} @@ -620,16 +621,16 @@ \begin{center} \begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none] lazy val Pal : Parser[String, String] = - (("a" ~ Pal ~ "a") | ("b" ~ Pal ~ "b") | "a" | "b" | "") + ((p"a" ~ Pal ~ p"a") || (p"b" ~ Pal ~ p"b") || p"a" || p"b" || p"") \end{lstlisting} \end{center} \noindent Unfortunately, this does not quite work yet as it produces a typing -error. The reason is that the parsers \texttt{"a"}, \texttt{"b"} and -\texttt{""} all produce strings as output type and therefore can be -put into an alternative \texttt{...| "a" | "b" | ""}. But both -sequence parsers \pcode{"a" ~ Pal ~ "a"} and \pcode{"b" ~ Pal ~ "b"} +error. The reason is that the parsers \texttt{p"a"}, \texttt{p"b"} and +\texttt{p""} all produce strings as output type and therefore can be +put into an alternative \texttt{...|| p"a" || p"b" || p""}. But both +sequence parsers \pcode{p"a" ~ Pal ~ p"a"} and \pcode{p"b" ~ Pal ~ p"b"} produce pairs of the form \begin{center} @@ -644,15 +645,15 @@ \begin{center} \begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none] lazy val Pal : Parser[String, String] = - (("a" ~ Pal ~ "a") ==> { case ((x, y), z) => x + y + z } | - ("b" ~ Pal ~ "b") ==> { case ((x, y), z) => x + y + z } | - "a" | "b" | "") + ((p"a" ~ Pal ~ p"a").map{ case ((x, y), z) => x + y + z } || + (p"b" ~ Pal ~ p"b").map{ case ((x, y), z) => x + y + z } || + p"a" || p"b" || p"") \end{lstlisting} \end{center} \noindent How does this work? Well, recall again what the pairs look like for -the parser \pcode{"a" ~ Pal ~ "a"}. The pattern in the semantic +the parser \pcode{p"a" ~ Pal ~ p"a"}. The pattern in the semantic action matches the nested pairs (the \texttt{x} with the \texttt{a}-part and so on). Unfortunately when we have such nested pairs, Scala requires us to define the function using the @@ -670,7 +671,7 @@ The action we implement above is to concatenate all three strings, which means after the semantic action is applied the output type of the parser is \texttt{String}, which means it fits with the alternative parsers -\texttt{...| "a" | "b" | ""}. +\texttt{...|| p"a" || p"b" || p""}. If we run the parser above with \pcode{Pal.parse_all("abaaaba")} we obtain as result the \pcode{Set(abaaaba)}, which indicates that the string is a palindrome @@ -772,7 +773,7 @@ \begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, xleftmargin=0mm, numbers=none] lazy val P : Parser[String, String] = - "(" ~ P ~ ")" ~ P ==> { case (((_,x),_),y) => "{" + x + "}" + y } | "" + (p"(" ~ P ~ p")" ~ P).map{ case (((_,x),_),y) => "{" + x + "}" + y } || p"" \end{lstlisting} \end{center} @@ -815,10 +816,10 @@ \begin{center} \begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none] lazy val E: Parser[String, Int] = - (E ~ "+" ~ E ==> { case ((x, y), z) => x + z} | - E ~ "-" ~ E ==> { case ((x, y), z) => x - z} | - E ~ "*" ~ E ==> { case ((x, y), z) => x * z} | - "(" ~ E ~ ")" ==> { case ((x, y), z) => y} | + ((E ~ p"+" ~ E).map{ case ((x, y), z) => x + z} || + (E ~ p"-" ~ E).map{ case ((x, y), z) => x - z} || + (E ~ p"*" ~ E).map{ case ((x, y), z) => x * z} || + (p"(" ~ E ~ p")").map{ case ((x, y), z) => y} || NumParserInt) \end{lstlisting} \end{center} @@ -870,12 +871,12 @@ \begin{center} \begin{lstlisting}[language=Scala,basicstyle=\small\ttfamily, numbers=none] lazy val E: Parser[String, Int] = - (T ~ "+" ~ E) ==> { case ((x, y), z) => x + z } | - (T ~ "-" ~ E) ==> { case ((x, y), z) => x - z } | T + (T ~ p"+" ~ E).map{ case ((x, y), z) => x + z } || + (T ~ p"-" ~ E).map{ case ((x, y), z) => x - z } || T lazy val T: Parser[String, Int] = - (F ~ "*" ~ T) ==> { case ((x, y), z) => x * z } | F + (F ~ p"*" ~ T).map{ case ((x, y), z) => x * z } || F lazy val F: Parser[String, Int] = - ("(" ~ E ~ ")") ==> { case ((x, y), z) => y } | NumParserInt + (p"(" ~ E ~ p")").map{ case ((x, y), z) => y } || NumParserInt \end{lstlisting} \end{center} diff -r aaf0bd0a211d -r 85267be9a5ed progs/parser-combinators/comb1.sc --- a/progs/parser-combinators/comb1.sc Fri Oct 30 01:45:03 2020 +0000 +++ b/progs/parser-combinators/comb1.sc Wed Nov 04 17:34:52 2020 +0000 @@ -22,6 +22,12 @@ // parser combinators +// alternative parser +class AltParser[I : IsSeq, T](p: => Parser[I, T], + q: => Parser[I, T]) extends Parser[I, T] { + def parse(in: I) = p.parse(in) ++ q.parse(in) +} + // sequence parser class SeqParser[I : IsSeq, T, S](p: => Parser[I, T], q: => Parser[I, S]) extends Parser[I, (T, S)] { @@ -30,12 +36,6 @@ (hd2, tl2) <- q.parse(tl1)) yield ((hd1, hd2), tl2) } -// alternative parser -class AltParser[I : IsSeq, T](p: => Parser[I, T], - q: => Parser[I, T]) extends Parser[I, T] { - def parse(in: I) = p.parse(in) ++ q.parse(in) -} - // map parser class MapParser[I : IsSeq, T, S](p: => Parser[I, T], f: T => S) extends Parser[I, S] { @@ -50,6 +50,7 @@ if (in != "" && in.head == c) Set((c, in.tail)) else Set() } + // an atomic parser for parsing strings according to a regex import scala.util.matching.Regex @@ -65,6 +66,7 @@ def StrParser(s: String) = RegexParser(Regex.quote(s).r) + // NumParserInt transforms a "string integer" into a propper Int // (needs "new" because MapParser is not a case class) @@ -80,6 +82,7 @@ def p(args: Any*) = StrParser(sc.s(args:_*)) } + // more convenient syntax for parser combinators implicit def ParserOps[I : IsSeq, T](p: Parser[I, T]) = new { def ||(q : => Parser[I, T]) = new AltParser[I, T](p, q) @@ -95,7 +98,7 @@ // with this NumParserInt can now be written more conveniently // as: -val NumParserInt2 = NumParser.map(s => s.toInt) +val NumParserInt2 = NumParser.map(_.toInt) // A parser for palindromes (just returns them as string) @@ -111,9 +114,15 @@ println("Palindrome: " + Pal.parse_all("abaaaba")) -// A parser for wellnested parentheses (transforms '(' -> '{' , ')' -> '}' ) -lazy val P : Parser[String, String] = - (p"(" ~ P ~ p")" ~ P).map{ case (((_, x), _), y) => "{" + x + "}" + y } || p"" +// A parser for wellnested parentheses +// +// P ::= ( P ) P | epsilon +// +// (transforms '(' -> '{' , ')' -> '}' ) +lazy val P : Parser[String, String] = { + (p"(" ~ P ~ p")" ~ P).map{ case (((_, x), _), y) => "{" + x + "}" + y } || + p"" +} println(P.parse_all("(((()()))())")) println(P.parse_all("(((()()))()))")) @@ -122,22 +131,13 @@ // A parser for arithmetic expressions (Terms and Factors) -lazy val E: Parser[String, Int] = +lazy val E: Parser[String, Int] = { (T ~ p"+" ~ E).map{ case ((x, _), z) => x + z } || - (T ~ p"-" ~ E).map{ case ((x, _), z) => x - z } || T -lazy val T: Parser[String, Int] = - (F ~ p"*" ~ T).map{ case ((x, _), z) => x * z } || F -lazy val F: Parser[String, Int] = - (p"(" ~ E ~ p")").map{ case ((_, y), _) => y } || NumParserInt - -/* same parser but producing a string -lazy val E: Parser[String, String] = - (T ~ "+" ~ E).map{ case x ~ y ~ z => "(" + x + ")+(" + z + ")"} || T -lazy val T: Parser[String, String] = - (F ~ "*" ~ T).map{ case x ~ y ~ z => "(" + x + ")*("+ z + ")"} || F -lazy val F: Parser[String, String] = - ("(" ~ E ~ ")").map{ case x ~ y ~ z => y } || NumParser -*/ + (T ~ p"-" ~ E).map{ case ((x, _), z) => x - z } || T } +lazy val T: Parser[String, Int] = { + (F ~ p"*" ~ T).map{ case ((x, _), z) => x * z } || F } +lazy val F: Parser[String, Int] = { + (p"(" ~ E ~ p")").map{ case ((_, y), _) => y } || NumParserInt } println(E.parse_all("1+3+4")) println(E.parse("1+3+4")) @@ -147,7 +147,7 @@ println(E.parse_all("4/2+3")) println(E.parse("1 + 2 * 3")) println(E.parse_all("(1+2)+3")) -println(E.parse_all("1+2+3")) +println(E.parse_all("1+2+3")) // with parser combinators (and other parsing algorithms) @@ -194,7 +194,7 @@ // A variant which counts how many 1s are parsed lazy val UCount : Parser[String, Int] = - (p"1" ~ UCount).map[Int]{ case (_, y) => y + 1 } || p"".map[Int]{ _ => 0 } + (p"1" ~ UCount).map{ case (_, y) => y + 1 } || p"".map{ _ => 0 } println(UCount.parse("11111")) println(UCount.parse_all("11111")) diff -r aaf0bd0a211d -r 85267be9a5ed slides/slides05.pdf Binary file slides/slides05.pdf has changed diff -r aaf0bd0a211d -r 85267be9a5ed slides/slides05.tex --- a/slides/slides05.tex Fri Oct 30 01:45:03 2020 +0000 +++ b/slides/slides05.tex Wed Nov 04 17:34:52 2020 +0000 @@ -572,6 +572,25 @@ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\begin{frame}[c,fragile] + \begin{mybox3}{}\it + "The C++ grammar is ambiguous, context-dependent and potentially + requires infinite lookahead to resolve some ambiguities." + \end{mybox3}\bigskip + + + \hfill from the \href{http://www.computing.surrey.ac.uk/research/dsrg/fog/FogThesis.pdf}{PhD thesis} by Willink (2001) + + \small + \begin{center} + \begin{lstlisting}[language={},numbers=none] + int(x), y, *const z; + int(x), y, new int; + \end{lstlisting} + \end{center} +\end{frame} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{frame}[c] @@ -588,435 +607,11 @@ \begin{center} \bl{$\meta{S} \rightarrow\ldots\rightarrow^? ababaa$} -\end{center}\pause - -\begin{center} - \tt Time flies like an arrow;\\ - fruit flies like bananas. - \end{center} +\end{center} \end{frame} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\begin{frame}[c] -\frametitle{Parser Combinators} - -One of the simplest ways to implement a parser, see -{\small\url{https://vimeo.com/142341803}}\bigskip - -Parser combinators: \bigskip - -\begin{minipage}{1.1\textwidth} -\begin{center} -\mbox{}\hspace{-12mm}\mbox{}$\underbrace{\text{list of tokens}}_{\text{input}}$ \bl{$\Rightarrow$} -$\underbrace{\text{set of (parsed input, unparsed input)}}_{\text{output}}$ -\end{center} -\end{minipage}\bigskip - -\begin{itemize} -\item atomic parsers -\item sequencing -\item alternative -\item semantic action -\end{itemize} - -\end{frame} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\begin{frame}[c] - -Atomic parsers, for example, number tokens - -\begin{center} -\bl{$\texttt{Num(123)}::rest \;\Rightarrow\; \{(\texttt{Num(123)}, rest)\}$} -\end{center}\bigskip - -\begin{itemize} -\item you consume one or more token from the\\ - input (stream) -\item also works for characters and strings -\end{itemize} -\end{frame} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\begin{frame}[c] - -Alternative parser (code \bl{$p\;||\;q$})\bigskip - -\begin{itemize} -\item apply \bl{$p$} and also \bl{$q$}; then combine - the outputs -\end{itemize} - -\begin{center} -\large \bl{$p(\text{input}) \cup q(\text{input})$} -\end{center} - -\end{frame} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\begin{frame}[c] - -Sequence parser (code \bl{$p\sim q$})\bigskip - -\begin{itemize} -\item apply first \bl{$p$} producing a set of pairs -\item then apply \bl{$q$} to the unparsed part -\item then combine the results:\medskip -\begin{center} -((output$_1$, output$_2$), unparsed part) -\end{center} -\end{itemize} - -\begin{center} -\begin{tabular}{l} -\large \bl{$\{((o_1, o_2), u_2) \;|\;$}\\[2mm] -\large\mbox{}\hspace{15mm} \bl{$(o_1, u_1) \in p(\text{input}) \wedge$}\\[2mm] -\large\mbox{}\hspace{15mm} \bl{$(o_2, u_2) \in q(u_1)\}$} -\end{tabular} -\end{center} - - -\end{frame} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\begin{frame}[c] - -Function parser (code \bl{$p \Rightarrow f\;$})\bigskip - -\begin{itemize} -\item apply \bl{$p$} producing a set of pairs -\item then apply the function \bl{$f$} to each first component -\end{itemize} - -\begin{center} -\begin{tabular}{l} -\large \bl{$\{(f(o_1), u_1) \;|\; (o_1, u_1) \in p(\text{input})\}$} -\end{tabular} -\end{center}\bigskip\bigskip\pause - -\bl{$f$} is the semantic action (``what to do with the parsed input'') - -\end{frame} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\mode{ -\begin{frame}[c] -\frametitle{\begin{tabular}{c}Semantic Actions\end{tabular}} - -Addition - -\begin{center} -\bl{$\meta{T} \sim + \sim \meta{E} \Rightarrow \underbrace{f\,((x,y), z) \Rightarrow x + z}_{\text{semantic action}}$} -\end{center}\pause - -Multiplication - -\begin{center} -\bl{$\meta{F} \sim * \sim \meta{T} \Rightarrow f\,((x,y), z) \Rightarrow x * z$} -\end{center}\pause - -Parenthesis - -\begin{center} -\bl{$\text{(} \sim \meta{E} \sim \text{)} \Rightarrow f\,((x,y), z) \Rightarrow y$} -\end{center} - -\end{frame}} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\begin{frame}[c] -\frametitle{Types of Parsers} - -\begin{itemize} -\item {\bf Sequencing}: if \bl{$p$} returns results of type \bl{$T$}, and \bl{$q$} results of type \bl{$S$}, -then \bl{$p \sim q$} returns results of type - -\begin{center} -\bl{$T \times S$} -\end{center}\pause - -\item {\bf Alternative}: if \bl{$p$} returns results of type \bl{$T$} then \bl{$q$} \alert{must} also have results of type \bl{$T$}, -and \bl{$p \;||\; q$} returns results of type - -\begin{center} -\bl{$T$} -\end{center}\pause - -\item {\bf Semantic Action}: if \bl{$p$} returns results of type \bl{$T$} and \bl{$f$} is a function from -\bl{$T$} to \bl{$S$}, then -\bl{$p \Rightarrow f$} returns results of type - -\begin{center} -\bl{$S$} -\end{center} - -\end{itemize} - -\end{frame} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\begin{frame}[c] -\frametitle{Input Types of Parsers} - -\begin{itemize} -\item input: \alert{token list} -\item output: set of (output\_type, \alert{token list}) -\end{itemize}\bigskip\pause - -actually it can be any input type as long as it is a kind of -sequence (for example a string) - -\end{frame} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\begin{frame}[c] -\frametitle{Scannerless Parsers} - -\begin{itemize} -\item input: \alert{string} -\item output: set of (output\_type, \alert{string}) -\end{itemize}\bigskip\bigskip - -but using lexers is better because whitespaces or comments can be -filtered out; then input is a sequence of tokens - -\end{frame} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\begin{frame}[c] -\frametitle{Successful Parses} - -\begin{itemize} -\item input: string -\item output: \alert{set of} (output\_type, string) -\end{itemize}\bigskip - -a parse is successful whenever the input has been fully -``consumed'' (that is the second component is empty) - -\end{frame} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\begin{frame}[c] -\frametitle{Abstract Parser Class} - -\small -\lstinputlisting[language=Scala,xleftmargin=1mm] - {../progs/app7.scala} -\end{frame} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\begin{frame}[c] - -\small -\fontsize{10}{12}\selectfont -\lstinputlisting[language=Scala,xleftmargin=1mm] - {../progs/app8.scala} -\end{frame} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\begin{frame}[c] -\frametitle{Two Grammars} - -Which languages are recognised by the following two grammars? - -\begin{center} -\bl{\begin{tabular}{lcl} -$\meta{S}$ & $\rightarrow$ & $1 \cdot \meta{S} \cdot \meta{S}$\\ - & $|$ & $\epsilon$ -\end{tabular}} -\end{center}\bigskip - -\begin{center} -\bl{\begin{tabular}{lcl} -$\meta{U}$ & $\rightarrow$ & $1 \cdot \meta{U}$\\ - & $|$ & $\epsilon$ -\end{tabular}} -\end{center} - -\end{frame} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\begin{frame}[t] -\frametitle{Ambiguous Grammars} - -\begin{center} -\begin{tikzpicture} -\begin{axis}[xlabel={\pcode{1}s},ylabel={time in secs}, - enlargelimits=false, - xtick={0,100,...,1000}, - xmax=1050, - ymax=33, - ytick={0,5,...,30}, - scaled ticks=false, - axis lines=left, - width=11cm, - height=7cm, - legend entries={unambiguous,ambiguous}, - legend pos=north east, - legend cell align=left, - x tick label style={font=\small,/pgf/number format/1000 sep={}}] -\addplot[blue,mark=*, mark options={fill=white}] - table {s-grammar1.data}; -\only<2>{ - \addplot[red,mark=triangle*, mark options={fill=white}] - table {s-grammar2.data};} -\end{axis} -\end{tikzpicture} -\end{center} - -\end{frame} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\begin{frame} -\frametitle{While-Language} -\mbox{}\\[-23mm]\mbox{} - -\bl{\begin{plstx}[rhs style=,one per line]: \meta{Stmt} ::= skip - | \meta{Id} := \meta{AExp} - | if \meta{BExp} then \meta{Block} else \meta{Block} - | while \meta{BExp} do \meta{Block}\\ -: \meta{Stmts} ::= \meta{Stmt} ; \meta{Stmts} - | \meta{Stmt}\\ -: \meta{Block} ::= \{ \meta{Stmts} \} - | \meta{Stmt}\\ -: \meta{AExp} ::= \ldots\\ -: \meta{BExp} ::= \ldots\\\end{plstx}} - -\end{frame} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\begin{frame}[c] -\frametitle{An Interpreter} - -\begin{center} -\bl{\begin{tabular}{l} -$\{$\\ -\;\;$x := 5 \text{;}$\\ -\;\;$y := x * 3\text{;}$\\ -\;\;$y := x * 4\text{;}$\\ -\;\;$x := u * 3$\\ -$\}$ -\end{tabular}} -\end{center} - -\begin{itemize} -\item the interpreter has to record the value of \bl{$x$} before assigning a value to \bl{$y$}\pause -\item \bl{\texttt{eval(stmt, env)}} -\end{itemize} - -\end{frame} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\mode{ -\begin{frame}[c] -\frametitle{\begin{tabular}{c}Interpreter\end{tabular}} - -\begin{center} -\bl{\begin{tabular}{@{}lcl@{}} -$\text{eval}(n, E)$ & $\dn$ & $n$\\ -$\text{eval}(x, E)$ & $\dn$ & $E(x)$ \;\;\;\textcolor{black}{lookup \bl{$x$} in \bl{$E$}}\\ -$\text{eval}(a_1 + a_2, E)$ & $\dn$ & $\text{eval}(a_1, E) + \text{eval}(a_2, E)$\\ -$\text{eval}(a_1 - a_2, E)$ & $\dn$ & $\text{eval}(a_1, E) - \text{eval}(a_2, E)$\\ -$\text{eval}(a_1 * a_2, E)$ & $\dn$ & $\text{eval}(a_1, E) * \text{eval}(a_2, E)$\bigskip\\ -$\text{eval}(a_1 = a_2, E)$ & $\dn$ & $\text{eval}(a_1, E) = \text{eval}(a_2, E)$\\ -$\text{eval}(a_1\,!\!= a_2, E)$ & $\dn$ & $\neg(\text{eval}(a_1, E) = \text{eval}(a_2, E))$\\ -$\text{eval}(a_1 < a_2, E)$ & $\dn$ & $\text{eval}(a_1, E) < \text{eval}(a_2, E)$\ -\end{tabular}} -\end{center} - -\end{frame}} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\mode{ -\begin{frame}[c] -\frametitle{\begin{tabular}{c}Interpreter (2)\end{tabular}} - -\begin{center} -\bl{\begin{tabular}{@{}lcl@{}} -$\text{eval}(\text{skip}, E)$ & $\dn$ & $E$\\ -$\text{eval}(x:=a, E)$ & $\dn$ & \bl{$E(x \mapsto \text{eval}(a, E))$}\\ -\multicolumn{3}{@{}l@{}}{$\text{eval}(\text{if}\;b\;\text{then}\;cs_1\;\text{else}\;cs_2 , E) \dn$}\\ -\multicolumn{3}{@{}l@{}}{\hspace{2cm}$\text{if}\;\text{eval}(b,E)\;\text{then}\; -\text{eval}(cs_1,E)$}\\ -\multicolumn{3}{@{}l@{}}{\hspace{2cm}$\phantom{\text{if}\;\text{eval}(b,E)\;}\text{else}\;\text{eval}(cs_2,E)$}\\ -\multicolumn{3}{@{}l@{}}{$\text{eval}(\text{while}\;b\;\text{do}\;cs, E) \dn$}\\ -\multicolumn{3}{@{}l@{}}{\hspace{2cm}$\text{if}\;\text{eval}(b,E)$}\\ -\multicolumn{3}{@{}l@{}}{\hspace{2cm}$\text{then}\; -\text{eval}(\text{while}\;b\;\text{do}\;cs, \text{eval}(cs,E))$}\\ -\multicolumn{3}{@{}l@{}}{\hspace{2cm}$\text{else}\; E$}\\ -$\text{eval}(\text{write}\; x, E)$ & $\dn$ & $\{\;\text{println}(E(x))\; ;\;E\;\}$\\ -\end{tabular}} -\end{center} - -\end{frame}} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\begin{frame}[c] -\frametitle{Test Program} - -\mbox{}\\[-18mm]\mbox{} - -??%{\lstset{language=While}%%\fontsize{10}{12}\selectfont -%\texttt{\lstinputlisting{../progs/loops.while}}} - -\end{frame} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\mode{ -\begin{frame}[t] -\frametitle{\begin{tabular}{c}Interpreted Code\end{tabular}} - -\begin{center} -\begin{tikzpicture} -\begin{axis}[axis x line=bottom, axis y line=left, xlabel=n, ylabel=secs, legend style=small] -\addplot+[smooth] file {interpreted.data}; -\end{axis} -\end{tikzpicture} -\end{center} - -\end{frame}} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\mode{ -\begin{frame}[c] -\frametitle{\begin{tabular}{c}Java Virtual Machine\end{tabular}} - -\begin{itemize} -\item introduced in 1995 -\item is a stack-based VM (like Postscript, CLR of .Net) -\item contains a JIT compiler -\item many languages take advantage of JVM's infrastructure (JRE) -\item is garbage collected $\Rightarrow$ no buffer overflows -\item some languages compile to the JVM: Scala, Clojure\ldots -\end{itemize} - -\end{frame}} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{frame}[t,fragile] diff -r aaf0bd0a211d -r 85267be9a5ed slides/slides06.pdf Binary file slides/slides06.pdf has changed diff -r aaf0bd0a211d -r 85267be9a5ed slides/slides06.tex --- a/slides/slides06.tex Fri Oct 30 01:45:03 2020 +0000 +++ b/slides/slides06.tex Wed Nov 04 17:34:52 2020 +0000 @@ -55,6 +55,479 @@ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\begin{frame}[c] +\frametitle{Parser Combinators} + +One of the simplest ways to implement a parser, see +{\small\url{https://vimeo.com/142341803}} (by Haoyi Li)\bigskip + +\begin{itemize} +\item[$\bullet$] build-in library in Scala +\item[$\bullet$] fastparse (2) library by Haoyi Li; is part of Ammonite +\item[$\bullet$] possible exponential runtime behaviour +\end{itemize} + +\end{frame} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\begin{frame}[c] +\frametitle{Parser Combinators} + +Parser combinators: \bigskip + +\begin{minipage}{1.1\textwidth} +\begin{center} +\mbox{}\hspace{-12mm}\mbox{}$\underbrace{\text{list of tokens}}_{\text{input}}$ \bl{$\Rightarrow$} +$\underbrace{\text{set of (parsed input, unparsed input)}}_{\text{output}}$ +\end{center} +\end{minipage}\bigskip + +\begin{itemize} +\item atomic parsers +\item sequencing +\item alternative +\item semantic action (map-parser) +\end{itemize} + +\end{frame} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\begin{frame}[c] + +Atomic parsers, for example, number tokens + +\begin{center} +\bl{$\texttt{Num(123)}::rest \;\Rightarrow\; \{(\texttt{Num(123)}, rest)\}$} +\end{center}\bigskip + +\begin{itemize} +\item you consume one or more token from the\\ + input (stream) +\item also works for characters and strings +\end{itemize} +\end{frame} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\begin{frame}[c] + +Alternative parser (code \bl{$p\;||\;q$})\bigskip + +\begin{itemize} +\item apply \bl{$p$} and also \bl{$q$}; then combine + the outputs +\end{itemize} + +\begin{center} +\large \bl{$p(\text{input}) \cup q(\text{input})$} +\end{center} + +\end{frame} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\begin{frame}[c] + +Sequence parser (code \bl{$p\sim q$})\bigskip + +\begin{itemize} +\item apply first \bl{$p$} producing a set of pairs +\item then apply \bl{$q$} to the unparsed part +\item then combine the results:\medskip +\begin{center} +((output$_1$, output$_2$), unparsed part) +\end{center} +\end{itemize} + +\begin{center} +\begin{tabular}{l} +\large \bl{$\{((o_1, o_2), u_2) \;|\;$}\\[2mm] +\large\mbox{}\hspace{15mm} \bl{$(o_1, u_1) \in p(\text{input}) \wedge$}\\[2mm] +\large\mbox{}\hspace{15mm} \bl{$(o_2, u_2) \in q(u_1)\}$} +\end{tabular} +\end{center} + + +\end{frame} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\begin{frame}[c] + +Map-parser (code \bl{$p.map(f)\;$})\bigskip + +\begin{itemize} +\item apply \bl{$p$} producing a set of pairs +\item then apply the function \bl{$f$} to each first component +\end{itemize} + +\begin{center} +\begin{tabular}{l} +\large \bl{$\{(f(o_1), u_1) \;|\; (o_1, u_1) \in p(\text{input})\}$} +\end{tabular} +\end{center}\bigskip\bigskip\pause + +\bl{$f$} is the semantic action (``what to do with the parsed input'') + +\end{frame} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\mode{ +\begin{frame}[c] +\frametitle{\begin{tabular}{c}Semantic Actions\end{tabular}} + +Addition + +\begin{center} +\bl{$\meta{T} \sim + \sim \meta{E} \Rightarrow \underbrace{f\,((x,y), z) \Rightarrow x + z}_{\text{semantic action}}$} +\end{center}\pause + +Multiplication + +\begin{center} +\bl{$\meta{F} \sim * \sim \meta{T} \Rightarrow f\,((x,y), z) \Rightarrow x * z$} +\end{center}\pause + +Parenthesis + +\begin{center} +\bl{$\text{(} \sim \meta{E} \sim \text{)} \Rightarrow f\,((x,y), z) \Rightarrow y$} +\end{center} + +\end{frame}} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\begin{frame}[c] +\frametitle{Types of Parsers} + +\begin{itemize} +\item {\bf Sequencing}: if \bl{$p$} returns results of type \bl{$T$}, and \bl{$q$} results of type \bl{$S$}, +then \bl{$p \sim q$} returns results of type + +\begin{center} +\bl{$T \times S$} +\end{center}\pause + +\item {\bf Alternative}: if \bl{$p$} returns results of type \bl{$T$} then \bl{$q$} \alert{must} also have results of type \bl{$T$}, +and \bl{$p \;||\; q$} returns results of type + +\begin{center} +\bl{$T$} +\end{center}\pause + +\item {\bf Semantic Action}: if \bl{$p$} returns results of type \bl{$T$} and \bl{$f$} is a function from +\bl{$T$} to \bl{$S$}, then +\bl{$p \Rightarrow f$} returns results of type + +\begin{center} +\bl{$S$} +\end{center} + +\end{itemize} + +\end{frame} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\begin{frame}[c] +\frametitle{Input Types of Parsers} + +\begin{itemize} +\item input: \alert{token list} +\item output: set of (output\_type, \alert{token list}) +\end{itemize}\bigskip\pause + +actually it can be any input type as long as it is a kind of +sequence (for example a string) + +\end{frame} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\begin{frame}[c] +\frametitle{Scannerless Parsers} + +\begin{itemize} +\item input: \alert{string} +\item output: set of (output\_type, \alert{string}) +\end{itemize}\bigskip\bigskip + +but using lexers is better because whitespaces or comments can be +filtered out; then input is a sequence of tokens + +\end{frame} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\begin{frame}[c] +\frametitle{Successful Parses} + +\begin{itemize} +\item input: string +\item output: \alert{set of} (output\_type, string) +\end{itemize}\bigskip + +a parse is successful whenever the input has been fully +``consumed'' (that is the second component is empty) + +\end{frame} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\begin{frame}[c] +\frametitle{Abstract Parser Class} + +\small +\lstinputlisting[language=Scala,xleftmargin=1mm] + {../progs/app7.scala} +\end{frame} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\begin{frame}[c] + +\small +\fontsize{10}{12}\selectfont +\lstinputlisting[language=Scala,xleftmargin=1mm] + {../progs/app8.scala} +\end{frame} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\begin{frame}[c] +\frametitle{Two Grammars} + +Which languages are recognised by the following two grammars? + +\begin{center} +\bl{\begin{tabular}{lcl} +$\meta{S}$ & $\rightarrow$ & $1 \cdot \meta{S} \cdot \meta{S}$\\ + & $|$ & $\epsilon$ +\end{tabular}} +\end{center}\bigskip + +\begin{center} +\bl{\begin{tabular}{lcl} +$\meta{U}$ & $\rightarrow$ & $1 \cdot \meta{U}$\\ + & $|$ & $\epsilon$ +\end{tabular}} +\end{center} + +\end{frame} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\begin{frame}[t] +\frametitle{Ambiguous Grammars} + +\begin{center} +\begin{tikzpicture} +\begin{axis}[xlabel={\pcode{1}s},ylabel={time in secs}, + enlargelimits=false, + xtick={0,100,...,1000}, + xmax=1050, + ymax=33, + ytick={0,5,...,30}, + scaled ticks=false, + axis lines=left, + width=11cm, + height=7cm, + legend entries={unambiguous,ambiguous}, + legend pos=north east, + legend cell align=left, + x tick label style={font=\small,/pgf/number format/1000 sep={}}] +\addplot[blue,mark=*, mark options={fill=white}] + table {s-grammar1.data}; +\only<2>{ + \addplot[red,mark=triangle*, mark options={fill=white}] + table {s-grammar2.data};} +\end{axis} +\end{tikzpicture} +\end{center} + +\end{frame} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\begin{frame} +\frametitle{While-Language} +\mbox{}\\[-23mm]\mbox{} + +\bl{\begin{plstx}[rhs style=,one per line]: \meta{Stmt} ::= skip + | \meta{Id} := \meta{AExp} + | if \meta{BExp} then \meta{Block} else \meta{Block} + | while \meta{BExp} do \meta{Block}\\ +: \meta{Stmts} ::= \meta{Stmt} ; \meta{Stmts} + | \meta{Stmt}\\ +: \meta{Block} ::= \{ \meta{Stmts} \} + | \meta{Stmt}\\ +: \meta{AExp} ::= \ldots\\ +: \meta{BExp} ::= \ldots\\\end{plstx}} + +\end{frame} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\begin{frame}[c] +\frametitle{An Interpreter} + +\begin{center} +\bl{\begin{tabular}{l} +$\{$\\ +\;\;$x := 5 \text{;}$\\ +\;\;$y := x * 3\text{;}$\\ +\;\;$y := x * 4\text{;}$\\ +\;\;$x := u * 3$\\ +$\}$ +\end{tabular}} +\end{center} + +\begin{itemize} +\item the interpreter has to record the value of \bl{$x$} before assigning a value to \bl{$y$}\pause +\item \bl{\texttt{eval(stmt, env)}} +\end{itemize} + +\end{frame} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\mode{ +\begin{frame}[c] +\frametitle{\begin{tabular}{c}Interpreter\end{tabular}} + +\begin{center} +\bl{\begin{tabular}{@{}lcl@{}} +$\text{eval}(n, E)$ & $\dn$ & $n$\\ +$\text{eval}(x, E)$ & $\dn$ & $E(x)$ \;\;\;\textcolor{black}{lookup \bl{$x$} in \bl{$E$}}\\ +$\text{eval}(a_1 + a_2, E)$ & $\dn$ & $\text{eval}(a_1, E) + \text{eval}(a_2, E)$\\ +$\text{eval}(a_1 - a_2, E)$ & $\dn$ & $\text{eval}(a_1, E) - \text{eval}(a_2, E)$\\ +$\text{eval}(a_1 * a_2, E)$ & $\dn$ & $\text{eval}(a_1, E) * \text{eval}(a_2, E)$\bigskip\\ +$\text{eval}(a_1 = a_2, E)$ & $\dn$ & $\text{eval}(a_1, E) = \text{eval}(a_2, E)$\\ +$\text{eval}(a_1\,!\!= a_2, E)$ & $\dn$ & $\neg(\text{eval}(a_1, E) = \text{eval}(a_2, E))$\\ +$\text{eval}(a_1 < a_2, E)$ & $\dn$ & $\text{eval}(a_1, E) < \text{eval}(a_2, E)$\ +\end{tabular}} +\end{center} + +\end{frame}} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\mode{ +\begin{frame}[c] +\frametitle{\begin{tabular}{c}Interpreter (2)\end{tabular}} + +\begin{center} +\bl{\begin{tabular}{@{}lcl@{}} +$\text{eval}(\text{skip}, E)$ & $\dn$ & $E$\\ +$\text{eval}(x:=a, E)$ & $\dn$ & \bl{$E(x \mapsto \text{eval}(a, E))$}\\ +\multicolumn{3}{@{}l@{}}{$\text{eval}(\text{if}\;b\;\text{then}\;cs_1\;\text{else}\;cs_2 , E) \dn$}\\ +\multicolumn{3}{@{}l@{}}{\hspace{2cm}$\text{if}\;\text{eval}(b,E)\;\text{then}\; +\text{eval}(cs_1,E)$}\\ +\multicolumn{3}{@{}l@{}}{\hspace{2cm}$\phantom{\text{if}\;\text{eval}(b,E)\;}\text{else}\;\text{eval}(cs_2,E)$}\\ +\multicolumn{3}{@{}l@{}}{$\text{eval}(\text{while}\;b\;\text{do}\;cs, E) \dn$}\\ +\multicolumn{3}{@{}l@{}}{\hspace{2cm}$\text{if}\;\text{eval}(b,E)$}\\ +\multicolumn{3}{@{}l@{}}{\hspace{2cm}$\text{then}\; +\text{eval}(\text{while}\;b\;\text{do}\;cs, \text{eval}(cs,E))$}\\ +\multicolumn{3}{@{}l@{}}{\hspace{2cm}$\text{else}\; E$}\\ +$\text{eval}(\text{write}\; x, E)$ & $\dn$ & $\{\;\text{println}(E(x))\; ;\;E\;\}$\\ +\end{tabular}} +\end{center} + +\end{frame}} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\begin{frame}[c] +\frametitle{Test Program} + +\mbox{}\\[-18mm]\mbox{} + +??%{\lstset{language=While}%%\fontsize{10}{12}\selectfont +%\texttt{\lstinputlisting{../progs/loops.while}}} + +\end{frame} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\mode{ +\begin{frame}[t] +\frametitle{\begin{tabular}{c}Interpreted Code\end{tabular}} + +\begin{center} +\begin{tikzpicture} +\begin{axis}[axis x line=bottom, axis y line=left, xlabel=n, ylabel=secs, legend style=small] +\addplot+[smooth] file {interpreted.data}; +\end{axis} +\end{tikzpicture} +\end{center} + +\end{frame}} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\mode{ +\begin{frame}[c] +\frametitle{\begin{tabular}{c}Java Virtual Machine\end{tabular}} + +\begin{itemize} +\item introduced in 1995 +\item is a stack-based VM (like Postscript, CLR of .Net) +\item contains a JIT compiler +\item many languages take advantage of JVM's infrastructure (JRE) +\item is garbage collected $\Rightarrow$ no buffer overflows +\item some languages compile to the JVM: Scala, Clojure\ldots +\end{itemize} + +\end{frame}} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\begin{frame}[t] +\frametitle{% + \begin{tabular}{@ {}c@ {}} + \\[-3mm] + \LARGE Compilers and \\[-2mm] + \LARGE Formal Languages\\[3mm] + \end{tabular}} + + \normalsize + \begin{center} + \begin{tabular}{ll} + Email: & christian.urban at kcl.ac.uk\\ + %Office Hours: & Thursdays 12 -- 14\\ + %Location: & N7.07 (North Wing, Bush House)\\ + Slides \& Progs: & KEATS (also homework is there)\\ + \end{tabular} + \end{center} + + \begin{center} + \begin{tikzpicture} + \node[drop shadow,fill=white,inner sep=0pt] + {\footnotesize\rowcolors{1}{capri!10}{white} + \begin{tabular}{|p{4.8cm}|p{4.8cm}|}\hline + 1 Introduction, Languages & \cellcolor{blue!50} 6 While-Language \\ + 2 Regular Expressions, Derivatives & 7 Compilation, JVM \\ + 3 Automata, Regular Languages & 8 Compiling Functional Languages \\ + 4 Lexing, Tokenising & 9 Optimisations \\ + 5 Grammars, Parsing & 10 LLVM \\ \hline + \end{tabular}% + }; + \end{tikzpicture} + \end{center} + +\end{frame} +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % \begin{frame}[c] % \small % \mbox{}\\[5mm]