26 Observe the maximal munch rule and priorities of your regular |
27 Observe the maximal munch rule and priorities of your regular |
27 expressions that make the process of lexing unambiguous.) |
28 expressions that make the process of lexing unambiguous.) |
28 |
29 |
29 \item Suppose the grammar |
30 \item Suppose the grammar |
30 |
31 |
31 \begin{center} |
32 \begin{plstx}[margin=1cm] |
32 \begin{tabular}{lcl} |
33 :\meta{E} ::= \meta{F} \;|\; \meta{F} \cdot \texttt{*} \cdot \meta{F} \;|\; \meta{F} \cdot \backslash \cdot \meta{F}\\ |
33 $E$ & $\rightarrow$ & $F \;|\; F \cdot * \cdot F \;|\; F \cdot \backslash \cdot F$\\ |
34 :\meta{F} ::= \meta{T} \;|\; \meta{T} \cdot \texttt{+} \cdot \meta{T} \;|\; \meta{T} \cdot \texttt{-} \cdot \meta{T}\\ |
34 $F$ & $\rightarrow$ & $T \;|\; T \cdot \texttt{+} \cdot T \;|\; T \cdot \texttt{-} \cdot T$\\ |
35 :\meta{T} ::= $num$ \;|\; \texttt{(} \cdot E \cdot \texttt{)}\\ |
35 $T$ & $\rightarrow$ & $num \;|\; \texttt{(} \cdot E \cdot \texttt{)}$\\ |
36 \end{plstx} |
36 \end{tabular} |
|
37 \end{center} |
|
38 |
37 |
39 where $E$, $F$ and $T$ are non-terminals, $E$ is the starting symbol of the grammar, and $num$ stands for |
38 where \meta{E}, \meta{F} and \meta{T} are non-terminals, \meta{E} is |
40 a number token. Give a parse tree for the string \texttt{(3+3)+(2*3)}. |
39 the starting symbol of the grammar, and $num$ stands for a number |
|
40 token. Give a parse tree for the string \texttt{(3+3)+(2*3)}. Note that |
|
41 \meta{F} and \meta{T} are ``exchanged'' in this grammar in comparison with |
|
42 the usual grammar for arithmetic expressions. What does this mean in terms |
|
43 of precedences of the operators? |
|
44 |
|
45 \solution{ |
|
46 For the second part "1+2*3" will be parsed as (1+2)*3, meaning + and - bind |
|
47 tighter than * and $\backslash$ |
|
48 } |
41 |
49 |
42 \item Define what it means for a grammar to be ambiguous. Give an example of |
50 \item Define what it means for a grammar to be ambiguous. Give an example of |
43 an ambiguous grammar. |
51 an ambiguous grammar. |
|
52 |
|
53 \solution{ |
|
54 Already the grammar |
|
55 \begin{plstx}[margin=1cm] |
|
56 : \meta{E} ::= \meta{E}\cdot +\cdot\meta{E} | num\\ |
|
57 \end{plstx} |
|
58 |
|
59 is ambiguous because a string like "1+2+3" can be parsed |
|
60 as (1+2)+3 or 1+(2+3). |
|
61 } |
|
62 |
44 |
63 |
45 \item Suppose boolean expressions are built up from |
64 \item Suppose boolean expressions are built up from |
46 |
65 |
47 \begin{center} |
66 \begin{center} |
48 \begin{tabular}{ll} |
67 \begin{tabular}{ll} |
59 |
78 |
60 |
79 |
61 |
80 |
62 \item Parsing combinators consist of atomic parsers, alternative |
81 \item Parsing combinators consist of atomic parsers, alternative |
63 parsers, sequence parsers and semantic actions. What is the purpose |
82 parsers, sequence parsers and semantic actions. What is the purpose |
64 of (1) atomic parsers and of (2) semantic actions? |
83 of (1) atomic parsers and of (2) map-parsers (also called semantic actions)? |
65 |
84 |
|
85 \solution{ |
|
86 Atomic parsers look at a concrete prefix of the input, like num-tokens or identifiers. |
|
87 Map-parsers apply a function to the output of a parser. In this way you can transform |
|
88 the output from, for example, a string to an integer. |
|
89 } |
|
90 |
|
91 |
66 \item Parser combinators can directly be given a string as |
92 \item Parser combinators can directly be given a string as |
67 input, without the need of a lexer. What are the |
93 input, without the need of a lexer. What are the |
68 advantages to first lex a string and then feed a |
94 advantages to first lex a string and then feed a |
69 sequence of tokens as input to the parser? |
95 sequence of tokens as input to the parser? |
70 |
96 |
71 % Reason 1 you can filter out whitespaces and comments, which makes the grammar rules simpler. If you have to make sure that a whitespace comes after a variable say, then your parser rule for variables gets more complicated. Same with comments which do not contribute anything to the parser tree. |
97 \solution{ Reason 1 you can filter out whitespaces and comments, |
72 % Reason 2) The lexer can already classify tokens, for example as numbers, keywords or identifiers. This again makes the grammar rules more deterministic and as a result faster to parse. |
98 which makes the grammar rules simpler. If you have to make |
73 % The point is that parser combinators can be used to process strings, but in case of compilers where whitespaces and comments need to be filtered out, the lexing phase is actually quite useful. |
99 sure that a whitespace comes after a variable say, then your |
|
100 parser rule for variables gets more complicated. Same with |
|
101 comments which do not contribute anything to the parse tree. |
|
102 |
|
103 Reason 2) The lexer can already classify tokens, for example |
|
104 as numbers, keywords or identifiers. This again makes the grammar |
|
105 rules more deterministic and as a result faster to parse. |
74 |
106 |
|
107 The point is that parser combinators can be used to process |
|
108 strings, but in case of compilers where whitespaces and |
|
109 comments need to be filtered out, the lexing phase is actually |
|
110 quite useful. } |
75 |
111 |
76 \item The injection function for sequence regular expressions is defined |
112 \item The injection function for sequence regular expressions is defined |
77 by three clauses: |
113 by three clauses: |
78 |
114 |
79 \begin{center} |
115 \begin{center} |
82 $\inj\,(r_1 \cdot r_2)\,c\,\,\Left(Seq(v_1,v_2))$ & $\dn$ & $Seq(\inj\,r_1\,c\,v_1,v_2)$\\ |
118 $\inj\,(r_1 \cdot r_2)\,c\,\,\Left(Seq(v_1,v_2))$ & $\dn$ & $Seq(\inj\,r_1\,c\,v_1,v_2)$\\ |
83 $\inj\,(r_1 \cdot r_2)\,c\,\,Right(v)$ & $\dn$ & $Seq(\textit{mkeps}(r_1),\inj\,r_2\,c\,v)$\\ |
119 $\inj\,(r_1 \cdot r_2)\,c\,\,Right(v)$ & $\dn$ & $Seq(\textit{mkeps}(r_1),\inj\,r_2\,c\,v)$\\ |
84 \end{tabular} |
120 \end{tabular} |
85 \end{center} |
121 \end{center} |
86 |
122 |
87 Explain why there are three cases in the injection function for sequence |
123 Explain why there are three cases in the injection function for sequence |
88 regular expressions. |
124 regular expressions. |
|
125 |
|
126 \solution{ |
|
127 This is because the derivative of sequences can be of the form |
|
128 |
|
129 \begin{itemize} |
|
130 \item $(der\,c\,r_1)\cdot r_2$ |
|
131 \item $(der\,c\,r_1)\cdot r_2 \;+\; der\,c\,r_2$ |
|
132 \end{itemize} |
|
133 |
|
134 In the first case the value needs to be of the form $Seq$, in the second case $\Left$ or $Right$. |
|
135 Therefore 3 cases. |
|
136 } |
89 |
137 |
90 \item \POSTSCRIPT |
138 \item \POSTSCRIPT |
91 \end{enumerate} |
139 \end{enumerate} |
92 |
140 |
93 \end{document} |
141 \end{document} |