hws/hw06.tex
author Christian Urban <christian.urban@kcl.ac.uk>
Sat, 09 Sep 2023 14:14:31 +0100
changeset 916 10f834eb0a9e
parent 898 45a48c47dcca
child 954 eda0ccf56c72
permissions -rw-r--r--
texupdate
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
56
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     1
\documentclass{article}
292
7ed2a25dd115 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 102
diff changeset
     2
\usepackage{../style}
7ed2a25dd115 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 102
diff changeset
     3
\usepackage{../graphics}
56
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     4
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     5
\begin{document}
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     6
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     7
\section*{Homework 6}
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
     8
916
10f834eb0a9e texupdate
Christian Urban <christian.urban@kcl.ac.uk>
parents: 898
diff changeset
     9
%\HEADER
359
db106e5b7c4d updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 292
diff changeset
    10
56
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    11
\begin{enumerate}
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    12
\item (i) Give the regular expressions for lexing a language
401
5d85dc9779b1 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 359
diff changeset
    13
      consisting of whitespaces, identifiers (some letters
5d85dc9779b1 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 359
diff changeset
    14
      followed by digits), numbers, operations \texttt{=},
5d85dc9779b1 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 359
diff changeset
    15
      \texttt{<} and \texttt{>}, and the keywords \texttt{if},
5d85dc9779b1 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 359
diff changeset
    16
      \texttt{then} and \texttt{else}. (ii) Decide whether the
5d85dc9779b1 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 359
diff changeset
    17
      following strings can be lexed in this language?
56
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    18
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    19
\begin{enumerate}
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    20
\item \texttt{"if y4 = 3 then 1 else 3"}
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    21
\item \texttt{"if33 ifif then then23 else else 32"}
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    22
\item \texttt{"if x4x < 33 then 1 else 3"}
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    23
\end{enumerate}
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    24
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    25
In case they can, give the corresponding token sequences. (Hint: 
89
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 56
diff changeset
    26
Observe the maximal munch rule and priorities of your regular
56
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    27
expressions that make the process of lexing unambiguous.)
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    28
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    29
\item Suppose the grammar
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    30
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    31
\begin{center}
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    32
\begin{tabular}{lcl}
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    33
$E$ & $\rightarrow$ &  $F \;|\; F \cdot * \cdot F \;|\; F \cdot \backslash \cdot F$\\
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    34
$F$ & $\rightarrow$ & $T \;|\; T \cdot \texttt{+} \cdot T \;|\; T \cdot \texttt{-} \cdot T$\\
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    35
$T$ & $\rightarrow$ & $num \;|\; \texttt{(} \cdot E \cdot \texttt{)}$\\
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    36
\end{tabular}
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    37
\end{center}
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    38
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    39
where $E$, $F$ and $T$ are non-terminals, $E$ is the starting symbol of the grammar, and $num$ stands for
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    40
a number token. Give a parse tree for the string \texttt{(3+3)+(2*3)}.
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    41
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    42
\item Define what it means for a grammar to be ambiguous. Give an example of
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    43
an ambiguous grammar.
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    44
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    45
\item Suppose boolean expressions are built up from
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    46
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    47
\begin{center}
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    48
\begin{tabular}{ll}
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    49
1.) & tokens for \texttt{true} and \texttt{false},\\
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    50
2.) & the infix operations \texttt{$\wedge$} and \texttt{$\vee$},\\
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    51
3.) & the prefix operation $\neg$, and\\
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    52
4.) & can be enclosed in parentheses.
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    53
\end{tabular}
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    54
\end{center}
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    55
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    56
(i) Give a grammar that can recognise such boolean expressions
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    57
and (ii) give a sample string involving all rules given in 1.-4.~that 
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    58
can be parsed by this grammar.
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    59
459
780486571e38 updated
Christian Urban <urbanc@in.tum.de>
parents: 401
diff changeset
    60
721
e3c64f22dd31 added slides from Rochester
Christian Urban <christian.urban@kcl.ac.uk>
parents: 647
diff changeset
    61
647
180600c04da2 updated
Christian Urban <urbanc@in.tum.de>
parents: 596
diff changeset
    62
\item Parsing combinators consist of atomic parsers, alternative
180600c04da2 updated
Christian Urban <urbanc@in.tum.de>
parents: 596
diff changeset
    63
  parsers, sequence parsers and semantic actions.  What is the purpose
180600c04da2 updated
Christian Urban <urbanc@in.tum.de>
parents: 596
diff changeset
    64
  of (1) atomic parsers and of (2) semantic actions?
180600c04da2 updated
Christian Urban <urbanc@in.tum.de>
parents: 596
diff changeset
    65
726
fba480bbc9f7 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 721
diff changeset
    66
\item Parser combinators can directly be given a string as
fba480bbc9f7 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 721
diff changeset
    67
      input, without the need of a lexer. What are the
fba480bbc9f7 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 721
diff changeset
    68
      advantages to first lex a string and then feed a
fba480bbc9f7 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 721
diff changeset
    69
      sequence of tokens as input to the parser?
fba480bbc9f7 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 721
diff changeset
    70
898
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 894
diff changeset
    71
% Reason 1 you can filter out whitespaces and comments, which makes the grammar rules simpler. If you have to make sure that a whitespace comes after a variable say, then your parser rule for variables  gets more complicated. Same with comments which do not contribute anything to the parser tree.
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 894
diff changeset
    72
% Reason 2) The lexer can already classify tokens, for example as numbers, keywords or identifiers. This again makes the grammar rules more deterministic and as a result faster to parse.
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 894
diff changeset
    73
% The point is that parser combinators can be used to process strings, but in case of compilers where whitespaces and comments need to be filtered out, the lexing phase is actually quite useful.
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 894
diff changeset
    74
45a48c47dcca updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 894
diff changeset
    75
      
894
02ef5c3abc51 updatedHG: added solutions/cw5/fun_tokens.sc
Christian Urban <christian.urban@kcl.ac.uk>
parents: 726
diff changeset
    76
\item The injection function for sequence regular expressions is defined
02ef5c3abc51 updatedHG: added solutions/cw5/fun_tokens.sc
Christian Urban <christian.urban@kcl.ac.uk>
parents: 726
diff changeset
    77
      by three clauses:
726
fba480bbc9f7 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 721
diff changeset
    78
894
02ef5c3abc51 updatedHG: added solutions/cw5/fun_tokens.sc
Christian Urban <christian.urban@kcl.ac.uk>
parents: 726
diff changeset
    79
\begin{center}
02ef5c3abc51 updatedHG: added solutions/cw5/fun_tokens.sc
Christian Urban <christian.urban@kcl.ac.uk>
parents: 726
diff changeset
    80
\begin{tabular}{l@{\hspace{1mm}}c@{\hspace{1mm}}l}
02ef5c3abc51 updatedHG: added solutions/cw5/fun_tokens.sc
Christian Urban <christian.urban@kcl.ac.uk>
parents: 726
diff changeset
    81
  $\inj\,(r_1 \cdot r_2)\,c\,\,Seq(v_1,v_2)$ & $\dn$  & $Seq(\inj\,r_1\,c\,v_1,v_2)$\\
02ef5c3abc51 updatedHG: added solutions/cw5/fun_tokens.sc
Christian Urban <christian.urban@kcl.ac.uk>
parents: 726
diff changeset
    82
  $\inj\,(r_1 \cdot r_2)\,c\,\,\Left(Seq(v_1,v_2))$ & $\dn$  & $Seq(\inj\,r_1\,c\,v_1,v_2)$\\
02ef5c3abc51 updatedHG: added solutions/cw5/fun_tokens.sc
Christian Urban <christian.urban@kcl.ac.uk>
parents: 726
diff changeset
    83
  $\inj\,(r_1 \cdot r_2)\,c\,\,Right(v)$ & $\dn$  & $Seq(\textit{mkeps}(r_1),\inj\,r_2\,c\,v)$\\
02ef5c3abc51 updatedHG: added solutions/cw5/fun_tokens.sc
Christian Urban <christian.urban@kcl.ac.uk>
parents: 726
diff changeset
    84
\end{tabular}
02ef5c3abc51 updatedHG: added solutions/cw5/fun_tokens.sc
Christian Urban <christian.urban@kcl.ac.uk>
parents: 726
diff changeset
    85
\end{center}
02ef5c3abc51 updatedHG: added solutions/cw5/fun_tokens.sc
Christian Urban <christian.urban@kcl.ac.uk>
parents: 726
diff changeset
    86
02ef5c3abc51 updatedHG: added solutions/cw5/fun_tokens.sc
Christian Urban <christian.urban@kcl.ac.uk>
parents: 726
diff changeset
    87
Explain why there are three cases in the injection function for sequence
02ef5c3abc51 updatedHG: added solutions/cw5/fun_tokens.sc
Christian Urban <christian.urban@kcl.ac.uk>
parents: 726
diff changeset
    88
regular expressions. 
02ef5c3abc51 updatedHG: added solutions/cw5/fun_tokens.sc
Christian Urban <christian.urban@kcl.ac.uk>
parents: 726
diff changeset
    89
      
459
780486571e38 updated
Christian Urban <urbanc@in.tum.de>
parents: 401
diff changeset
    90
\item \POSTSCRIPT        
56
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    91
\end{enumerate}
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    92
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    93
\end{document}
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    94
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    95
%%% Local Variables: 
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    96
%%% mode: latex
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    97
%%% TeX-master: t
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
    98
%%% End: