\documentclass{article}
\usepackage{../style}
\usepackage{../langs}

\begin{document}

\section*{Coursework 2 (Strand 1)}
\noindent This coursework is worth 5\% and is due on 3
November at 16:00. You are asked to implement the Sulzmann \&
Lu lexer for the WHILE language. You can do the
implementation in any programming language you like, but you
need to submit the source code with which you answered the
questions, otherwise a mark of 0\% will be awarded. You can
submit your answers in a txt-file or as a PDF. Submit code as
code.
\subsection*{Disclaimer}

It should be understood that the work you submit represents
your own effort. You have not copied from anyone else. An
exception is the Scala code from KEATS and the code I showed
during the lectures, both of which you can freely use. You can
also use your own code from CW~1.
\subsection*{Question 1}

To implement a lexer for the WHILE language, you first
need to design the appropriate regular expressions for the
following eight syntactic entities:

\begin{enumerate}
\item keywords are

\begin{quote}
\texttt{while},
\texttt{if},
\texttt{then},
\texttt{else},
\texttt{do},
\texttt{for},
\texttt{to},
\texttt{true},
\texttt{false},
\texttt{read},
\texttt{write},
\texttt{skip}
\end{quote}

\item operators are

\begin{quote}
\texttt{+},
\texttt{-},
\texttt{*},
\texttt{\%},
\texttt{/},
\texttt{==},
\texttt{!=},
\texttt{>},
\texttt{<},
\texttt{:=},
\texttt{\&\&},
\texttt{||}
\end{quote}

\item strings are enclosed by \texttt{"\ldots"}
\item parentheses are \texttt{(}, \texttt{\{}, \texttt{)} and \texttt{\}}
\item there are semicolons \texttt{;}
\item whitespaces are either \texttt{" "} (one or more) or \texttt{$\backslash$n} or
\texttt{$\backslash$t}
\item identifiers are letters followed by underscores \texttt{\_\!\_}, letters
or digits
\item numbers are \pcode{0}, \pcode{1}, \ldots and so on; give
a regular expression that can recognise \pcode{0}, but not numbers
with leading zeroes, such as \pcode{001}
\end{enumerate}
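\noindent
To illustrate the last item: reading the requirement literally,
your regular expression for numbers should behave as follows on
a few sample strings. The strings other than \texttt{0} and
\texttt{001} are chosen here only for illustration.

\begin{center}
\begin{tabular}{ll}
recognised: & \texttt{0}, \texttt{7}, \texttt{10}, \texttt{2015}\\
not recognised: & \texttt{00}, \texttt{001}, \texttt{07}\\
\end{tabular}
\end{center}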
\noindent
You can use the basic regular expressions

\[
\ZERO,\; \ONE,\; c,\; r_1 + r_2,\; r_1 \cdot r_2,\; r^*
\]

\noindent
but also the following extended regular expressions

\begin{center}
\begin{tabular}{ll}
$[c_1,c_2,\ldots,c_n]$ & a set of characters\\
$r^+$ & one or more times $r$\\
$r^?$ & optional $r$\\
$r^{\{n\}}$ & $n$-times $r$\\
\end{tabular}
\end{center}
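\noindent
As a sanity check for your implementation, the extended regular
expressions can be understood in terms of the basic ones. The
equivalences below are standard facts about the languages being
matched rather than part of the coursework text; they are only
meant as a reminder of which strings each constructor should
match, not as a suggestion to implement the constructors by
expansion.

\begin{center}
\begin{tabular}{lcl}
$[c_1,c_2,\ldots,c_n]$ & $\equiv$ & $c_1 + c_2 + \ldots + c_n$\\
$r^+$ & $\equiv$ & $r \cdot r^*$\\
$r^?$ & $\equiv$ & $r + \ONE$\\
$r^{\{n\}}$ & $\equiv$ & $\underbrace{r \cdot \ldots \cdot r}_{n\;\mathrm{times}}$
  \quad (with $r^{\{0\}} \equiv \ONE$)\\
\end{tabular}
\end{center}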
\noindent
Later on you will also need the record regular expression:

\begin{center}
\begin{tabular}{ll}
$REC(x:r)$ & record regular expression\\
\end{tabular}
\end{center}

\noindent Try to design your regular expressions to be as
small as possible. For example, you should use character sets
for identifiers and numbers. Feel free to use the general
character constructor \textit{CFUN} introduced in CW~1.
\subsection*{Question 2}

Implement the Sulzmann \& Lu lexer from the lectures. For
this you need to implement the functions $nullable$ and $der$
(you can use your code from CW~1), as well as $mkeps$ and
$inj$. These functions need to be appropriately extended for
the extended regular expressions from Q1. Write down the
clauses for

\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
$mkeps([c_1,c_2,\ldots,c_n])$ & $\dn$ & $?$\\
$mkeps(r^+)$ & $\dn$ & $?$\\
$mkeps(r^?)$ & $\dn$ & $?$\\
$mkeps(r^{\{n\}})$ & $\dn$ & $?$\medskip\\
$inj\, ([c_1,c_2,\ldots,c_n])\,c\,\ldots$ & $\dn$ & $?$\\
$inj\, (r^+)\,c\,\ldots$ & $\dn$ & $?$\\
$inj\, (r^?)\,c\,\ldots$ & $\dn$ & $?$\\
$inj\, (r^{\{n\}})\,c\,\ldots$ & $\dn$ & $?$\\
\end{tabular}
\end{center}
\noindent where $inj$ takes three arguments: a regular
expression, a character and a value. Test your lexer code
with at least the two small examples below:

\begin{center}
\begin{tabular}{ll}
regex: & string:\smallskip\\
$a^{\{3\}}$ & $aaa$\\
$(a + \ONE)^{\{3\}}$ & $aa$
\end{tabular}
\end{center}
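\noindent
As a sanity check for the first example in the table above, here
is a sketch of the three derivative steps for $a^{\{3\}}$ and the
string $aaa$. It assumes the usual clause for sequences and, for
$n$-times, the clause
$der\,c\,(r^{\{n\}}) = (der\,c\,r) \cdot r^{\{n-1\}}$ for $n > 0$
(and $\ZERO$ for $n = 0$), with no simplification. If your clause
from CW~1 is stated differently, the intermediate regular
expressions will look different.

\begin{center}
\begin{tabular}{lcl}
$der\,a\,(a^{\{3\}})$ & $=$ & $\ONE \cdot a^{\{2\}}$\\
$der\,a\,(\ONE \cdot a^{\{2\}})$ & $=$ &
  $(\ZERO \cdot a^{\{2\}}) + (\ONE \cdot a^{\{1\}})$\\
$der\,a\,((\ZERO \cdot a^{\{2\}}) + (\ONE \cdot a^{\{1\}}))$ & $=$ &
  $(\ZERO \cdot a^{\{2\}}) +{}$\\
 & & $((\ZERO \cdot a^{\{1\}}) + (\ONE \cdot a^{\{0\}}))$\\
\end{tabular}
\end{center}

\noindent
The last regular expression is nullable, because
$\ONE \cdot a^{\{0\}}$ is nullable. This is the point at which
$mkeps$ can produce a value and the injection steps can begin.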
\noindent Both strings should be successfully lexed by the
respective regular expression, that is, the lexer returns
a value in both examples.

Also add the record regular expression from the
lectures to your lexer and implement a function, say
\pcode{env}, that returns all assignments from a value (so
that you can easily extract the tokens from a value).\medskip

\noindent
Finally, give the tokens for your regular expressions from Q1 and the
string

\begin{center}
\code{"read n;"}
\end{center}

\noindent
and use your \pcode{env} function to give the token sequence.
\subsection*{Question 3}

Extend your lexer from Q2 to also simplify regular expressions
after each derivation step and rectify the computed values after each
injection. Use this lexer to tokenize the programs in
Figures~\ref{fib} and~\ref{loop}. Give the tokens of these
programs where whitespaces are filtered out.
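As a reminder, simplification is usually based on rewrite rules
such as the ones listed below. This list is the standard one for
the basic regular expressions and is stated here as an
assumption, so check it against the version given in the
lectures. With these rules, for instance,
$(\ZERO \cdot a^{\{2\}}) + (\ONE \cdot a^{\{1\}})$ simplifies to
$a^{\{1\}}$. Each rule that fires needs a matching rectification
step, which turns a value for the simplified regular expression
back into a value for the original one.

\begin{center}
\begin{tabular}{lcl}
$r \cdot \ZERO$ & $\mapsto$ & $\ZERO$\\
$\ZERO \cdot r$ & $\mapsto$ & $\ZERO$\\
$r \cdot \ONE$ & $\mapsto$ & $r$\\
$\ONE \cdot r$ & $\mapsto$ & $r$\\
$r + \ZERO$ & $\mapsto$ & $r$\\
$\ZERO + r$ & $\mapsto$ & $r$\\
$r + r$ & $\mapsto$ & $r$\\
\end{tabular}
\end{center}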
\begin{figure}[p]
\mbox{\lstinputlisting[language=while]{../progs/fib.while}}
\caption{Fibonacci program in the WHILE language.\label{fib}}
\end{figure}

\begin{figure}[p]
\mbox{\lstinputlisting[language=while]{../progs/loops.while}}
\caption{The three-nested-loops program in the WHILE language.
Usually used for timing measurements.\label{loop}}
\end{figure}

\end{document}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: t
%%% End: