178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
1 |
\documentclass{article}
|
275
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
2 |
\usepackage{../style}
|
216
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
3 |
\usepackage{../langs}
|
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
4 |
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
5 |
\begin{document}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
6 |
|
275
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
7 |
\section*{Coursework 2 (Strand 1)}
|
198
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
8 |
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
9 |
\noindent
|
328
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
10 |
This coursework is worth 5\% and is due on 6 November at 16:00. You
|
275
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
11 |
are asked to implement the Sulzmann tokeniser for the WHILE language.
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
12 |
You need to submit a document containing the answers for the questions
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
13 |
below. You can do the implementation in any programming language you
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
14 |
like, but you need to submit the source code with which you answered
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
15 |
the questions. However, the coursework will \emph{only} be judged
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
16 |
according to the answers. You can submit your answers in a txt-file or
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
17 |
as pdf.
|
180
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
18 |
|
275
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
19 |
\subsection*{Disclaimer}
|
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
20 |
|
275
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
21 |
It should be understood that the work you submit represents your own
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
22 |
effort. You have not copied from anyone else. An exception is the
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
23 |
Scala code I showed during the lectures, which you can use.
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
24 |
You can also use your own code from the CW~1.
|
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
25 |
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
26 |
\subsection*{Question 1 (marked with 1\%)}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
27 |
|
275
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
28 |
To implement a tokeniser for the WHILE language, you first need to design
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
29 |
the appropriate regular expressions for the following eight syntactic entities:
|
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
30 |
|
180
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
31 |
\begin{enumerate}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
32 |
\item keywords are
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
33 |
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
34 |
\begin{quote}
|
275
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
35 |
\texttt{while},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
36 |
\texttt{if},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
37 |
\texttt{then},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
38 |
\texttt{else},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
39 |
\texttt{do},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
40 |
\texttt{for},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
41 |
\texttt{to},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
42 |
\texttt{true},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
43 |
\texttt{false},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
44 |
\texttt{read},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
45 |
\texttt{write},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
46 |
\texttt{skip}
|
180
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
47 |
\end{quote}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
48 |
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
49 |
\item operators are
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
50 |
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
51 |
\begin{quote}
|
275
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
52 |
\texttt{+},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
53 |
\texttt{-},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
54 |
\texttt{*},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
55 |
\texttt{\%},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
56 |
\texttt{/},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
57 |
\texttt{==},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
58 |
\texttt{!=},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
59 |
\texttt{>},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
60 |
\texttt{<},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
61 |
\texttt{:=},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
62 |
\texttt{\&\&},
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
63 |
\texttt{||}
|
180
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
64 |
\end{quote}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
65 |
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
66 |
\item strings are enclosed by \texttt{"\ldots"}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
67 |
\item parentheses are \texttt{(}, \texttt{\{}, \texttt{)} and \texttt{\}}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
68 |
\item there are semicolons \texttt{;}
|
275
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
69 |
\item whitespaces are either \texttt{" "} (one or more) or \texttt{$\backslash$n}
|
180
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
70 |
\item identifiers are letters followed by underscores \texttt{\_\!\_}, letters
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
71 |
or digits
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
72 |
\item numbers are \texttt{0}, \text{1}, \ldots
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
73 |
\end{enumerate}
|
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
74 |
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
75 |
\noindent
|
275
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
76 |
You can use the basic regular expressions
|
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
77 |
|
275
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
78 |
\[
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
79 |
\varnothing, \epsilon, c, r_1 + r_2, r_1 \cdot r_2, r^*
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
80 |
\]
|
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
81 |
|
275
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
82 |
\noindent
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
83 |
but also the following extended regular expressions
|
182
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
84 |
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
85 |
\begin{center}
|
275
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
86 |
\begin{tabular}{ll}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
87 |
$[c_1 c_2 \ldots c_n]$ & a range of characters\\
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
88 |
$r^+$ & one or more times $r$\\
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
89 |
$r^?$ & optional $r$\\
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
90 |
$r^{\{n\}}$ & n-times $r$\\
|
182
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
91 |
\end{tabular}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
92 |
\end{center}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
93 |
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
94 |
\noindent
|
275
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
95 |
Once you have designed all regular expressions for 1 - 8, then
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
96 |
give the token sequence for the Fibonacci program shown below in Fig.~\ref{fib}.
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
97 |
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
98 |
\subsection*{Question 2 (marked with 3\%)}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
99 |
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
100 |
Implement the Sulzmann tokeniser from the lectures. For this you need
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
101 |
to implement the functions $nullable$ and $der$ (you can use your code
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
102 |
from CW 1), as well as $mkeps$ and $inj$. These functions need to be
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
103 |
appropriately extended for the extended regular expressions from
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
104 |
Q1. Also add the record regular expression from the lectures and
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
105 |
implement a function, say \pcode{env}, that returns all assignments
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
106 |
from a value (such that you can extract easily the tokens from a
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
107 |
value).
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
108 |
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
109 |
The functions $mkeps$ and $inj$ return values. Calculate
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
110 |
the value for your regular expressions from Q1 and the string
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
111 |
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
112 |
\begin{center}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
113 |
\code{"read n;"}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
114 |
\end{center}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
115 |
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
116 |
\noindent
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
117 |
and use your \pcode{env} function to give the token sequence.
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
118 |
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
119 |
\subsection*{Question 3 (marked with 1\%)}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
120 |
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
121 |
Extend your tokenizer from Q2 to also simplify regular expressions
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
122 |
after each derivation step and rectify the computed values after each
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
123 |
injection. Use this tokenizer to tokenize the programs in
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
124 |
Figure~\ref{fib} and \ref{loop}.
|
182
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
125 |
|
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
126 |
|
275
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
127 |
\begin{figure}[p]
|
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
128 |
\begin{center}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
129 |
\mbox{\lstinputlisting[language=while]{../progs/fib.while}}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
130 |
\end{center}
|
181
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
131 |
\caption{Fibonacci program in the WHILE language.\label{fib}}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
132 |
\end{figure}
|
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
133 |
|
275
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
134 |
\begin{figure}[p]
|
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
135 |
\begin{center}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
136 |
\mbox{\lstinputlisting[language=while]{../progs/loops.while}}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
137 |
\end{center}
|
275
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
138 |
\caption{The three-nested-loops program in the WHILE language.
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
139 |
Usually used for timing measurements.\label{loop}}
|
181
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
140 |
\end{figure}
|
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
141 |
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
142 |
\end{document}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
143 |
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
144 |
%%% Local Variables:
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
145 |
%%% mode: latex
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
146 |
%%% TeX-master: t
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
147 |
%%% End:
|