\documentclass{article}+ −
\usepackage{../style}+ −
\usepackage{../langs}+ −
+ −
\begin{document}+ −
+ −
\section*{Coursework 2 (Strand 1)}+ −
+ −
\noindent This coursework is worth 5\% and is due on 6+ −
November at 16:00. You are asked to implement the Sulzmann \&+ −
Lu tokeniser for the WHILE language. You can do the+ −
implementation in any programming language you like, but you+ −
need to submit the source code with which you answered the+ −
questions, otherwise a mark of 0\% will be awarded. However,+ −
the coursework will \emph{only} be judged according to the+ −
answers. You can submit your answers in a txt-file or as pdf.+ −
+ −
\subsection*{Disclaimer}+ −
+ −
It should be understood that the work you submit represents+ −
your own effort. You have not copied from anyone else. An+ −
exception is the Scala code from KEATS and the code I showed+ −
during the lectures, which you can both use. You can also use+ −
your own code from the CW~1.+ −
+ −
\subsection*{Question 1 (marked with 1\%)}+ −
+ −
To implement a tokeniser for the WHILE language, you first+ −
need to design the appropriate regular expressions for the+ −
following eight syntactic entities:+ −
+ −
\begin{enumerate}+ −
\item keywords are+ −
+ −
\begin{quote}+ −
\texttt{while}, + −
\texttt{if}, + −
\texttt{then}, + −
\texttt{else}, + −
\texttt{do}, + −
\texttt{for}, + −
\texttt{to}, + −
\texttt{true}, + −
\texttt{false}, + −
\texttt{read}, + −
\texttt{write},+ −
\texttt{skip}+ −
\end{quote} + −
+ −
\item operators are+ −
+ −
\begin{quote}+ −
\texttt{+}, + −
\texttt{-}, + −
\texttt{*}, + −
\texttt{\%},+ −
\texttt{/},+ −
\texttt{==}, + −
\texttt{!=}, + −
\texttt{>}, + −
\texttt{<}, + −
\texttt{:=},+ −
\texttt{\&\&},+ −
\texttt{||}+ −
\end{quote} + −
+ −
\item strings are enclosed by \texttt{"\ldots"} + −
\item parentheses are \texttt{(}, \texttt{\{}, \texttt{)} and \texttt{\}}+ −
\item there are semicolons \texttt{;}+ −
\item whitespaces are either \texttt{" "} (one or more) or \texttt{$\backslash$n}+ −
\item identifiers are letters followed by underscores \texttt{\_\!\_}, letters+ −
or digits+ −
\item numbers are \texttt{0}, \text{1}, \ldots+ −
\end{enumerate}+ −
+ −
\noindent+ −
You can use the basic regular expressions + −
+ −
\[+ −
\varnothing, \epsilon, c, r_1 + r_2, r_1 \cdot r_2, r^*+ −
\]+ −
+ −
\noindent+ −
but also the following extended regular expressions+ −
+ −
\begin{center}+ −
\begin{tabular}{ll}+ −
$[c_1 c_2 \ldots c_n]$ & a range of characters\\+ −
$r^+$ & one or more times $r$\\+ −
$r^?$ & optional $r$\\+ −
$r^{\{n\}}$ & n-times $r$\\+ −
\end{tabular}+ −
\end{center}+ −
+ −
\noindent+ −
+ −
\subsection*{Question 2 (marked with 3\%)}+ −
+ −
Implement the Sulzmann \& Lu tokeniser from the lectures. For+ −
this you need to implement the functions $nullable$ and $der$+ −
(you can use your code from CW 1), as well as $mkeps$ and+ −
$inj$. These functions need to be appropriately extended for+ −
the extended regular expressions from Q1. Also add the record+ −
regular expression from the lectures and implement a function,+ −
say \pcode{env}, that returns all assignments from a value+ −
(such that you can extract easily the tokens from a value). + −
+ −
The functions $mkeps$ and $inj$ return values. Calculate the+ −
value for your regular expressions from Q1 and the string+ −
+ −
\begin{center}+ −
\code{"read n;"}+ −
\end{center} + −
+ −
\noindent+ −
and use your \pcode{env} function to give the token sequence.+ −
+ −
+ −
\subsection*{Question 3 (marked with 1\%)}+ −
+ −
Extend your tokenizer from Q2 to also simplify regular expressions+ −
after each derivation step and rectify the computed values after each+ −
injection. Use this tokenizer to tokenize the programs in+ −
Figure~\ref{fib} and \ref{loop}.+ −
+ −
+ −
\begin{figure}[p]+ −
\begin{center}+ −
\mbox{\lstinputlisting[language=while]{../progs/fib.while}}+ −
\end{center}+ −
\caption{Fibonacci program in the WHILE language.\label{fib}}+ −
\end{figure}+ −
+ −
\begin{figure}[p]+ −
\begin{center}+ −
\mbox{\lstinputlisting[language=while]{../progs/loops.while}}+ −
\end{center}+ −
\caption{The three-nested-loops program in the WHILE language. + −
Usually used for timing measurements.\label{loop}}+ −
\end{figure}+ −
+ −
\end{document}+ −
+ −
%%% Local Variables: + −
%%% mode: latex+ −
%%% TeX-master: t+ −
%%% End: + −