diff -r 21f0f24424ab -r 618c7640cf66 coursework/cw02.tex
--- a/coursework/cw02.tex	Sun Oct 12 22:36:18 2014 +0100
+++ b/coursework/cw02.tex	Mon Oct 13 00:09:28 2014 +0100
@@ -1,127 +1,142 @@
 \documentclass{article}
-\usepackage{hyperref}
-\usepackage{amssymb}
-\usepackage{amsmath}
+\usepackage{../style}
 \usepackage{../langs}
 
-\newcommand{\dn}{\stackrel{\mbox{\scriptsize def}}{=}}%
 \begin{document}
 
-\section*{Coursework 2}
-
-{\bf UPDATE:} There was a typo in Q1 about the regular expressions for comments:
-they should, of course, start with $\slash\slash$, as in C for example, not with 
-$\backslash\backslash$. (Thanks to Bryan who pointed out this error.)\bigskip\bigskip
+\section*{Coursework 2 (Strand 1)}
 
 \noindent
-This coursework is worth 3\% and is due on 29 November at 16:00. You are asked to 
-
-\begin{enumerate}
-\item implement a tokeniser for the WHILE language,
-\item implement a parser and an evaluator for boolean and arithmetic expressions, and
-\item write a WHILE program for printing out prime numbers.
-\end{enumerate}
+This coursework is worth 5\% and is due on 3 November at 16:00. You
+are asked to implement the Sulzmann tokeniser for the WHILE language.
+You need to submit a document containing the answers for the questions
+below. You can do the implementation in any programming language you
+like, but you need to submit the source code with which you answered
+the questions. However, the coursework will \emph{only} be judged
+according to the answers. You can submit your answers in a txt-file or
+as pdf.
 
-\noindent
-You need to submit a document containing the answers for the questions 
-below. You can do the implementation in any programming language you like, but you need 
-to submit the source code with which you answered the questions. However, the coursework 
-will \emph{only} be judged according to the answers. You can submit your answers
-in a txt-file or as pdf.\bigskip
+\subsection*{Disclaimer}
 
+It should be understood that the work you submit represents your own
+effort.  You have not copied from anyone else. An exception is the
+Scala code I showed during the lectures, which you can use.
+You can also use your own code from the CW~1.
 
 \subsection*{Question 1 (marked with 1\%)}
 
-Implement a tokeniser for the WHILE language. You first need to design the appropriate
-regular expressions for the following nine syntactic entities:
+To implement a tokeniser for the WHILE language, you first need to design 
+the appropriate regular expressions for the following eight syntactic entities:
 
 \begin{enumerate}
 \item keywords are
 
 \begin{quote}
-\texttt{while}, \texttt{if}, \texttt{then}, \texttt{else}, \texttt{do}, \texttt{for}, \texttt{to}, \texttt{true}, \texttt{false}
-\texttt{andalso}, \texttt{orelse}, \texttt{read}, \texttt{write}
+\texttt{while}, 
+\texttt{if}, 
+\texttt{then}, 
+\texttt{else}, 
+\texttt{do}, 
+\texttt{for}, 
+\texttt{to}, 
+\texttt{true}, 
+\texttt{false}, 
+\texttt{read}, 
+\texttt{write},
+\texttt{skip}
 \end{quote} 
 
 \item operators are
 
 \begin{quote}
-\texttt{+}, \texttt{-}, \texttt{*}, \texttt{\%}, \texttt{==}, \texttt{!=}, \texttt{>}, \texttt{<}, \texttt{:=}
+\texttt{+}, 
+\texttt{-}, 
+\texttt{*}, 
+\texttt{\%},
+\texttt{/},
+\texttt{==}, 
+\texttt{!=}, 
+\texttt{>}, 
+\texttt{<}, 
+\texttt{:=},
+\texttt{\&\&},
+\texttt{||}
 \end{quote} 
 
 \item strings are enclosed by \texttt{"\ldots"} 
 \item parentheses are \texttt{(}, \texttt{\{}, \texttt{)} and \texttt{\}}
 \item there are semicolons \texttt{;}
-\item whitespaces are either \texttt{" "} or \texttt{$\backslash$n}
-\item comments either start with $\slash\,\slash$ and run to the end of the corresponding line 
-(indicated by \texttt{$\backslash$n}), or they can also run over several lines but then need to be enclosed by
-$\slash\texttt{*}$ as the beginning marker and $\texttt{*}\slash{}$\smallskip as the end marker
+\item whitespaces are either \texttt{" "} (one or more) or \texttt{$\backslash$n}
 \item identifiers are letters followed by underscores \texttt{\_\!\_}, letters
 or digits
 \item numbers are \texttt{0}, \text{1}, \ldots
 \end{enumerate}
 
 \noindent
-Once you have implemented all regular expressions for 1 - 9, then
-give the token sequence for the Fibonacci program shown below in Fig.~\ref{fib}.
-
-\subsection*{Question 2 (marked with 1\%)}
+You can use the basic regular expressions 
 
-Implement parser combinators and an evaluation function for arithmetic and boolean
-expressions.  Arithmetic operations should include \texttt{+}, \texttt{-}, \texttt{*} and 
-\texttt{\%} (quotient). Boolean
-operations should include \texttt{==} (equal), \texttt{!=} (unequal), \texttt{<} and  
-\texttt{>}.
- 
-Using the parser and evaluation function, calculate the values for
+\[
+\varnothing, \epsilon, c, r_1 + r_2, r_1 \cdot r_2, r^*
+\]
 
-\begin{itemize}
-\item \texttt{17 < 3 * 3 * 3}
-\item \texttt{(29 - 20) * 3}
-\item \texttt{79 - 20 * 3}
-\item \texttt{2 * 2 != 12 \% 3}
-\end{itemize} 
- 
-\subsection*{Question 3 (marked with 1\%)}
-
-Write a program in the WHILE programming language that prints out all prime numbers between
-0 and a fixed number (say 100). A partial grammar of the WHILE language is given below. 
+\noindent
+but also the following extended regular expressions
 
 \begin{center}
-\begin{tabular}{@{}lcl@{}}
-\textit{Stmt} & $\rightarrow$ &  $\texttt{skip}$\\
-              & $|$ & \textit{Id}\;\texttt{:=}\;\textit{AExp}\\
-              & $|$ & \texttt{if}\; \textit{BExp} \;\texttt{then}\; \textit{Block} \;\texttt{else}\; \textit{Block}\\
-              & $|$ & \texttt{while}\; \textit{BExp} \;\texttt{do}\; \textit{Block}\\
-              & $|$ & \texttt{read}\;\textit{Id}\\
-              & $|$ & \texttt{write}\;\textit{Id}\\
-              & $|$ & \texttt{write}\;\textit{String}\medskip\\
-\textit{Stmts} & $\rightarrow$ &  \textit{Stmt} \;\texttt{;}\; \textit{Stmts}\\
-              & $|$ & \textit{Stmt}\medskip\\
-\textit{Block} & $\rightarrow$ &  \texttt{\{}\,\textit{Stmts}\,\texttt{\}}\\
-                & $|$ & \textit{Stmt}\medskip\\
-\textit{AExp} & $\rightarrow$ & \ldots\\
-\textit{BExp} & $\rightarrow$ & \ldots\\
+\begin{tabular}{ll}
+$[c_1 c_2 \ldots c_n]$ & a range of characters\\
+$r^+$ & one or more times $r$\\
+$r^?$ & optional $r$\\
+$r^{\{n\}}$ & n-times $r$\\
 \end{tabular}
 \end{center}
 
 \noindent
-As another guidance for your program have a look at the Fibonacci program
-and ``three-nested-loops'' program shown below in Figures \ref{fib} and \ref{loop}. 
+Once you have designed all regular expressions for 1 - 8, then
+give the token sequence for the Fibonacci program shown below in Fig.~\ref{fib}.
+
+\subsection*{Question 2 (marked with 3\%)}
+
+Implement the Sulzmann tokeniser from the lectures. For this you need
+to implement the functions $nullable$ and $der$ (you can use your code
+from CW 1), as well as $mkeps$ and $inj$. These functions need to be
+appropriately extended for the extended regular expressions from
+Q1. Also add the record regular expression from the lectures and
+implement a function, say \pcode{env}, that returns all assignments
+from a value (such that you can extract easily the tokens from a
+value). 
+
+The functions $mkeps$ and $inj$ return values. Calculate
+the value for your regular expressions from Q1 and the string
+
+\begin{center}
+\code{"read n;"}
+\end{center} 
+
+\noindent
+and use your \pcode{env} function to give the token sequence.
+
+\subsection*{Question 3 (marked with 1\%)}
+
+Extend your tokenizer from Q2 to also simplify regular expressions
+after each derivation step and rectify the computed values after each
+injection. Use this tokenizer to tokenize the programs in
+Figure~\ref{fib} and \ref{loop}.
 
 
-\begin{figure}[h]
+\begin{figure}[p]
 \begin{center}
 \mbox{\lstinputlisting[language=while]{../progs/fib.while}}
 \end{center}
 \caption{Fibonacci program in the WHILE language.\label{fib}}
 \end{figure}
 
-\begin{figure}[h]
+\begin{figure}[p]
 \begin{center}
 \mbox{\lstinputlisting[language=while]{../progs/loops.while}}
 \end{center}
-\caption{The three-nested-loops program in the WHILE language. Usually used for timing measurements.\label{loop}}
+\caption{The three-nested-loops program in the WHILE language. 
+Usually used for timing measurements.\label{loop}}
 \end{figure}
 
 \end{document}