diff -r e74c696821a2 -r 010c5a03dca2 coursework/cw01.tex --- a/coursework/cw01.tex Tue Aug 23 23:12:55 2016 +0200 +++ b/coursework/cw01.tex Wed Sep 14 11:02:44 2016 +0100 @@ -6,13 +6,14 @@ \section*{Coursework 1 (Strand 1)} -This coursework is worth 5\% and is due on 20 October at +This coursework is worth 4\% and is due on 20 October at 16:00. You are asked to implement a regular expression matcher and submit a document containing the answers for the questions below. You can do the implementation in any programming language you like, but you need to submit the source code with which you answered the questions, otherwise a mark of 0\% will be awarded. You can submit your answers in a txt-file or pdf. +Code send as code. @@ -21,17 +22,17 @@ It should be understood that the work you submit represents your own effort. You have not copied from anyone else. An exception is the Scala code I showed during the lectures or -uploaded to KEATS, which you can use.\bigskip +uploaded to KEATS, which you can freely use.\bigskip -\subsubsection*{Tasks} +\subsubsection*{Task} The task is to implement a regular expression matcher based on -derivatives. The implementation should be able to deal with the usual -(basic) regular expressions +derivatives of regular expressions. The implementation should +be able to deal with the usual (basic) regular expressions \[ -\varnothing, \epsilon, c, r_1 + r_2, r_1 \cdot r_2, r^* +\ZERO,\; \ONE,\; c,\; r_1 + r_2,\; r_1 \cdot r_2,\; r^* \] \noindent @@ -47,9 +48,9 @@ \end{tabular} \end{center} -\noindent In the case of $r^{\{n,m\}}$ we have the convention -that $0 \le n \le m$. The meanings of the extended regular -expressions are +\noindent In the case of $r^{\{n,m\}}$ you can assume the +convention that $0 \le n \le m$. The meanings of the extended +regular expressions are \begin{center} \begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l} @@ -64,7 +65,7 @@ \noindent whereby in the last clause the set $\Sigma^*$ stands for the set of \emph{all} strings over the alphabet $\Sigma$ (in the implementation the alphabet can be just what is -represented by, say, the type \pcode{char}). So $\sim{}r$ +represented by, say, the type \pcode{Char}). So $\sim{}r$ means `all the strings that $r$ cannot match'. Be careful that your implementation of \textit{nullable} and @@ -86,12 +87,12 @@ expressions. -\subsection*{Question 1 (unmarked)} +\subsection*{Question 1} What is your King's email address (you will need it in Question 3)? -\subsection*{Question 2 (marked with 2\%)} +\subsection*{Question 2} This question does not require any implementation. From the lectures you have seen the definitions for the functions @@ -122,7 +123,7 @@ \item $L(der\,c\,r)) = Der\,c\,(L(r))$ \end{itemize} -\subsection*{Question 3 (marked with 1\%)} +\subsection*{Question 3} Implement the following regular expression for email addresses @@ -132,8 +133,8 @@ \noindent and calculate the derivative according to your email address. When calculating the derivative, simplify all regular -expressions as much as possible, but at least apply the -following six simplification rules: +expressions as much as possible by applying the +following 7 simplification rules: \begin{center} \begin{tabular}{l@{\hspace{2mm}}c@{\hspace{2mm}}ll} @@ -147,13 +148,12 @@ \end{tabular} \end{center} -\noindent -Write down your simplified derivative in the ``mathematicical'' -notation using parentheses where necessary. That means you should -use the more readable infix notation $+$, $\cdot$ and $^*$, +\noindent Write down your simplified derivative in a readable +notation using parentheses where necessary. That means you +should use the infix notation $+$, $\cdot$, $^*$ and so on, instead of code. -\subsection*{Question 4 (marked with 1\%)} +\subsection*{Question 4} Suppose \textit{[a-z]} stands for the range regular expression $[a,b,c,\ldots,z]$. Consider the regular expression $/ \cdot * \cdot @@ -168,7 +168,22 @@ \item \texttt{"/*test/*test*/"} \end{enumerate} -\subsection*{Question 5 (marked with 1\%)} +\noindent +Also test your regular expression matcher with the regular +expression $a^{\{3,5\}}$ and the strings + +\begin{enumerate} +\setcounter{enumi}{4} +\item \texttt{aa} +\item \texttt{aaa} +\item \texttt{aaaaa} +\item \texttt{aaaaaa} +\end{enumerate} + +\noindent +Does your matcher produce the expected results? + +\subsection*{Question 5} Let $r_1$ be the regular expression $a\cdot a\cdot a$ and $r_2$ be $(a^{\{19,19\}}) \cdot (a^?)$. Decide whether the following three