\documentclass{article}\usepackage{../style}\usepackage{../langs}\begin{document}\section*{Coursework 1 (Strand 1)}This coursework is worth 5\% and is due on 16 October at 16:00. Youare asked to implement a regular expression matcher and submit adocument containing the answers for the questions below. You can dothe implementation in any programming language you like, but you needto submit the source code with which you answered thequestions. However, the coursework will \emph{only} be judgedaccording to the answers. You can submit your answers in a txt-file orpdf.\subsubsection*{Disclaimer}It should be understood that the work you submit represents your owneffort. You have not copied from anyone else. An exception is theScala code I showed during the lectures, which you can use.\bigskip\subsubsection*{Tasks}The task is to implement a regular expression matcher based onderivatives. The implementation should be able to deal with the usual(basic) regular expressions\[\varnothing, \epsilon, c, r_1 + r_2, r_1 \cdot r_2, r^*\]\noindentbut also with the following extended regular expressions:\begin{center}\begin{tabular}{ll}$[c_1 c_2 \ldots c_n]$ & a range of characters\\$r^+$ & one or more times $r$\\$r^?$ & optional $r$\\$r^{\{n,m\}}$ & at least $n$-times $r$ but no more than $m$-times\\$\sim{}r$ & not-regular expression of $r$\\\end{tabular}\end{center}\noindentIn the case of $r^{\{n,m\}}$ we have the convention that $0 \le n \le m$.The meaning of these regular expressions is\begin{center}\begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l}$L([c_1 c_2 \ldots c_n])$ & $\dn$ & $\{[c_1], [c_2], \ldots, [c_n]\}$\\ $L(r^+)$ & $\dn$ & $\bigcup_{1\le i}. L(r)^i$\\$L(r^?)$ & $\dn$ & $L(r) \cup \{[]\}$\\$L(r^{\{n,m\}})$ & $\dn$ & $\bigcup_{n\le i \le m}. L(r)^i$\\$L(\sim{}r)$ & $\dn$ & $\Sigma^* - L(r)$\end{tabular}\end{center}\noindentwhereby in the last clause the set $\Sigma^*$ stands for the set of\emph{all} strings over the alphabet $\Sigma$ (in the implementation thealphabet can be just what is represented by, say, the type \pcode{char})). So $\sim{}r$ means `all the strings that $r$cannot match'. Be careful that your implementation of $nullable$ and $der$ satisfiesfor every $r$ the following two properties (see also Question 2):\begin{itemize}\item $nullable(r)$ if and only if $[]\in L(r)$\item $L(der\,c\,r)) = Der\,c\,(L(r))$\end{itemize}\noindent{\bf Important!} Your implementation should have explicit cases forthe basic regular expressions, but also explicit cases for theextended regular expressions. That means do not treat the extendedregular expressions by just translating them into the basic ones. Seealso Question 2, where you are asked to explicitly give the rules for\textit{nullable} and \textit{der} for the extended regularexpressions.\subsection*{Question 1 (unmarked)}What is your King's email address (you will need it in Question 3)?\subsection*{Question 2 (marked with 2\%)}This question does not require any implementation. From the lecturesyou have seen the definitions for the functions \textit{nullable} and\textit{der} for the basic regular expressions. Give the rules forthe extended regular expressions:\begin{center}\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}$nullable([c_1 c_2 \ldots c_n])$ & $\dn$ & $?$\\$nullable(r^+)$ & $\dn$ & $?$\\$nullable(r^?)$ & $\dn$ & $?$\\$nullable(r^{\{n,m\}})$ & $\dn$ & $?$\\$nullable(\sim{}r)$ & $\dn$ & $?$\medskip\\$der\, c\, ([c_1 c_2 \ldots c_n])$ & $\dn$ & $?$\\$der\, c\, (r^+)$ & $\dn$ & $?$\\$der\, c\, (r^?)$ & $\dn$ & $?$\\$der\, c\, (r^{\{n,m\}})$ & $\dn$ & $?$\\$der\, c\, (\sim{}r)$ & $\dn$ & $?$\\\end{tabular}\end{center}\noindentRemember your definitions have to satisfy the two properties\begin{itemize}\item $nullable(r)$ if and only if $[]\in L(r)$\item $L(der\,c\,r)) = Der\,c\,(L(r))$\end{itemize}\subsection*{Question 3 (marked with 1\%)}Implement the following regular expression for email addresses\[([a\mbox{-}z0\mbox{-}9\_\!\_\,.-]^+)\cdot @\cdot ([a\mbox{-}z0\mbox{-}9\,.-]^+)\cdot .\cdot ([a\mbox{-}z\,.]^{\{2,6\}})\]\noindentand calculate the derivative according to your email address. Whencalculating the derivative, simplify all regular expressions as muchas possible, but at least apply the following six simplificationrules:\begin{center}\begin{tabular}{l@{\hspace{2mm}}c@{\hspace{2mm}}ll}$r \cdot \varnothing$ & $\mapsto$ & $\varnothing$\\ $\varnothing \cdot r$ & $\mapsto$ & $\varnothing$\\ $r \cdot \epsilon$ & $\mapsto$ & $r$\\ $\epsilon \cdot r$ & $\mapsto$ & $r$\\ $r + \varnothing$ & $\mapsto$ & $r$\\ $\varnothing + r$ & $\mapsto$ & $r$\\ $r + r$ & $\mapsto$ & $r$\\ \end{tabular}\end{center}\noindentWrite down your simplified derivative in the ``mathematicical''notation using parentheses where necessary.\subsection*{Question 4 (marked with 1\%)}Suppose \textit{[a-z]} stands for the range regular expression$[a,b,c,\ldots,z]$. Consider the regular expression $/ \cdot * \cdot(\sim{}([a\mbox{-}z]^* \cdot * \cdot / \cdot [a\mbox{-}z]^*)) \cdot *\cdot /$ and decide wether the following four strings are matched bythis regular expression. Answer yes or no.\begin{enumerate}\item \texttt{"/**/"}\item \texttt{"/*foobar*/"}\item \texttt{"/*test*/test*/"}\item \texttt{"/*test/*test*/"}\end{enumerate}\subsection*{Question 5 (marked with 1\%)}Let $r_1$ be the regular expression $a\cdot a\cdot a$ and $r_2$ be$(a^{\{19,19\}}) \cdot (a^?)$. Decide whether the following threestrings consisting of $a$s only can be matched by $(r_1^+)^+$.Similarly test them with $(r_2^+)^+$. Again answer in all six caseswith yes or no. \medskip\noindentThese are strings are meant to be entirely made up of $a$s. Be carefulwhen copy-and-pasting the strings so as to not forgetting any $a$ andto not introducing any other character.\begin{enumerate}\item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}\item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}\item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}\end{enumerate}\end{document}%%% Local Variables: %%% mode: latex%%% TeX-master: t%%% End: