\documentclass{article}\usepackage{../style}\usepackage{../langs}\begin{document}\section*{Coursework 1 (Strand 1)}This coursework is worth 4\% and is due on 25 October at16:00. You are asked to implement a regular expression matcherand submit a document containing the answers for the questionsbelow. You can do the implementation in any programminglanguage you like, but you need to submit the source code withwhich you answered the questions, otherwise a mark of 0\% willbe awarded. You can submit your answers in a txt-file or pdf.Code send as code.\subsubsection*{Disclaimer}It should be understood that the work you submit representsyour own effort. You have not copied from anyone else. Anexception is the Scala code I showed during the lectures oruploaded to KEATS, which you can freely use.\bigskip\subsubsection*{Task}The task is to implement a regular expression matcher based onderivatives of regular expressions. The implementation shouldbe able to deal with the usual (basic) regular expressions\[\ZERO,\; \ONE,\; c,\; r_1 + r_2,\; r_1 \cdot r_2,\; r^*\]\noindentbut also with the following extended regular expressions:\begin{center}\begin{tabular}{ll}$[c_1 c_2 \ldots c_n]$ & a range of characters\\$r^+$ & one or more times $r$\\$r^?$ & optional $r$\\$r^{\{n,m\}}$ & at least $n$-times $r$ but no more than $m$-times\\$\sim{}r$ & not-regular expression of $r$\\\end{tabular}\end{center}\noindent In the case of $r^{\{n,m\}}$ you can assume theconvention that $0 \le n \le m$. The meanings of the extendedregular expressions are\begin{center}\begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l}$L([c_1 c_2 \ldots c_n])$ & $\dn$ & $\{[c_1], [c_2], \ldots, [c_n]\}$\\ $L(r^+)$ & $\dn$ & $\bigcup_{1\le i}. L(r)^i$\\$L(r^?)$ & $\dn$ & $L(r) \cup \{[]\}$\\$L(r^{\{n,m\}})$ & $\dn$ & $\bigcup_{n\le i \le m}. L(r)^i$\\$L(\sim{}r)$ & $\dn$ & $\Sigma^* - L(r)$\end{tabular}\end{center}\noindent whereby in the last clause the set $\Sigma^*$ standsfor the set of \emph{all} strings over the alphabet $\Sigma$(in the implementation the alphabet can be just what isrepresented by, say, the type \pcode{Char}). So $\sim{}r$means `all the strings that $r$ cannot match'. Be careful that your implementation of \textit{nullable} and\textit{der} satisfies for every $r$ the following twoproperties (see also Question 2):\begin{itemize}\item $\textit{nullable}(r)$ if and only if $[]\in L(r)$\item $L(der\,c\,r) = Der\,c\,(L(r))$\end{itemize}\noindent {\bf Important!} Your implementation should haveexplicit cases for the basic regular expressions, but alsoexplicit cases for the extended regular expressions. Thatmeans do not treat the extended regular expressions by justtranslating them into the basic ones. See also Question 2,where you are asked to explicitly give the rules for\textit{nullable} and \textit{der} for the extended regularexpressions.\subsection*{Question 1}What is your King's email address (you will need it inQuestion 3)?\subsection*{Question 2}From thelectures you have seen the definitions for the functions\textit{nullable} and \textit{der} for the basic regularexpressions. Implement the rules for the extended regularexpressions:\begin{center}\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}$\textit{nullable}([c_1 c_2 \ldots c_n])$ & $\dn$ & $?$\\$\textit{nullable}(r^+)$ & $\dn$ & $?$\\$\textit{nullable}(r^?)$ & $\dn$ & $?$\\$\textit{nullable}(r^{\{n,m\}})$ & $\dn$ & $?$\\$\textit{nullable}(\sim{}r)$ & $\dn$ & $?$\medskip\\$der\, c\, ([c_1 c_2 \ldots c_n])$ & $\dn$ & $?$\\$der\, c\, (r^+)$ & $\dn$ & $?$\\$der\, c\, (r^?)$ & $\dn$ & $?$\\$der\, c\, (r^{\{n,m\}})$ & $\dn$ & $?$\\$der\, c\, (\sim{}r)$ & $\dn$ & $?$\\\end{tabular}\end{center}\noindentRemember your definitions have to satisfy the two properties\begin{itemize}\item $\textit{nullable}(r)$ if and only if $[]\in L(r)$\item $L(der\,c\,r)) = Der\,c\,(L(r))$\end{itemize}\noindentGive the implementation and the text-version of the clauses above.\subsection*{Question 3}Implement the following regular expression for email addresses\[([a\mbox{-}z0\mbox{-}9\_\!\_\,.-]^+)\cdot @\cdot ([a\mbox{-}z0\mbox{-}9\,.-]^+)\cdot .\cdot ([a\mbox{-}z\,.]^{\{2,6\}})\]\noindent and calculate the derivative according to your emailaddress. When calculating the derivative, simplify all regularexpressions as much as possible by applying thefollowing 7 simplification rules:\begin{center}\begin{tabular}{l@{\hspace{2mm}}c@{\hspace{2mm}}ll}$r \cdot \ZERO$ & $\mapsto$ & $\ZERO$\\ $\ZERO \cdot r$ & $\mapsto$ & $\ZERO$\\ $r \cdot \ONE$ & $\mapsto$ & $r$\\ $\ONE \cdot r$ & $\mapsto$ & $r$\\ $r + \ZERO$ & $\mapsto$ & $r$\\ $\ZERO + r$ & $\mapsto$ & $r$\\ $r + r$ & $\mapsto$ & $r$\\ \end{tabular}\end{center}\noindent Write down your simplified derivative in a readablenotation using parentheses where necessary. That means youshould use the infix notation $+$, $\cdot$, $^*$ and so on,instead of code.\subsection*{Question 4}Suppose \textit{[a-z]} stands for the range regular expression$[a,b,c,\ldots,z]$. Consider the regular expression $/ \cdot * \cdot(\sim{}([a\mbox{-}z]^* \cdot * \cdot / \cdot [a\mbox{-}z]^*)) \cdot *\cdot /$ and decide wether the following four strings are matched bythis regular expression. Answer yes or no.\begin{enumerate}\item \texttt{"/**/"}\item \texttt{"/*foobar*/"}\item \texttt{"/*test*/test*/"}\item \texttt{"/*test/*test*/"}\end{enumerate}\noindentAlso test your regular expression matcher with the regularexpressions $a^{\{3,5\}}$ and $(a^?)^{\{3,5\}}$. Test whether thestrings\begin{enumerate}\setcounter{enumi}{4}\item \texttt{aa}\item \texttt{aaa}\item \texttt{aaaaa}\item \texttt{aaaaaa}\end{enumerate}\noindentare matched or not. Does your matcher produce the expected results?\subsection*{Question 5}Let $r_1$ be the regular expression $a\cdot a\cdot a$ and $r_2$ be$(a^{\{19,19\}}) \cdot (a^?)$. Decide whether the following threestrings consisting of $a$s only can be matched by $(r_1^+)^+$.Similarly test them with $(r_2^+)^+$. Again answer in all six caseswith yes or no. \medskip\noindentThese are strings are meant to be entirely made up of $a$s. Be carefulwhen copy-and-pasting the strings so as to not forgetting any $a$ andto not introducing any other character.\begin{enumerate}\item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}\item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}\item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}\end{enumerate}\end{document}%%% Local Variables: %%% mode: latex%%% TeX-master: t%%% End: