coursework/cw01.tex
changeset 418 010c5a03dca2
parent 395 e57d3d92b856
child 439 7611ace6a93b
equal deleted inserted replaced
417:e74c696821a2 418:010c5a03dca2
     4 
     4 
     5 \begin{document}
     5 \begin{document}
     6 
     6 
     7 \section*{Coursework 1 (Strand 1)}
     7 \section*{Coursework 1 (Strand 1)}
     8 
     8 
     9 This coursework is worth 5\% and is due on 20 October at
     9 This coursework is worth 4\% and is due on 20 October at
    10 16:00. You are asked to implement a regular expression matcher
    10 16:00. You are asked to implement a regular expression matcher
    11 and submit a document containing the answers for the questions
    11 and submit a document containing the answers for the questions
    12 below. You can do the implementation in any programming
    12 below. You can do the implementation in any programming
    13 language you like, but you need to submit the source code with
    13 language you like, but you need to submit the source code with
    14 which you answered the questions, otherwise a mark of 0\% will
    14 which you answered the questions, otherwise a mark of 0\% will
    15 be awarded. You can submit your answers in a txt-file or pdf.
    15 be awarded. You can submit your answers in a txt-file or pdf.
       
    16 Code send as code.
    16 
    17 
    17 
    18 
    18 
    19 
    19 \subsubsection*{Disclaimer}
    20 \subsubsection*{Disclaimer}
    20 
    21 
    21 It should be understood that the work you submit represents
    22 It should be understood that the work you submit represents
    22 your own effort. You have not copied from anyone else. An
    23 your own effort. You have not copied from anyone else. An
    23 exception is the Scala code I showed during the lectures or
    24 exception is the Scala code I showed during the lectures or
    24 uploaded to KEATS, which you can use.\bigskip
    25 uploaded to KEATS, which you can freely use.\bigskip
    25 
    26 
    26 
    27 
    27 \subsubsection*{Tasks}
    28 \subsubsection*{Task}
    28 
    29 
    29 The task is to implement a regular expression matcher based on
    30 The task is to implement a regular expression matcher based on
    30 derivatives. The implementation should be able to deal with the usual
    31 derivatives of regular expressions. The implementation should
    31 (basic) regular expressions
    32 be able to deal with the usual (basic) regular expressions
    32 
    33 
    33 \[
    34 \[
    34 \varnothing, \epsilon, c, r_1 + r_2, r_1 \cdot r_2, r^*
    35 \ZERO,\; \ONE,\; c,\; r_1 + r_2,\; r_1 \cdot r_2,\; r^*
    35 \]
    36 \]
    36 
    37 
    37 \noindent
    38 \noindent
    38 but also with the following extended regular expressions:
    39 but also with the following extended regular expressions:
    39 
    40 
    45 $r^{\{n,m\}}$ & at least $n$-times $r$ but no more than $m$-times\\
    46 $r^{\{n,m\}}$ & at least $n$-times $r$ but no more than $m$-times\\
    46 $\sim{}r$ & not-regular expression of $r$\\
    47 $\sim{}r$ & not-regular expression of $r$\\
    47 \end{tabular}
    48 \end{tabular}
    48 \end{center}
    49 \end{center}
    49 
    50 
    50 \noindent In the case of $r^{\{n,m\}}$ we have the convention
    51 \noindent In the case of $r^{\{n,m\}}$ you can assume the
    51 that $0 \le n \le m$. The meanings of the extended regular
    52 convention that $0 \le n \le m$. The meanings of the extended
    52 expressions are
    53 regular expressions are
    53 
    54 
    54 \begin{center}
    55 \begin{center}
    55 \begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l}
    56 \begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l}
    56 $L([c_1 c_2 \ldots c_n])$ & $\dn$ & $\{[c_1], [c_2], \ldots, [c_n]\}$\\ 
    57 $L([c_1 c_2 \ldots c_n])$ & $\dn$ & $\{[c_1], [c_2], \ldots, [c_n]\}$\\ 
    57 $L(r^+)$                  & $\dn$ & $\bigcup_{1\le i}. L(r)^i$\\
    58 $L(r^+)$                  & $\dn$ & $\bigcup_{1\le i}. L(r)^i$\\
    62 \end{center}
    63 \end{center}
    63 
    64 
    64 \noindent whereby in the last clause the set $\Sigma^*$ stands
    65 \noindent whereby in the last clause the set $\Sigma^*$ stands
    65 for the set of \emph{all} strings over the alphabet $\Sigma$
    66 for the set of \emph{all} strings over the alphabet $\Sigma$
    66 (in the implementation the alphabet can be just what is
    67 (in the implementation the alphabet can be just what is
    67 represented by, say, the type \pcode{char}). So $\sim{}r$
    68 represented by, say, the type \pcode{Char}). So $\sim{}r$
    68 means `all the strings that $r$ cannot match'. 
    69 means `all the strings that $r$ cannot match'. 
    69 
    70 
    70 Be careful that your implementation of \textit{nullable} and
    71 Be careful that your implementation of \textit{nullable} and
    71 \textit{der} satisfies for every $r$ the following two
    72 \textit{der} satisfies for every $r$ the following two
    72 properties (see also Question 2):
    73 properties (see also Question 2):
    84 where you are asked to explicitly give the rules for
    85 where you are asked to explicitly give the rules for
    85 \textit{nullable} and \textit{der} for the extended regular
    86 \textit{nullable} and \textit{der} for the extended regular
    86 expressions.
    87 expressions.
    87 
    88 
    88 
    89 
    89 \subsection*{Question 1 (unmarked)}
    90 \subsection*{Question 1}
    90 
    91 
    91 What is your King's email address (you will need it in
    92 What is your King's email address (you will need it in
    92 Question 3)?
    93 Question 3)?
    93 
    94 
    94 \subsection*{Question 2 (marked with 2\%)}
    95 \subsection*{Question 2}
    95 
    96 
    96 This question does not require any implementation. From the
    97 This question does not require any implementation. From the
    97 lectures you have seen the definitions for the functions
    98 lectures you have seen the definitions for the functions
    98 \textit{nullable} and \textit{der} for the basic regular
    99 \textit{nullable} and \textit{der} for the basic regular
    99 expressions. Give the rules for the extended regular
   100 expressions. Give the rules for the extended regular
   120 \begin{itemize}
   121 \begin{itemize}
   121 \item $\textit{nullable}(r)$ if and only if $[]\in L(r)$
   122 \item $\textit{nullable}(r)$ if and only if $[]\in L(r)$
   122 \item $L(der\,c\,r)) = Der\,c\,(L(r))$
   123 \item $L(der\,c\,r)) = Der\,c\,(L(r))$
   123 \end{itemize}
   124 \end{itemize}
   124 
   125 
   125 \subsection*{Question 3 (marked with 1\%)}
   126 \subsection*{Question 3}
   126 
   127 
   127 Implement the following regular expression for email addresses
   128 Implement the following regular expression for email addresses
   128 
   129 
   129 \[
   130 \[
   130 ([a\mbox{-}z0\mbox{-}9\_\!\_\,.-]^+)\cdot @\cdot ([a\mbox{-}z0\mbox{-}9\,.-]^+)\cdot .\cdot ([a\mbox{-}z\,.]^{\{2,6\}})
   131 ([a\mbox{-}z0\mbox{-}9\_\!\_\,.-]^+)\cdot @\cdot ([a\mbox{-}z0\mbox{-}9\,.-]^+)\cdot .\cdot ([a\mbox{-}z\,.]^{\{2,6\}})
   131 \]
   132 \]
   132 
   133 
   133 \noindent and calculate the derivative according to your email
   134 \noindent and calculate the derivative according to your email
   134 address. When calculating the derivative, simplify all regular
   135 address. When calculating the derivative, simplify all regular
   135 expressions as much as possible, but at least apply the
   136 expressions as much as possible by applying the
   136 following six simplification rules:
   137 following 7 simplification rules:
   137 
   138 
   138 \begin{center}
   139 \begin{center}
   139 \begin{tabular}{l@{\hspace{2mm}}c@{\hspace{2mm}}ll}
   140 \begin{tabular}{l@{\hspace{2mm}}c@{\hspace{2mm}}ll}
   140 $r \cdot \varnothing$ & $\mapsto$ & $\varnothing$\\ 
   141 $r \cdot \varnothing$ & $\mapsto$ & $\varnothing$\\ 
   141 $\varnothing \cdot r$ & $\mapsto$ & $\varnothing$\\ 
   142 $\varnothing \cdot r$ & $\mapsto$ & $\varnothing$\\ 
   145 $\varnothing + r$ & $\mapsto$ & $r$\\ 
   146 $\varnothing + r$ & $\mapsto$ & $r$\\ 
   146 $r + r$ & $\mapsto$ & $r$\\ 
   147 $r + r$ & $\mapsto$ & $r$\\ 
   147 \end{tabular}
   148 \end{tabular}
   148 \end{center}
   149 \end{center}
   149 
   150 
   150 \noindent
   151 \noindent Write down your simplified derivative in a readable
   151 Write down your simplified derivative in the ``mathematicical''
   152 notation using parentheses where necessary. That means you
   152 notation using parentheses where necessary. That means you should
   153 should use the infix notation $+$, $\cdot$, $^*$ and so on,
   153 use the more readable infix notation $+$, $\cdot$ and $^*$, 
       
   154 instead of code.
   154 instead of code.
   155  
   155  
   156 \subsection*{Question 4 (marked with 1\%)}
   156 \subsection*{Question 4}
   157 
   157 
   158 Suppose \textit{[a-z]} stands for the range regular expression
   158 Suppose \textit{[a-z]} stands for the range regular expression
   159 $[a,b,c,\ldots,z]$.  Consider the regular expression $/ \cdot * \cdot
   159 $[a,b,c,\ldots,z]$.  Consider the regular expression $/ \cdot * \cdot
   160 (\sim{}([a\mbox{-}z]^* \cdot * \cdot / \cdot [a\mbox{-}z]^*)) \cdot *
   160 (\sim{}([a\mbox{-}z]^* \cdot * \cdot / \cdot [a\mbox{-}z]^*)) \cdot *
   161 \cdot /$ and decide wether the following four strings are matched by
   161 \cdot /$ and decide wether the following four strings are matched by
   166 \item \texttt{"/*foobar*/"}
   166 \item \texttt{"/*foobar*/"}
   167 \item \texttt{"/*test*/test*/"}
   167 \item \texttt{"/*test*/test*/"}
   168 \item \texttt{"/*test/*test*/"}
   168 \item \texttt{"/*test/*test*/"}
   169 \end{enumerate}
   169 \end{enumerate}
   170 
   170 
   171 \subsection*{Question 5 (marked with 1\%)}
   171 \noindent
       
   172 Also test your regular expression matcher with the regular
       
   173 expression $a^{\{3,5\}}$ and the strings
       
   174 
       
   175 \begin{enumerate}
       
   176 \setcounter{enumi}{4}
       
   177 \item \texttt{aa}
       
   178 \item \texttt{aaa}
       
   179 \item \texttt{aaaaa}
       
   180 \item \texttt{aaaaaa}
       
   181 \end{enumerate}
       
   182 
       
   183 \noindent
       
   184 Does your matcher produce the expected results?
       
   185 
       
   186 \subsection*{Question 5}
   172 
   187 
   173 Let $r_1$ be the regular expression $a\cdot a\cdot a$ and $r_2$ be
   188 Let $r_1$ be the regular expression $a\cdot a\cdot a$ and $r_2$ be
   174 $(a^{\{19,19\}}) \cdot (a^?)$.  Decide whether the following three
   189 $(a^{\{19,19\}}) \cdot (a^?)$.  Decide whether the following three
   175 strings consisting of $a$s only can be matched by $(r_1^+)^+$.
   190 strings consisting of $a$s only can be matched by $(r_1^+)^+$.
   176 Similarly test them with $(r_2^+)^+$. Again answer in all six cases
   191 Similarly test them with $(r_2^+)^+$. Again answer in all six cases