cws/cw03.tex
changeset 6 aae256985251
child 62 2151c77e1e24
equal deleted inserted replaced
5:c1e6123d02f4 6:aae256985251
       
     1 \documentclass{article}
       
     2 \usepackage{style}
       
     3 %%\usepackage{../langs}
       
     4 
       
     5 \begin{document}
       
     6 
       
     7 \section*{Coursework 3}
       
     8 
       
     9 This coursework is worth XXX\% and is due on XXXX at
       
    10 16:00. You are asked to implement a regular expression matcher.
       
    11 
       
    12 
       
    13 
       
    14 
       
    15 \subsubsection*{Disclaimer}
       
    16 
       
    17 It should be understood that the work you submit represents
       
    18 your own effort. You have not copied from anyone else. An
       
    19 exception is the Scala code I showed during the lectures or
       
    20 uploaded to KEATS, which you can freely use.\bigskip
       
    21 
       
    22 
       
    23 \subsubsection*{Task}
       
    24 
       
    25 The task is to implement a regular expression matcher based on
       
    26 derivatives of regular expressions. The implementation should
       
    27 be able to deal with the usual (basic) regular expressions
       
    28 
       
    29 \begin{center}
       
    30 \begin{tabular}{lcll}
       
    31   $r$ & $::=$ & $\ZERO$     & cannot match anything\\
       
    32       &   $|$ & $\ONE$      & can only match the empty string\\
       
    33       &   $|$ & $c$         & can match a character $c$\\
       
    34       &   $|$ & $r_1 + r_2$ & can match either with $r_1$ or with $r_2$\\
       
    35       &   $|$ & $r_1 \cdot r_2$ & can match first with $r_1$ and then with $r_2$\\
       
    36       &   $|$ & $r^*$       & can match zero or more times $r$\\
       
    37       &   $|$ & $r^{\{\uparrow n\}}$ & can match zero upto $n$ times $r$\\
       
    38       &   $|$ & $r^{\{n\}}$ & can match exactly $n$ times $r$\\
       
    39 \end{tabular}
       
    40 \end{center}
       
    41 
       
    42 \noindent
       
    43 Implement a function called \textit{nullable} by recursion over
       
    44 regular expressions:
       
    45 
       
    46 \begin{center}
       
    47 \begin{tabular}{lcl}
       
    48 $\textit{nullable}(\ZERO)$ & $\dn$ & $\textit{false}$\\
       
    49 $\textit{nullable}(\ONE)$  & $\dn$ & $\textit{true}$\\
       
    50 $\textit{nullable}(c)$     & $\dn$ & $\textit{false}$\\
       
    51 $\textit{nullable}(r_1 + r_2)$ & $\dn$ & $\textit{nullable}(r_1) \vee \textit{nullable}(r_2)$\\
       
    52 $\textit{nullable}(r_1 \cdot r_2)$ & $\dn$ & $\textit{nullable}(r_1) \wedge \textit{nullable}(r_2)$\\
       
    53 $\textit{nullable}(r^*)$ & $\dn$ & $\textit{true}$\\
       
    54 $\textit{nullable}(r^{\{\uparrow n\}})$ & $\dn$ & $\textit{true}$\\
       
    55 $\textit{nullable}(r^{\{n\}})$ & $\dn$ & 
       
    56    $\textit{if}\;n = 0\; \textit{then} \; \textit{true} \; \textit{else} \; \textit{nullable}(r)$\\
       
    57 \end{tabular}
       
    58 \end{center}
       
    59 
       
    60 \begin{center}
       
    61 \begin{tabular}{lcl}
       
    62 $\textit{der}\;c\;(\ZERO)$ & $\dn$ & $\ZERO$\\
       
    63 $\textit{der}\;c\;(\ONE)$  & $\dn$ & $\ZERO$\\
       
    64 $\textit{der}\;c\;(d)$     & $\dn$ & $\textit{if}\; c = d\;\textit{then} \;\ONE \; \textit{else} \;\ZERO$\\
       
    65 $\textit{der}\;c\;(r_1 + r_2)$ & $\dn$ & $(\textit{der}\;c\;r_1) + (\textit{der}\;c\;r_2)$\\
       
    66 $\textit{der}\;c\;(r_1 \cdot r_2)$ & $\dn$ & $\textit{if}\;\textit{nullable}(r_1)$\\
       
    67       & & $\textit{then}\;((\textit{der}\;c\;r_1)\cdot r_2) + (\textit{der}\;c\;r_2)$\\
       
    68       & & $\textit{else}\;(\textit{der}\;c\;r_1)\cdot r_2$\\
       
    69 $\textit{der}\;c\;(r^*)$ & $\dn$ & $(\textit{der}\;c\;r)\cdot (r^*)$\\
       
    70 $\textit{der}\;c\;(r^{\{\uparrow n\}})$ & $\dn$ & $\textit{if}\;n = 0\;\textit{then}\;\ZERO\;\text{else}\;
       
    71    (\textit{der}\;c\;r)\cdot (r^{\{\uparrow n-1\}})$\\
       
    72 $\textit{der}\;c\;(r^{\{n\}})$ & $\dn$ & 
       
    73    $\textit{if}\;n = 0\; \textit{then} \; \ZERO\; \textit{else}\;$\\
       
    74    & & $\textit{if} \;\textit{nullable}(r)\;\textit{then}\;(\textit{der}\;c\;r)\cdot (r^{\{\uparrow n-1\}})$\\
       
    75    & & $\textit{else}\;(\textit{der}\;c\;r)\cdot (r^{\{n-1\}})$
       
    76 \end{tabular}
       
    77 \end{center}
       
    78 
       
    79 
       
    80 Be careful that your implementation of \textit{nullable} and
       
    81 \textit{der}\;c\; satisfies for every $r$ the following two
       
    82 properties (see also Question 2):
       
    83 
       
    84 \begin{itemize}
       
    85 \item $\textit{nullable}(r)$ if and only if $[]\in L(r)$
       
    86 \item $L(der\,c\,r) = Der\,c\,(L(r))$
       
    87 \end{itemize}
       
    88 
       
    89 \noindent {\bf Important!} Your implementation should have
       
    90 explicit cases for the basic regular expressions, but also
       
    91 explicit cases for the extended regular expressions. That
       
    92 means do not treat the extended regular expressions by just
       
    93 translating them into the basic ones. See also Question 2,
       
    94 where you are asked to explicitly give the rules for
       
    95 \textit{nullable} and \textit{der}\;c\; for the extended regular
       
    96 expressions.
       
    97 
       
    98 
       
    99 \subsection*{Question 1}
       
   100 
       
   101 What is your King's email address (you will need it in
       
   102 Question 3)?
       
   103 
       
   104 \subsection*{Question 2}
       
   105 
       
   106 This question does not require any implementation. From the
       
   107 lectures you have seen the definitions for the functions
       
   108 \textit{nullable} and \textit{der}\;c\; for the basic regular
       
   109 expressions. Give the rules for the extended regular
       
   110 expressions:
       
   111 
       
   112 \begin{center}
       
   113 \begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
       
   114 $\textit{nullable}([c_1 c_2 \ldots c_n])$  & $\dn$ & $?$\\
       
   115 $\textit{nullable}(r^+)$                   & $\dn$ & $?$\\
       
   116 $\textit{nullable}(r^?)$                   & $\dn$ & $?$\\
       
   117 $\textit{nullable}(r^{\{n,m\}})$            & $\dn$ & $?$\\
       
   118 $\textit{nullable}(\sim{}r)$               & $\dn$ & $?$\medskip\\
       
   119 $der\, c\, ([c_1 c_2 \ldots c_n])$  & $\dn$ & $?$\\
       
   120 $der\, c\, (r^+)$                   & $\dn$ & $?$\\
       
   121 $der\, c\, (r^?)$                   & $\dn$ & $?$\\
       
   122 $der\, c\, (r^{\{n,m\}})$            & $\dn$ & $?$\\
       
   123 $der\, c\, (\sim{}r)$               & $\dn$ & $?$\\
       
   124 \end{tabular}
       
   125 \end{center}
       
   126 
       
   127 \noindent
       
   128 Remember your definitions have to satisfy the two properties
       
   129 
       
   130 \begin{itemize}
       
   131 \item $\textit{nullable}(r)$ if and only if $[]\in L(r)$
       
   132 \item $L(der\,c\,r)) = Der\,c\,(L(r))$
       
   133 \end{itemize}
       
   134 
       
   135 \subsection*{Question 3}
       
   136 
       
   137 Implement the following regular expression for email addresses
       
   138 
       
   139 \[
       
   140 ([a\mbox{-}z0\mbox{-}9\_\!\_\,.-]^+)\cdot @\cdot ([a\mbox{-}z0\mbox{-}9\,.-]^+)\cdot .\cdot ([a\mbox{-}z\,.]^{\{2,6\}})
       
   141 \]
       
   142 
       
   143 \noindent and calculate the derivative according to your email
       
   144 address. When calculating the derivative, simplify all regular
       
   145 expressions as much as possible by applying the
       
   146 following 7 simplification rules:
       
   147 
       
   148 \begin{center}
       
   149 \begin{tabular}{l@{\hspace{2mm}}c@{\hspace{2mm}}ll}
       
   150 $r \cdot \ZERO$ & $\mapsto$ & $\ZERO$\\ 
       
   151 $\ZERO \cdot r$ & $\mapsto$ & $\ZERO$\\ 
       
   152 $r \cdot \ONE$ & $\mapsto$ & $r$\\ 
       
   153 $\ONE \cdot r$ & $\mapsto$ & $r$\\ 
       
   154 $r + \ZERO$ & $\mapsto$ & $r$\\ 
       
   155 $\ZERO + r$ & $\mapsto$ & $r$\\ 
       
   156 $r + r$ & $\mapsto$ & $r$\\ 
       
   157 \end{tabular}
       
   158 \end{center}
       
   159 
       
   160 \noindent Write down your simplified derivative in a readable
       
   161 notation using parentheses where necessary. That means you
       
   162 should use the infix notation $+$, $\cdot$, $^*$ and so on,
       
   163 instead of code.
       
   164  
       
   165 \subsection*{Question 4}
       
   166 
       
   167 Suppose \textit{[a-z]} stands for the range regular expression
       
   168 $[a,b,c,\ldots,z]$.  Consider the regular expression $/ \cdot * \cdot
       
   169 (\sim{}([a\mbox{-}z]^* \cdot * \cdot / \cdot [a\mbox{-}z]^*)) \cdot *
       
   170 \cdot /$ and decide wether the following four strings are matched by
       
   171 this regular expression. Answer yes or no.
       
   172 
       
   173 \begin{enumerate}
       
   174 \item \texttt{"/**/"}
       
   175 \item \texttt{"/*foobar*/"}
       
   176 \item \texttt{"/*test*/test*/"}
       
   177 \item \texttt{"/*test/*test*/"}
       
   178 \end{enumerate}
       
   179 
       
   180 \noindent
       
   181 Also test your regular expression matcher with the regular
       
   182 expression $a^{\{3,5\}}$ and the strings
       
   183 
       
   184 \begin{enumerate}
       
   185 \setcounter{enumi}{4}
       
   186 \item \texttt{aa}
       
   187 \item \texttt{aaa}
       
   188 \item \texttt{aaaaa}
       
   189 \item \texttt{aaaaaa}
       
   190 \end{enumerate}
       
   191 
       
   192 \noindent
       
   193 Does your matcher produce the expected results?
       
   194 
       
   195 \subsection*{Question 5}
       
   196 
       
   197 Let $r_1$ be the regular expression $a\cdot a\cdot a$ and $r_2$ be
       
   198 $(a^{\{19,19\}}) \cdot (a^?)$.  Decide whether the following three
       
   199 strings consisting of $a$s only can be matched by $(r_1^+)^+$.
       
   200 Similarly test them with $(r_2^+)^+$. Again answer in all six cases
       
   201 with yes or no. \medskip
       
   202 
       
   203 \noindent
       
   204 These are strings are meant to be entirely made up of $a$s. Be careful
       
   205 when copy-and-pasting the strings so as to not forgetting any $a$ and
       
   206 to not introducing any other character.
       
   207 
       
   208 \begin{enumerate}
       
   209 \item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\
       
   210 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\
       
   211 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}
       
   212 \item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ 
       
   213 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ 
       
   214 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}
       
   215 \item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ 
       
   216 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ 
       
   217 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}
       
   218 \end{enumerate}
       
   219 
       
   220 
       
   221 \end{document}
       
   222 
       
   223 %%% Local Variables: 
       
   224 %%% mode: latex
       
   225 %%% TeX-master: t
       
   226 %%% End: