coursework/cw01.tex
changeset 259 e5f4b8ff23b8
parent 253 75c469893514
child 260 65d1ea0e989f
equal deleted inserted replaced
258:1e4da6d2490c 259:e5f4b8ff23b8
     4 
     4 
     5 \begin{document}
     5 \begin{document}
     6 
     6 
     7 \section*{Coursework 1}
     7 \section*{Coursework 1}
     8 
     8 
     9 This coursework is worth 3\% and is due on 12 November at 16:00. You are asked to implement 
     9 This coursework is worth 5\% and is due on 13 October at 16:00. You
    10 a regular expression matcher and submit a document containing the answers for the questions 
    10 are asked to implement a regular expression matcher and submit a
    11 below. You can do the implementation in any programming language you like, but you need 
    11 document containing the answers for the questions below. You can do
    12 to submit the source code with which you answered the questions. However, the coursework 
    12 the implementation in any programming language you like, but you need
    13 will \emph{only} be judged according to the answers. You can submit your answers
    13 to submit the source code with which you answered the
    14 in a txt-file or pdf.\bigskip
    14 questions. However, the coursework will \emph{only} be judged
       
    15 according to the answers. You can submit your answers in a txt-file or
       
    16 pdf.
    15 
    17 
    16 \noindent
    18 \subsubsection*{Disclaimer}
    17 The task is to implement a regular expression matcher based on derivatives. The implementation 
    19 
    18 should be able to deal with the usual regular expressions
    20 It should be understood that the work submitted represents your own effort. 
       
    21 You have not copied from anyone else. An exception is the Scala code I
       
    22 showed during the lectures, which you can use.\bigskip
       
    23 
       
    24 
       
    25 \subsubsection*{Tasks}
       
    26 
       
    27 The task is to implement a regular expression matcher based on
       
    28 derivatives. The implementation should be able to deal with the usual
       
    29 (basic) regular expressions
    19 
    30 
    20 \[
    31 \[
    21 \varnothing, \epsilon, c, r_1 + r_2, r_1 \cdot r_2, r^*
    32 \varnothing, \epsilon, c, r_1 + r_2, r_1 \cdot r_2, r^*
    22 \]
    33 \]
    23 
    34 
    24 \noindent
    35 \noindent
    25 but also with
    36 but also with the following extended regular expressions:
    26 
    37 
    27 \begin{center}
    38 \begin{center}
    28 \begin{tabular}{ll}
    39 \begin{tabular}{ll}
    29 $[c_1 c_2 \ldots c_n]$ & a range of characters\\
    40 $[c_1 c_2 \ldots c_n]$ & a range of characters\\
    30 $r^+$ & one or more times $r$\\
    41 $r^+$ & one or more times $r$\\
    38 In the case of $r^{\{n,m\}}$ we have the convention that $0 \le n \le m$.
    49 In the case of $r^{\{n,m\}}$ we have the convention that $0 \le n \le m$.
    39 The meaning of these regular expressions is
    50 The meaning of these regular expressions is
    40 
    51 
    41 \begin{center}
    52 \begin{center}
    42 \begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l}
    53 \begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l}
    43 $L([c_1 c_2 \ldots c_n])$ & $\dn$ & $\{"c_1", "c_2", \ldots, "c_n"\}$\\ 
    54 $L([c_1 c_2 \ldots c_n])$ & $\dn$ & $\{[c_1], [c_2], \ldots, [c_n]\}$\\ 
    44 $L(r^+)$            & $\dn$ & $\bigcup_{1\le i}. L(r)^i$\\
    55 $L(r^+)$                  & $\dn$ & $\bigcup_{1\le i}. L(r)^i$\\
    45 $L(r^?)$            & $\dn$ & $L(r) \cup \{""\}$\\
    56 $L(r^?)$                  & $\dn$ & $L(r) \cup \{[]\}$\\
    46 $L(r^{\{n,m\}})$ & $\dn$ & $\bigcup_{n\le i \le m}. L(r)^i$\\
    57 $L(r^{\{n,m\}})$           & $\dn$ & $\bigcup_{n\le i \le m}. L(r)^i$\\
    47 $L(\sim{}r)$       & $\dn$ & $UNIV - L(r)$
    58 $L(\sim{}r)$              & $\dn$ & $\mathbb{A} - L(r)$
    48 \end{tabular}
    59 \end{tabular}
    49 \end{center}
    60 \end{center}
    50 
    61 
    51 \noindent
    62 \noindent
    52 whereby in the last clause the set $UNIV$ stands for the set of \emph{all} strings.
    63 whereby in the last clause the set $\mathbb{A}$ stands for the set of
    53 So $\sim{}r$ means `all the strings that $r$ cannot match'. We assume ranges 
    64 \emph{all} strings.  So $\sim{}r$ means `all the strings that $r$
    54 like $[a\mbox{-}z0\mbox{-}9]$ are a shorthand for the regular expression
    65 cannot match'. We assume ranges like $[a\mbox{-}z0\mbox{-}9]$ are a
       
    66 shorthand for the regular expression
    55 
    67 
    56 \[
    68 \[
    57 [a b c d\ldots z 0 1\ldots 9]\;.
    69 [a b c d\ldots z 0 1\ldots 9]\;.
    58 \]
    70 \]
    59 
    71 
    60 \noindent 
    72 \noindent 
    61 Be careful that your implementation of $nullable$ and $der$ satisfies for every $r$ the following two
    73 Be careful that your implementation of $nullable$ and $der$ satisfies
    62 properties:
    74 for every $r$ the following two properties:
    63 
    75 
    64 \begin{itemize}
    76 \begin{itemize}
    65 \item $nullable(r)$ if and only if $""\in L(r)$
    77 \item $nullable(r)$ if and only if $[]\in L(r)$
    66 \item $L(der\,c\,r)) = Der\,c\,(L(r))$
    78 \item $L(der\,c\,r)) = Der\,c\,(L(r))$
    67 \end{itemize}
    79 \end{itemize}
    68 \newpage
    80 
       
    81 \noindent
       
    82 {\bf Important!} Your implementation should have explicit cases for the
       
    83 basic regular expressions, but also for the extended regular expressions.
       
    84 That means do not treat the extended regular expressions by just translating 
       
    85 them into the basic ones. See also Question 2, where you asked to give
       
    86 the rules for \textit{nullable} and \textit{der}.
       
    87 
    69 
    88 
    70 \subsection*{Question 1 (unmarked)}
    89 \subsection*{Question 1 (unmarked)}
    71 
    90 
    72 What is your King's email address (you will need it in the next question)?\bigskip 
    91 What is your King's email address (you will need it in Question 2)?
    73 
    92 
    74 \subsection*{Question 2 (marked with 1\%)}
    93 \subsection*{Question 2 (marked with 2\%)}
       
    94 
       
    95 This question does not require any implementation. From the lectures
       
    96 you have seen the definitions for the functions \textit{nullable} and
       
    97 \textit{der}.  Give the rules for the extended regular expressions:
       
    98 
       
    99 \begin{center}
       
   100 \begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
       
   101 $nullable([c_1 c_2 \ldots c_n])$  & $\dn$ & $?$\\
       
   102 $nullable(r^+)$                   & $\dn$ & $?$\\
       
   103 $nullable(r^?)$                   & $\dn$ & $?$\\
       
   104 $nullable(r^{\{n,m\}})$            & $\dn$ & $?$\\
       
   105 $nullable(\sim{}r)$               & $\dn$ & $?$\medskip\\
       
   106 $der c ([c_1 c_2 \ldots c_n])$  & $\dn$ & $?$\\
       
   107 $der c (r^+)$                   & $\dn$ & $?$\\
       
   108 $der c (r^?)$                   & $\dn$ & $?$\\
       
   109 $der c (r^{\{n,m\}})$            & $\dn$ & $?$\\
       
   110 $der c (\sim{}r)$               & $\dn$ & $?$\\
       
   111 \end{tabular}
       
   112 \end{center}
       
   113 
       
   114 \subsection*{Question 3 (marked with 1\%)}
    75 
   115 
    76 Implement the following regular expression for email addresses
   116 Implement the following regular expression for email addresses
    77 
   117 
    78 \[
   118 \[
    79 ([a\mbox{-}z0\mbox{-}9\_\!\_\,.-]^+)\cdot @\cdot ([a\mbox{-}z0\mbox{-}9\,.-]^+)\cdot .\cdot ([a\mbox{-}z\,.]^{\{2,6\}})
   119 ([a\mbox{-}z0\mbox{-}9\_\!\_\,.-]^+)\cdot @\cdot ([a\mbox{-}z0\mbox{-}9\,.-]^+)\cdot .\cdot ([a\mbox{-}z\,.]^{\{2,6\}})
    80 \]
   120 \]
    81 
   121 
    82 \noindent
   122 \noindent
    83 and calculate the derivative according to your email address. When calculating
   123 and calculate the derivative according to your email address. When
    84 the derivative, simplify all regular expressions as much as possible, but at least apply the following 
   124 calculating the derivative, simplify all regular expressions as much
    85 six simplification rules:
   125 as possible, but at least apply the following six simplification
       
   126 rules:
    86 
   127 
    87 \begin{center}
   128 \begin{center}
    88 \begin{tabular}{l@{\hspace{2mm}}c@{\hspace{2mm}}l}
   129 \begin{tabular}{l@{\hspace{2mm}}c@{\hspace{2mm}}l}
    89 $r \cdot \varnothing$ & $\mapsto$ & $\varnothing$\\ 
   130 $r \cdot \varnothing$ & $\mapsto$ & $\varnothing$\\ 
    90 $\varnothing \cdot r$ & $\mapsto$ & $\varnothing$\\ 
   131 $\varnothing \cdot r$ & $\mapsto$ & $\varnothing$\\ 
    94 $\varnothing + r$ & $\mapsto$ & $r$\\ 
   135 $\varnothing + r$ & $\mapsto$ & $r$\\ 
    95 \end{tabular}
   136 \end{tabular}
    96 \end{center}
   137 \end{center}
    97 
   138 
    98 \noindent
   139 \noindent
    99 Write down your simplified derivative in the ``mathematicical'' notation using parentheses where necessary.
   140 Write down your simplified derivative in the ``mathematicical''
       
   141 notation using parentheses where necessary.
   100 
   142 
   101 \subsection*{Question 3 (marked with 1\%)}
   143 \subsection*{Question 4 (marked with 1\%)}
   102 
   144 
   103 Consider the regular expression $/ \cdot * \cdot (\sim{}([a\mbox{-}z]^* \cdot * \cdot / \cdot [a\mbox{-}z]^*)) \cdot * \cdot /$ and decide
   145 Consider the regular expression $/ \cdot * \cdot
   104 wether the following four strings are matched by this regular expression. Answer yes or no.
   146 (\sim{}([a\mbox{-}z]^* \cdot * \cdot / \cdot [a\mbox{-}z]^*)) \cdot *
       
   147 \cdot /$ and decide wether the following four strings are matched by
       
   148 this regular expression. Answer yes or no.
   105 
   149 
   106 \begin{enumerate}
   150 \begin{enumerate}
   107 \item \texttt{"/**/"}
   151 \item \texttt{"/**/"}
   108 \item \texttt{"/*foobar*/"}
   152 \item \texttt{"/*foobar*/"}
   109 \item \texttt{"/*test*/test*/"}
   153 \item \texttt{"/*test*/test*/"}
   110 \item \texttt{"/*test/*test*/"}
   154 \item \texttt{"/*test/*test*/"}
   111 \end{enumerate}
   155 \end{enumerate}
   112 
   156 
   113 \subsection*{Question 4 (marked with 1\%)}
   157 \subsection*{Question 5 (marked with 1\%)}
   114 
   158 
   115 Let $r_1$ be the regular expression $a\cdot a\cdot a$ and $r_2$ be $(a^{\{19,19\}}) \cdot (a^?)$.
   159 Let $r_1$ be the regular expression $a\cdot a\cdot a$ and $r_2$ be
   116 Decide whether the following three strings consisting of $a$s only can be matched by $(r_1^+)^+$. 
   160 $(a^{\{19,19\}}) \cdot (a^?)$.  Decide whether the following three
   117 Similarly test them with $(r_2^+)^+$. Again answer in all six cases with yes or no. \medskip
   161 strings consisting of $a$s only can be matched by $(r_1^+)^+$.
       
   162 Similarly test them with $(r_2^+)^+$. Again answer in all six cases
       
   163 with yes or no. \medskip
   118 
   164 
   119 \noindent
   165 \noindent
   120 These are strings are meant to be entirely made up of $a$s. Be careful when 
   166 These are strings are meant to be entirely made up of $a$s. Be careful
   121 copy-and-pasting the strings so as to not forgetting any $a$ and to not introducing any
   167 when copy-and-pasting the strings so as to not forgetting any $a$ and
   122 other character.
   168 to not introducing any other character.
   123 
   169 
   124 \begin{enumerate}
   170 \begin{enumerate}
   125 \item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\
   171 \item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\
   126 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\
   172 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\
   127 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}
   173 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}