9 \newcolumntype{C}[1]{>{\centering}m{#1}} |
9 \newcolumntype{C}[1]{>{\centering}m{#1}} |
10 |
10 |
11 \section*{Coursework 1 (Strand 1)} |
11 \section*{Coursework 1 (Strand 1)} |
12 |
12 |
13 This coursework is worth 4\% and is due on 12 October at |
13 This coursework is worth 4\% and is due on 12 October at |
14 16:00. You are asked to implement a regular expression matcher |
14 18:00. You are asked to implement a regular expression matcher |
15 and submit a document containing the answers for the questions |
15 and submit a document containing the answers for the questions |
16 below. You can do the implementation in any programming |
16 below. You can do the implementation in any programming |
17 language you like, but you need to submit the source code with |
17 language you like, but you need to submit the source code with |
18 which you answered the questions, otherwise a mark of 0\% will |
18 which you answered the questions, otherwise a mark of 0\% will |
19 be awarded. You can submit your answers in a txt-file or pdf. |
19 be awarded. You can submit your answers in a txt-file or pdf. |
60 |
60 |
61 \noindent You can assume that $n$ and $m$ are greater or equal than |
61 \noindent You can assume that $n$ and $m$ are greater or equal than |
62 $0$. In the case of $r^{\{n,m\}}$ you can also assume $0 \le n \le m$.\bigskip |
62 $0$. In the case of $r^{\{n,m\}}$ you can also assume $0 \le n \le m$.\bigskip |
63 |
63 |
64 \noindent {\bf Important!} Your implementation should have explicit |
64 \noindent {\bf Important!} Your implementation should have explicit |
65 cases for the basic regular expressions, but also for explicit cases for |
65 case classes for the basic regular expressions, but also explicit case |
66 the extended regular expressions. That means do not treat the extended |
66 classes for |
67 regular expressions by just translating them into the basic ones. See |
67 the extended regular expressions.\footnote{Please call them |
68 also Question 3, where you are asked to explicitly give the rules for |
68 \code{RANGE}, \code{PLUS}, \code{OPTIONAL}, \code{NTIMES}, |
69 \textit{nullable} and \textit{der} for the extended regular |
69 \code{UPTO}, \code{FROM}, \code{BETWEEN}, \code{NOT} or something |
70 expressions.\newpage |
70 like that.} That means do not treat the extended regular expressions |
|
71 by just translating them into the basic ones. See also Question 3, |
|
72 where you are asked to explicitly give the rules for \textit{nullable} |
|
73 and \textit{der} for the extended regular expressions.\medskip |
71 |
74 |
72 \noindent |
75 \noindent |
73 The meanings of the extended regular expressions are |
76 The meanings of the extended regular expressions are |
74 |
77 |
75 \begin{center} |
78 \begin{center} |
108 What is your King's email address (you will need it in |
111 What is your King's email address (you will need it in |
109 Question 5)? |
112 Question 5)? |
110 |
113 |
111 \subsection*{Question 2 (Unmarked)} |
114 \subsection*{Question 2 (Unmarked)} |
112 |
115 |
113 In which programming languages have you written a program (like spent |
116 Can you please list all programming languages in which you have |
114 at least a day working on the program)? |
117 already written programs (like spent at least a good working day |
|
118 working on the program)? This is just for my curiosity to estimate |
|
119 what your background is. |
115 |
120 |
116 \subsection*{Question 3} |
121 \subsection*{Question 3} |
117 |
122 |
118 From the |
123 From the |
119 lectures you have seen the definitions for the functions |
124 lectures you have seen the definitions for the functions |
120 \textit{nullable} and \textit{der} for the basic regular |
125 \textit{nullable} and \textit{der} for the basic regular |
121 expressions. Implement the rules for the extended regular |
126 expressions. Implement and write down rules for the extended |
122 expressions: |
127 regular expressions: |
123 |
128 |
124 \begin{center} |
129 \begin{center} |
125 \begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}} |
130 \begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}} |
126 $\textit{nullable}([c_1,c_2,\ldots,c_n])$ & $\dn$ & $?$\\ |
131 $\textit{nullable}([c_1,c_2,\ldots,c_n])$ & $\dn$ & $?$\\ |
127 $\textit{nullable}(r^+)$ & $\dn$ & $?$\\ |
132 $\textit{nullable}(r^+)$ & $\dn$ & $?$\\ |
190 $\textit{ALL}$ & matches any character |
195 $\textit{ALL}$ & matches any character |
191 \end{tabular} |
196 \end{tabular} |
192 \end{center} |
197 \end{center} |
193 |
198 |
194 \noindent |
199 \noindent |
195 the latter is useful for matching any string (for example |
200 The latter is useful for matching any string (for example |
196 by using $\textit{ALL}^*$). In order to avoid having an explicit constructor |
201 by using $\textit{ALL}^*$). In order to avoid having an explicit constructor |
197 for each case, we can generalise all these cases and introduce a single |
202 for each case, we can generalise all these cases and introduce a single |
198 constructor $\textit{CFUN}(f)$ where $f$ is a function from characters |
203 constructor $\textit{CFUN}(f)$ where $f$ is a function from characters |
199 to a boolean. The idea is that the function $f$ determines which character(s) |
204 to booleans. The idea is that the function $f$ determines which character(s) |
200 are matched, namely those where $f$ returns \texttt{true}. |
205 are matched, namely those where $f$ returns \texttt{true}. |
201 In this question implement \textit{CFUN} and define |
206 In this question implement \textit{CFUN} and define |
202 |
207 |
203 \begin{center} |
208 \begin{center} |
204 \begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}} |
209 \begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}} |
215 $[c_1,c_2,\ldots,c_n]$ & $\dn$ & $\textit{CFUN}(?)$\\ |
220 $[c_1,c_2,\ldots,c_n]$ & $\dn$ & $\textit{CFUN}(?)$\\ |
216 $\textit{ALL}$ & $\dn$ & $\textit{CFUN}(?)$ |
221 $\textit{ALL}$ & $\dn$ & $\textit{CFUN}(?)$ |
217 \end{tabular} |
222 \end{tabular} |
218 \end{center} |
223 \end{center} |
219 |
224 |
|
225 \noindent |
|
226 You can either add the constructor $CFUN$ to your implementation in |
|
227 Question 3, or you can implement this questions first |
|
228 and then use $CFUN$ instead of \code{RANGE} and \code{CHAR} in Question 3. |
|
229 |
220 |
230 |
221 \subsection*{Question 5} |
231 \subsection*{Question 5} |
222 |
232 |
223 Suppose $[a\mbox{-}z0\mbox{-}9\_\,.\mbox{-}]$ stands for the regular expression |
233 Suppose $[a\mbox{-}z0\mbox{-}9\_\,.\mbox{-}]$ stands for the regular expression |
224 |
234 |
229 |
239 |
230 \[ |
240 \[ |
231 ([a\mbox{-}z0\mbox{-}9\_\,.-]^+)\cdot @\cdot ([a\mbox{-}z0\mbox{-}9\,.-]^+)\cdot .\cdot ([a\mbox{-}z\,.]^{\{2,6\}}) |
241 ([a\mbox{-}z0\mbox{-}9\_\,.-]^+)\cdot @\cdot ([a\mbox{-}z0\mbox{-}9\,.-]^+)\cdot .\cdot ([a\mbox{-}z\,.]^{\{2,6\}}) |
232 \] |
242 \] |
233 |
243 |
234 \noindent and calculate the derivative according to your email |
244 \noindent and calculate the derivative according to your own email |
235 address. When calculating the derivative, simplify all regular |
245 address. When calculating the derivative, simplify all regular |
236 expressions as much as possible by applying the |
246 expressions as much as possible by applying the |
237 following 7 simplification rules: |
247 following 7 simplification rules: |
238 |
248 |
239 \begin{center} |
249 \begin{center} |
249 \end{center} |
259 \end{center} |
250 |
260 |
251 \noindent Write down your simplified derivative in a readable |
261 \noindent Write down your simplified derivative in a readable |
252 notation using parentheses where necessary. That means you |
262 notation using parentheses where necessary. That means you |
253 should use the infix notation $+$, $\cdot$, $^*$ and so on, |
263 should use the infix notation $+$, $\cdot$, $^*$ and so on, |
254 instead of code.\bigskip |
264 instead of raw code.\bigskip |
255 |
265 |
256 \noindent |
266 |
|
267 \subsection*{Question 6} |
|
268 |
257 Implement the simplification rules in your regular expression matcher. |
269 Implement the simplification rules in your regular expression matcher. |
258 Consider the regular expression $/ \cdot * \cdot |
270 Consider the regular expression $/ \cdot * \cdot |
259 (\sim{}(\textit{ALL}^* \cdot * \cdot / \cdot \textit{ALL}^*)) \cdot * |
271 (\sim{}(\textit{ALL}^* \cdot * \cdot / \cdot \textit{ALL}^*)) \cdot * |
260 \cdot /$ and decide wether the following four strings are matched by |
272 \cdot /$ and decide wether the following four strings are matched by |
261 this regular expression. Answer yes or no. |
273 this regular expression. Answer yes or no. |
265 \item \texttt{"/*foobar*/"} |
277 \item \texttt{"/*foobar*/"} |
266 \item \texttt{"/*test*/test*/"} |
278 \item \texttt{"/*test*/test*/"} |
267 \item \texttt{"/*test/*test*/"} |
279 \item \texttt{"/*test/*test*/"} |
268 \end{enumerate} |
280 \end{enumerate} |
269 |
281 |
270 \subsection*{Question 6} |
282 \subsection*{Question 7} |
271 |
283 |
272 Let $r_1$ be the regular expression $a\cdot a\cdot a$ and $r_2$ be |
284 Let $r_1$ be the regular expression $a\cdot a\cdot a$ and $r_2$ be |
273 $(a^{\{19,19\}}) \cdot (a^?)$. Decide whether the following three |
285 $(a^{\{19,19\}}) \cdot (a^?)$. Decide whether the following three |
274 strings consisting of $a$s only can be matched by $(r_1^+)^+$. |
286 strings consisting of $a$s only can be matched by $(r_1^+)^+$. |
275 Similarly test them with $(r_2^+)^+$. Again answer in all six cases |
287 Similarly test them with $(r_2^+)^+$. Again answer in all six cases |