|
1 \documentclass{article} |
|
2 \usepackage{charter} |
|
3 \usepackage{hyperref} |
|
4 \usepackage{amssymb} |
|
5 \usepackage{amsmath} |
|
6 |
|
7 \newcommand{\dn}{\stackrel{\mbox{\scriptsize def}}{=}}% |
|
8 \begin{document} |
|
9 |
|
10 \section*{Coursework 1} |
|
11 |
|
12 This coursework is worth 3\% and is due on 12 November at 16:00. You are asked to implement |
|
13 a regular expression matcher and submit a document containing the answers for the questions |
|
14 below. You can do the implementation in any programming language you like, but you need |
|
15 to submit the source code with which you answered the questions. However, the coursework |
|
16 will \emph{only} be judged according to the answers only. \bigskip |
|
17 |
|
18 \noindent |
|
19 The task is to implement a regular expression matcher based on derivatives. The implementation |
|
20 should be able to deal with the usual regular expressions |
|
21 |
|
22 \[ |
|
23 \varnothing, \epsilon, c, r_1 + r_2, r_1 \cdot r_2, r^* |
|
24 \] |
|
25 |
|
26 \noindent |
|
27 but also with |
|
28 |
|
29 \begin{center} |
|
30 \begin{tabular}{ll} |
|
31 $[c_1 c_2 \ldots c_n]$ & a range of characters\\ |
|
32 $r^+$ & one or more times $r$\\ |
|
33 $r^?$ & optional $r$\\ |
|
34 $r^{\{n,m\}}$ & at least $n$-times $r$ but no more than $m$-times\\ |
|
35 $\sim{}r$ & not-regular expression of $r$\\ |
|
36 \end{tabular} |
|
37 \end{center} |
|
38 |
|
39 \noindent |
|
40 In the case of $r^{\{n,m\}}$ we have the convention that $0 \le n \le m$. |
|
41 The meaning of these regular expressions is |
|
42 |
|
43 \begin{center} |
|
44 \begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l} |
|
45 $L([c_1 c_2 \ldots c_n])$ & $\dn$ & $\{"c_1", "c_2", \ldots, "c_n"\}$\\ |
|
46 $L(r^+)$ & $\dn$ & $\bigcup_{1\le i}. L(r)^i$\\ |
|
47 $L(r^?)$ & $\dn$ & $L(r) \cup \{""\}$\\ |
|
48 $L(r^{\{n,m\}})$ & $\dn$ & $\bigcup_{n\le i \le m}. L(r)^i$\\ |
|
49 $L(\sim{}r)$ & $\dn$ & $UNIV - L(r)$ |
|
50 \end{tabular} |
|
51 \end{center} |
|
52 |
|
53 \noindent |
|
54 whereby in the last clause the set $UNIV$ stands for the set of \emph{all} strings. |
|
55 So $\sim{}r$ means `all the strings that $r$ cannot match'. We assume ranges |
|
56 like $[a\mbox{-}z0\mbox{-}9]$ are a shorthand for the regular expression |
|
57 |
|
58 \[ |
|
59 [a b c d\ldots z 0 1\ldots 9]\;. |
|
60 \] |
|
61 |
|
62 \noindent |
|
63 Be careful that your implementations for $nullable$ and $der$ satisfies for every $r$ the following two |
|
64 properties: |
|
65 |
|
66 \begin{itemize} |
|
67 \item $nullable(r)$ if and only if $""\in L(r)$ |
|
68 \item $L(der\,c\,r)) = Der\,c\,(L(r))$ |
|
69 \end{itemize} |
|
70 |
|
71 \subsection*{Question 1 (unmarked)} |
|
72 |
|
73 What is your King's email address (you will need it in the questions below)?\bigskip |
|
74 |
|
75 \subsection*{Question 2 (marked with 1\%)} |
|
76 |
|
77 Implement the following regular expression for email addresses |
|
78 |
|
79 \[ |
|
80 ([a\mbox{-}z0\mbox{-}9\_\!\_\,.-]^+)\cdot @\cdot ([a\mbox{-}z0\mbox{-}9\,.-]^+)\cdot .\cdot ([a\mbox{-}z\,.]^{\{2,6\}}) |
|
81 \] |
|
82 |
|
83 \noindent |
|
84 and calculate the derivative according to your email address. When calculating |
|
85 the derivative, simplify all regular expressions as much as possible, but at least apply the following |
|
86 six rules: |
|
87 |
|
88 \begin{center} |
|
89 \begin{tabular}{l@{\hspace{2mm}}c@{\hspace{2mm}}l} |
|
90 $r \cdot \varnothing$ & $\mapsto$ & $\varnothing$\\ |
|
91 $\varnothing \cdot r$ & $\mapsto$ & $\varnothing$\\ |
|
92 $r \cdot \epsilon$ & $\mapsto$ & $r$\\ |
|
93 $\epsilon \cdot r$ & $\mapsto$ & $r$\\ |
|
94 $r + \varnothing$ & $\mapsto$ & $r$\\ |
|
95 $\varnothing + r$ & $\mapsto$ & $r$\\ |
|
96 \end{tabular} |
|
97 \end{center} |
|
98 |
|
99 \noindent |
|
100 Write down your simplified derivative. |
|
101 |
|
102 \subsection*{Question 3 (marked with 1\%)} |
|
103 |
|
104 Consider the regular expression $/ \cdot * \cdot (\sim{}([a\mbox{-}z]^* \cdot * \cdot / \cdot [a\mbox{-}z]^*)) \cdot * \cdot /$ and decide |
|
105 wether the following four strings are matched by this regular expression. Answer yes or no. |
|
106 |
|
107 \begin{enumerate} |
|
108 \item "/**/" |
|
109 \item "/*foobar*/" |
|
110 \item "/*test*/test*/" |
|
111 \item "/*test/*test*/" |
|
112 \end{enumerate} |
|
113 |
|
114 \subsection*{Question 4 (marked with 1\%)} |
|
115 |
|
116 Let $r_1$ be the regular expression $a\cdot a\cdot a$ and $r_2$ be $(a^{\{19,19\}}) \cdot (a^?)$. |
|
117 Decide whether the following three strings can be matched by $(r_1^+)^+$. Similarly test them with $(r_2^+)^+$. |
|
118 Again answer in all six cases with yes or no. |
|
119 |
|
120 \begin{enumerate} |
|
121 \item $"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ |
|
122 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ |
|
123 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"$ |
|
124 \item $"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ |
|
125 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ |
|
126 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"$ |
|
127 \item$"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ |
|
128 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ |
|
129 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"$ |
|
130 \end{enumerate} |
|
131 |
|
132 |
|
133 \end{document} |
|
134 |
|
135 %%% Local Variables: |
|
136 %%% mode: latex |
|
137 %%% TeX-master: t |
|
138 %%% End: |