% Author: Christian Urban <urbanc@in.tum.de>
% Date: Fri, 05 Oct 2018 11:07:57 +0100
% Changeset 572 (96af3fbdcd8d), parent 567 (a48605bdf467), child 576 (414f1daf5728)
\documentclass{article}
\usepackage{../style}
\usepackage{../langs}

\usepackage{array}


\begin{document}
\newcolumntype{C}[1]{>{\centering}m{#1}}

\section*{Coursework 1 (Strand 1)}

This coursework is worth 4\% and is due on 12 October at
18:00. You are asked to implement a regular expression matcher
and submit a document containing the answers for the questions
below. You can do the implementation in any programming
language you like, but you need to submit the source code with
which you answered the questions, otherwise a mark of 0\% will
be awarded. You can submit your answers in a txt-file or pdf.
Please send code as code.



\subsubsection*{Disclaimer}

It should be understood that the work you submit represents
your own effort. You have not copied from anyone else. An
exception is the Scala code I showed during the lectures or
uploaded to KEATS, which you can freely use.\bigskip

\noindent
If you have any questions, please send me an email in \textbf{good}
time.\bigskip

\subsection*{Task}

The task is to implement a regular expression matcher based on
derivatives of regular expressions. The implementation should
be able to deal with the usual (basic) regular expressions

\[
\ZERO,\; \ONE,\; c,\; r_1 + r_2,\; r_1 \cdot r_2,\; r^*
\]

\noindent
but also with the following extended regular expressions:

\begin{center}
\begin{tabular}{ll}
  $[c_1,c_2,\ldots,c_n]$ & a set of characters---for character ranges\\
  $r^+$ & one or more times $r$\\
  $r^?$ & optional $r$\\
  $r^{\{n\}}$ & exactly $n$-times\\
  $r^{\{..m\}}$ & zero or more times $r$ but no more than $m$-times\\
  $r^{\{n..\}}$ & at least $n$-times $r$\\
  $r^{\{n..m\}}$ & at least $n$-times $r$ but no more than $m$-times\\
  $\sim{}r$ & not-regular-expression of $r$\\
\end{tabular}
\end{center}

\noindent You can assume that $n$ and $m$ are greater than or equal to
$0$. In the case of $r^{\{n..m\}}$ you can also assume $0 \le n \le m$.\bigskip

\noindent {\bf Important!} Your implementation should have explicit
case classes for the basic regular expressions, but also explicit case
classes for the extended regular expressions.\footnote{Please call them
  \code{RANGE}, \code{PLUS}, \code{OPTIONAL}, \code{NTIMES},
  \code{UPTO}, \code{FROM}, \code{BETWEEN}, \code{NOT} or something
  like that.} That means you should not treat the extended regular
expressions by just translating them into the basic ones. See also Question 3,
where you are asked to explicitly give the rules for \textit{nullable}
and \textit{der} for the extended regular expressions.\medskip

\noindent
The meanings of the extended regular expressions are

\begin{center}
\begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l}
  $L([c_1,c_2,\ldots,c_n])$ & $\dn$ & $\{[c_1], [c_2], \ldots, [c_n]\}$\\
  $L(r^+)$                  & $\dn$ & $\bigcup_{1\le i}.\;L(r)^i$\\
  $L(r^?)$                  & $\dn$ & $L(r) \cup \{[]\}$\\
  $L(r^{\{n\}})$             & $\dn$ & $L(r)^n$\\
  $L(r^{\{..m\}})$           & $\dn$ & $\bigcup_{0\le i \le m}.\;L(r)^i$\\
  $L(r^{\{n..\}})$           & $\dn$ & $\bigcup_{n\le i}.\;L(r)^i$\\
  $L(r^{\{n..m\}})$          & $\dn$ & $\bigcup_{n\le i \le m}.\;L(r)^i$\\
  $L(\sim{}r)$              & $\dn$ & $\Sigma^* - L(r)$
\end{tabular}
\end{center}

\noindent whereby in the last clause the set $\Sigma^*$ stands
for the set of \emph{all} strings over the alphabet $\Sigma$
(in the implementation the alphabet can be just what is
represented by, say, the type \pcode{Char}). So $\sim{}r$
means in effect ``all the strings that $r$ cannot match''.\medskip
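
\noindent
As a small illustration of the clauses above, take $r$ to be the single
character $a$ and write $a$, $aa$, $aaa$ for the strings consisting of
one, two and three $a$s. Assuming the usual convention $L(r)^0 = \{[]\}$
for the power of a language (this convention is not spelled out in this
sheet), unfolding the definitions should give

\[
\begin{array}{lcl}
L(a^?)            & = & \{[],\, a\}\\
L(a^{\{2\}})      & = & \{aa\}\\
L(a^{\{..2\}})    & = & \{[],\, a,\, aa\}\\
L(a^{\{2..3\}})   & = & \{aa,\, aaa\}
\end{array}
\]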

\noindent
Be careful that your implementation of \textit{nullable} and
\textit{der} satisfies for every regular expression $r$ the following
two properties (see also Question 3):

\begin{itemize}
\item $\textit{nullable}(r)$ if and only if $[]\in L(r)$
\item $L(der\,c\,r) = Der\,c\,(L(r))$
\end{itemize}
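
\noindent
For orientation, the matcher for the \emph{basic} regular expressions,
on which the extended constructors build, can be sketched roughly as
follows. This is only a sketch under assumptions: the constructor names
\code{ALT}, \code{SEQ} and \code{STAR}, the helper \code{ders} and the
wrapper object \code{BasicMatcher} are chosen here for illustration and
need not coincide with the lecture code; the extended regular expressions
are deliberately left out.

```scala
// Sketch of a derivative-based matcher for the basic regular expressions only.
// ALT, SEQ, STAR, ders and BasicMatcher are illustrative names (assumptions).
object BasicMatcher {
  sealed abstract class Rexp
  case object ZERO extends Rexp                       // matches nothing
  case object ONE extends Rexp                        // matches the empty string
  case class CHAR(c: Char) extends Rexp               // matches the character c
  case class ALT(r1: Rexp, r2: Rexp) extends Rexp     // r1 + r2
  case class SEQ(r1: Rexp, r2: Rexp) extends Rexp     // r1 . r2
  case class STAR(r: Rexp) extends Rexp               // r*

  // nullable(r) is true exactly when r can match the empty string
  def nullable(r: Rexp): Boolean = r match {
    case ZERO => false
    case ONE => true
    case CHAR(_) => false
    case ALT(r1, r2) => nullable(r1) || nullable(r2)
    case SEQ(r1, r2) => nullable(r1) && nullable(r2)
    case STAR(_) => true
  }

  // derivative of r with respect to the character c
  def der(c: Char, r: Rexp): Rexp = r match {
    case ZERO => ZERO
    case ONE => ZERO
    case CHAR(d) => if (c == d) ONE else ZERO
    case ALT(r1, r2) => ALT(der(c, r1), der(c, r2))
    case SEQ(r1, r2) =>
      if (nullable(r1)) ALT(SEQ(der(c, r1), r2), der(c, r2))
      else SEQ(der(c, r1), r2)
    case STAR(r1) => SEQ(der(c, r1), STAR(r1))
  }

  // derivative with respect to a whole string, one character at a time
  def ders(s: List[Char], r: Rexp): Rexp = s match {
    case Nil => r
    case c :: cs => ders(cs, der(c, r))
  }

  // r matches s iff the derivative of r w.r.t. s is nullable
  def matcher(r: Rexp, s: String): Boolean = nullable(ders(s.toList, r))

  def main(args: Array[String]): Unit = {
    val r = SEQ(CHAR('a'), STAR(CHAR('b')))
    println(matcher(r, "abbb"))   // true
    println(matcher(r, "ba"))     // false
  }
}
```

\noindent
Each extended constructor then needs its own case class and its own
clauses in \textit{nullable} and \textit{der}, such that the two
properties above continue to hold.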



\subsection*{Question 1 (Unmarked)}

What is your King's email address (you will need it in
Question 5)?

\subsection*{Question 2 (Unmarked)}

Can you please list all programming languages in which you have
already written programs (that is, programs on which you spent at least
a good working day)? This is just for my curiosity to estimate
what your background is.

\subsection*{Question 3}

In the
lectures you have seen the definitions for the functions
\textit{nullable} and \textit{der} for the basic regular
expressions. Implement and write down the rules for the extended
regular expressions:

\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
  $\textit{nullable}([c_1,c_2,\ldots,c_n])$  & $\dn$ & $?$\\
  $\textit{nullable}(r^+)$                   & $\dn$ & $?$\\
  $\textit{nullable}(r^?)$                   & $\dn$ & $?$\\
  $\textit{nullable}(r^{\{n\}})$              & $\dn$ & $?$\\
  $\textit{nullable}(r^{\{..m\}})$            & $\dn$ & $?$\\
  $\textit{nullable}(r^{\{n..\}})$            & $\dn$ & $?$\\
  $\textit{nullable}(r^{\{n..m\}})$           & $\dn$ & $?$\\
  $\textit{nullable}(\sim{}r)$              & $\dn$ & $?$
\end{tabular}
\end{center}

\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
  $der\, c\, ([c_1,c_2,\ldots,c_n])$  & $\dn$ & $?$\\
  $der\, c\, (r^+)$                   & $\dn$ & $?$\\
  $der\, c\, (r^?)$                   & $\dn$ & $?$\\
  $der\, c\, (r^{\{n\}})$              & $\dn$ & $?$\\
  $der\, c\, (r^{\{..m\}})$           & $\dn$ & $?$\\
  $der\, c\, (r^{\{n..\}})$           & $\dn$ & $?$\\
  $der\, c\, (r^{\{n..m\}})$           & $\dn$ & $?$\\
  $der\, c\, (\sim{}r)$               & $\dn$ & $?$
\end{tabular}
\end{center}

\noindent
Remember your definitions have to satisfy the two properties

\begin{itemize}
\item $\textit{nullable}(r)$ if and only if $[]\in L(r)$
\item $L(der\,c\,r) = Der\,c\,(L(r))$
\end{itemize}

\noindent
Given the definitions of \textit{nullable} and \textit{der}, it is
easy to implement a regular expression matcher. Test your regular
expression matcher with (at least) the examples:


\begin{center}
\def\arraystretch{1.2}
\begin{tabular}{r|m{12mm}|m{12mm}|m{12mm}|m{12mm}|m{12mm}|m{12mm}}
  string & $a^{\{3\}}$ & $(a^?)^{\{3\}}$ & $a^{\{..3\}}$ &
     $(a^?)^{\{..3\}}$ & $a^{\{3..5\}}$ & $(a^?)^{\{3..5\}}$\\\hline
  $[]$           &&&&&& \\\hline
  \texttt{a}     &&&&&& \\\hline
  \texttt{aa}    &&&&&& \\\hline
  \texttt{aaa}   &&&&&& \\\hline
  \texttt{aaaaa} &&&&&& \\\hline
  \texttt{aaaaaa}&&&&&& \\
\end{tabular}
\end{center}

\noindent
Does your matcher produce the expected results?

\subsection*{Question 4}

As you can see, there are a number of explicit regular expressions
that deal with single or several characters, for example:

\begin{center}
\begin{tabular}{ll}
  $c$ & matches a single character\\
  $[c_1,c_2,\ldots,c_n]$ & matches a set of characters---for character ranges\\
  $\textit{ALL}$ & matches any character
\end{tabular}
\end{center}

\noindent
The last one is useful for matching any string (for example
by using $\textit{ALL}^*$). In order to avoid having an explicit constructor
for each case, we can generalise all these cases and introduce a single
constructor $\textit{CFUN}(f)$ where $f$ is a function from characters
to booleans. The idea is that the function $f$ determines which character(s)
are matched, namely those where $f$ returns \texttt{true}.
In this question implement \textit{CFUN} and define

\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
  $\textit{nullable}(\textit{CFUN}(f))$  & $\dn$ & $?$\\
  $\textit{der}\,c\,(\textit{CFUN}(f))$  & $\dn$ & $?$
\end{tabular}
\end{center}

\noindent in your matcher and then also give definitions for

\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
  $c$  & $\dn$ & $\textit{CFUN}(?)$\\
  $[c_1,c_2,\ldots,c_n]$  & $\dn$ & $\textit{CFUN}(?)$\\
  $\textit{ALL}$  & $\dn$ & $\textit{CFUN}(?)$
\end{tabular}
\end{center}

\noindent
You can either add the constructor $\textit{CFUN}$ to your implementation in
Question 3, or you can implement this question first
and then use $\textit{CFUN}$ instead of \code{RANGE} and \code{CHAR} in Question 3.


\subsection*{Question 5}

Suppose $[a\mbox{-}z0\mbox{-}9\_\,.\mbox{-}]$ stands for the regular expression

\[[a,b,c,\ldots,z,0,\dots,9,\_,.,\mbox{-}]\;.\]

\noindent
Define in your code the following regular expression for email addresses

\[
([a\mbox{-}z0\mbox{-}9\_\,.\mbox{-}]^+)\cdot @\cdot ([a\mbox{-}z0\mbox{-}9\,.\mbox{-}]^+)\cdot .\cdot ([a\mbox{-}z\,.]^{\{2..6\}})
\]

\noindent and calculate the derivative with respect to your own email
address. When calculating the derivative, simplify all regular
expressions as much as possible by applying the
following 7 simplification rules:

\begin{center}
\begin{tabular}{l@{\hspace{2mm}}c@{\hspace{2mm}}ll}
$r \cdot \ZERO$ & $\mapsto$ & $\ZERO$\\
$\ZERO \cdot r$ & $\mapsto$ & $\ZERO$\\
$r \cdot \ONE$ & $\mapsto$ & $r$\\
$\ONE \cdot r$ & $\mapsto$ & $r$\\
$r + \ZERO$ & $\mapsto$ & $r$\\
$\ZERO + r$ & $\mapsto$ & $r$\\
$r + r$ & $\mapsto$ & $r$\\
\end{tabular}
\end{center}

\noindent Write down your simplified derivative in a readable
notation using parentheses where necessary. That means you
should use the infix notation $+$, $\cdot$, $^*$ and so on,
instead of raw code.\bigskip
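
\noindent
To illustrate how the simplification rules interact with derivatives,
here are two small worked examples. They assume the usual clauses of
\textit{der} for the basic regular expressions from the lectures (not
restated in this sheet) and use only the rules
$\ONE \cdot r \mapsto r$ and $\ZERO + r \mapsto r$:

\[
\begin{array}{lcl}
der\,a\,(a\cdot b) & = & (der\,a\,a)\cdot b \;=\; \ONE\cdot b \;\mapsto\; b\\
der\,b\,(a + b)    & = & (der\,b\,a) + (der\,b\,b) \;=\; \ZERO + \ONE \;\mapsto\; \ONE
\end{array}
\]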


\subsection*{Question 6}

Implement the simplification rules in your regular expression matcher.
Consider the regular expression $/ \cdot * \cdot
(\sim{}(\textit{ALL}^* \cdot * \cdot / \cdot \textit{ALL}^*)) \cdot *
\cdot /$ and decide whether the following four strings are matched by
this regular expression. Answer yes or no.

\begin{enumerate}
\item \texttt{"/**/"}
\item \texttt{"/*foobar*/"}
\item \texttt{"/*test*/test*/"}
\item \texttt{"/*test/*test*/"}
\end{enumerate}

\subsection*{Question 7}

Let $r_1$ be the regular expression $a\cdot a\cdot a$ and $r_2$ be
$(a^{\{19..19\}}) \cdot (a^?)$.  Decide whether the following three
strings consisting of $a$s only can be matched by $(r_1^+)^+$.
Similarly test them with $(r_2^+)^+$. Again answer in all six cases
with yes or no. \medskip

\noindent
These strings are meant to be entirely made up of $a$s. Be careful
when copy-and-pasting the strings so as not to forget any $a$ and
not to introduce any other character.

\begin{enumerate}
\setcounter{enumi}{4}
\item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\  | 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}  | 
\item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\  | 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}  | 
\item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\  | 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}  | 
\end{enumerate}



\end{document}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: t
%%% End: