% author: Christian Urban <christian.urban@kcl.ac.uk>
% date: Mon, 26 Oct 2020 10:27:01 +0000
% changeset 791: d27d35a0164a (parent 772: b1a8ef39cb35, child 833: 7c3b8bb4a174)
% !TEX program = xelatex
\documentclass{article}
\usepackage{../style}
\usepackage{../langs}

\usepackage{array}


\begin{document}
\newcolumntype{C}[1]{>{\centering}m{#1}}

\section*{Coursework 1}

This coursework is worth 5\% and is due on \cwONE{} at 18:00. You are
asked to implement a regular expression matcher and submit a document
containing the answers for the questions below. You can do the
implementation in any programming language you like, but you need to
submit the source code with which you answered the questions,
otherwise a mark of 0\% will be awarded. You can submit your answers
in a txt-file or pdf. Code should be sent as code. Please package everything
inside a zip-file that creates a directory with the name
\[\texttt{YournameYourfamilyname}\]

\noindent on my end. Thanks!


\subsubsection*{Disclaimer\alert}

It should be understood that the work you submit represents
your own effort. You have not copied from anyone else. An
exception is the Scala code I showed during the lectures or
uploaded to KEATS, which you can freely use.\bigskip

\noindent
If you have any questions, please send me an email in \textbf{good}
time.\bigskip

\subsection*{Task}

The task is to implement a regular expression matcher based on
derivatives of regular expressions. The implementation should
be able to deal with the usual (basic) regular expressions

\[
\ZERO,\; \ONE,\; c,\; r_1 + r_2,\; r_1 \cdot r_2,\; r^*
\]

\noindent
but also with the following extended regular expressions:

\begin{center}
\begin{tabular}{ll}
  $[c_1,c_2,\ldots,c_n]$ & a set of characters---for character ranges\\
  $r^+$ & one or more times $r$\\
  $r^?$ & optional $r$\\
  $r^{\{n\}}$ & exactly $n$-times\\
  $r^{\{..m\}}$ & zero or more times $r$ but no more than $m$-times\\
  $r^{\{n..\}}$ & at least $n$-times $r$\\
  $r^{\{n..m\}}$ & at least $n$-times $r$ but no more than $m$-times\\
  $\sim{}r$ & not-regular-expression of $r$\\
\end{tabular}
\end{center}

\noindent You can assume that $n$ and $m$ are greater than or equal to
$0$. In the case of $r^{\{n..m\}}$ you can also assume $0 \le n \le m$.\bigskip

\noindent {\bf Important!} Your implementation should have explicit
case classes for the basic regular expressions, but also explicit case
classes for
the extended regular expressions.\footnote{Please call them
  \code{RANGE}, \code{PLUS}, \code{OPTIONAL}, \code{NTIMES},
  \code{UPTO}, \code{FROM} and \code{BETWEEN}.}
That means do not treat the extended regular expressions
by just translating them into the basic ones. See also Question 3,
where you are asked to explicitly give the rules for \textit{nullable}
and \textit{der} for the extended regular expressions. Something like

\[der\,c\,(r^+) \dn der\,c\,(r\cdot r^*)\]

\noindent is \emph{not} allowed as an answer in Question 3 and \emph{not}
allowed in your code.\medskip

\noindent
The meanings of the extended regular expressions are

\begin{center}
\begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l}
  $L([c_1,c_2,\ldots,c_n])$ & $\dn$ & $\{[c_1], [c_2], \ldots, [c_n]\}$\\
  $L(r^+)$                  & $\dn$ & $\bigcup_{1\le i}.\;L(r)^i$\\
  $L(r^?)$                  & $\dn$ & $L(r) \cup \{[]\}$\\
  $L(r^{\{n\}})$             & $\dn$ & $L(r)^n$\\
  $L(r^{\{..m\}})$           & $\dn$ & $\bigcup_{0\le i \le m}.\;L(r)^i$\\
  $L(r^{\{n..\}})$           & $\dn$ & $\bigcup_{n\le i}.\;L(r)^i$\\
  $L(r^{\{n..m\}})$          & $\dn$ & $\bigcup_{n\le i \le m}.\;L(r)^i$\\
  $L(\sim{}r)$              & $\dn$ & $\Sigma^* - L(r)$
\end{tabular}
\end{center}

\noindent whereby in the last clause the set $\Sigma^*$ stands
for the set of \emph{all} strings over the alphabet $\Sigma$
(in the implementation the alphabet can be just what is
represented by, say, the type \pcode{Char}). So $\sim{}r$
means in effect ``all the strings that $r$ cannot match''.\medskip
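
\noindent
As a small worked illustration of these clauses, consider the
single-character regular expression $a$ with $L(a) = \{\texttt{a}\}$.
The bounds below are chosen only for this example, and the usual
convention $A^0 = \{[]\}$ for any language $A$ is assumed:

\begin{center}
\begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l}
  $L(a^{\{..2\}})$  & $=$ & $L(a)^0 \cup L(a)^1 \cup L(a)^2 \;=\; \{[], \texttt{a}, \texttt{aa}\}$\\
  $L(a^{\{2..3\}})$ & $=$ & $L(a)^2 \cup L(a)^3 \;=\; \{\texttt{aa}, \texttt{aaa}\}$\\
  $L(a^{\{0\}})$    & $=$ & $L(a)^0 \;=\; \{[]\}$
\end{tabular}
\end{center}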

\noindent
Be careful that your implementation of \textit{nullable} and
\textit{der} satisfies for every regular expression $r$ the following
two properties (see also Question 3):

\begin{itemize}
\item $\textit{nullable}(r)$ if and only if $[]\in L(r)$
\item $L(der\,c\,r) = Der\,c\,(L(r))$
\end{itemize}
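
\noindent
To see what the second property says, recall the semantic derivative
from the lectures, assumed here to be defined as
$Der\,c\,A \dn \{s \;|\; c\!::\!s \in A\}$. For the basic regular
expression $a \cdot b$ and the character $a$, both sides can be checked
by hand (a sketch that relies on the lecture definition of \textit{der}
for sequences):

\begin{center}
\begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l}
  $der\,a\,(a \cdot b)$    & $=$ & $(der\,a\,a) \cdot b \;=\; \ONE \cdot b$\\
  $L(der\,a\,(a \cdot b))$ & $=$ & $L(\ONE \cdot b) \;=\; \{\texttt{b}\}$\\
  $Der\,a\,(L(a \cdot b))$ & $=$ & $Der\,a\,\{\texttt{ab}\} \;=\; \{\texttt{b}\}$
\end{tabular}
\end{center}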


\subsection*{Question 1 (Unmarked)}

What is your King's email address (you will need it in
Question 5)? Could you also please let me know where you will mainly be
studying? Thanks!

\subsection*{Question 2 (Unmarked)}

Can you please list all programming languages in which you have
already written programs (include only instances where you have spent
at least a good working day fiddling with a program)? This is just
for my curiosity, to estimate what your background is.

\subsection*{Question 3}

In the
lectures you have seen the definitions for the functions
\textit{nullable} and \textit{der} for the basic regular
expressions. Implement and write down the rules for the extended
regular expressions:

\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
  $\textit{nullable}([c_1,c_2,\ldots,c_n])$  & $\dn$ & $?$\\
  $\textit{nullable}(r^+)$                   & $\dn$ & $?$\\
  $\textit{nullable}(r^?)$                   & $\dn$ & $?$\\
  $\textit{nullable}(r^{\{n\}})$              & $\dn$ & $?$\\
  $\textit{nullable}(r^{\{..m\}})$            & $\dn$ & $?$\\
  $\textit{nullable}(r^{\{n..\}})$            & $\dn$ & $?$\\
  $\textit{nullable}(r^{\{n..m\}})$           & $\dn$ & $?$\\
  $\textit{nullable}(\sim{}r)$              & $\dn$ & $?$
\end{tabular}
\end{center}

\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
  $der\, c\, ([c_1,c_2,\ldots,c_n])$  & $\dn$ & $?$\\
  $der\, c\, (r^+)$                   & $\dn$ & $?$\\
  $der\, c\, (r^?)$                   & $\dn$ & $?$\\
  $der\, c\, (r^{\{n\}})$              & $\dn$ & $?$\\
  $der\, c\, (r^{\{..m\}})$           & $\dn$ & $?$\\
  $der\, c\, (r^{\{n..\}})$           & $\dn$ & $?$\\
  $der\, c\, (r^{\{n..m\}})$           & $\dn$ & $?$\\
  $der\, c\, (\sim{}r)$               & $\dn$ & $?$\\
\end{tabular}
\end{center}

\noindent
Remember your definitions have to satisfy the two properties

\begin{itemize}
\item $\textit{nullable}(r)$ if and only if $[]\in L(r)$
\item $L(der\,c\,r) = Der\,c\,(L(r))$
\end{itemize}

\noindent
Given the definitions of \textit{nullable} and \textit{der}, it is
easy to implement a regular expression matcher. Test your regular
expression matcher with (at least) the examples:


\begin{center}
\def\arraystretch{1.2}
\begin{tabular}{@{}r|m{3mm}|m{6mm}|m{6mm}|m{10mm}|m{6mm}|m{10mm}|m{10mm}|m{10mm}}
  string & $a^?$ & $\sim{}a$ & $a^{\{3\}}$ & $(a^?)^{\{3\}}$ & $a^{\{..3\}}$ &
     $(a^?)^{\{..3\}}$ & $a^{\{3..5\}}$ & $(a^?)^{\{3..5\}}$ \\\hline
  $[]$           &&&&&&&& \\\hline
  \texttt{a}     &&&&&&&& \\\hline
  \texttt{aa}    &&&&&&&& \\\hline
  \texttt{aaa}   &&&&&&&& \\\hline
  \texttt{aaaaa} &&&&&&&& \\\hline
  \texttt{aaaaaa}&&&&&&&& \\
\end{tabular}
\end{center}

\noindent
Does your matcher produce the expected results? Make sure you
also test corner-cases, like $a^{\{0\}}$!
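
\noindent
The remark that a matcher follows from \textit{nullable} and
\textit{der} can be sketched for the \emph{basic} regular expressions
only. The constructor names \code{ALT}, \code{SEQ} and \code{STAR} and
the functions \code{ders} and \code{matcher} are assumptions made for
this illustration and may differ from the lecture code; the extended
cases are deliberately left out. The code is meant to be run as a Scala
script or in the REPL.

\begin{lstlisting}[numbers=none]
// basic regular expressions only (constructor names are assumptions)
abstract class Rexp
case object ZERO extends Rexp
case object ONE extends Rexp
case class CHAR(c: Char) extends Rexp
case class ALT(r1: Rexp, r2: Rexp) extends Rexp
case class SEQ(r1: Rexp, r2: Rexp) extends Rexp
case class STAR(r: Rexp) extends Rexp

// true iff the empty string is in the language of r
def nullable(r: Rexp): Boolean = r match {
  case ZERO => false
  case ONE => true
  case CHAR(_) => false
  case ALT(r1, r2) => nullable(r1) || nullable(r2)
  case SEQ(r1, r2) => nullable(r1) && nullable(r2)
  case STAR(_) => true
}

// derivative of r with respect to the character c
def der(c: Char, r: Rexp): Rexp = r match {
  case ZERO => ZERO
  case ONE => ZERO
  case CHAR(d) => if (c == d) ONE else ZERO
  case ALT(r1, r2) => ALT(der(c, r1), der(c, r2))
  case SEQ(r1, r2) =>
    if (nullable(r1)) ALT(SEQ(der(c, r1), r2), der(c, r2))
    else SEQ(der(c, r1), r2)
  case STAR(r1) => SEQ(der(c, r1), STAR(r1))
}

// derivative with respect to a whole string, one character at a time
def ders(s: List[Char], r: Rexp): Rexp = s match {
  case Nil => r
  case c :: cs => ders(cs, der(c, r))
}

// r matches s iff the derivative w.r.t. s can match the empty string
def matcher(r: Rexp, s: String): Boolean =
  nullable(ders(s.toList, r))
\end{lstlisting}

\noindent
With these definitions, \code{matcher(STAR(CHAR('a')), "aaa")} should
evaluate to \texttt{true}, while
\code{matcher(SEQ(CHAR('a'), CHAR('b')), "a")} should evaluate to
\texttt{false}.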


\subsection*{Question 4}

As you can see, there are a number of explicit regular expressions
that deal with single or several characters, for example:

\begin{center}
\begin{tabular}{ll}
  $c$ & matches a single character\\
  $[c_1,c_2,\ldots,c_n]$ & matches a set of characters---for character ranges\\
  $\textit{ALL}$ & matches any character
\end{tabular}
\end{center}

\noindent
The last one is useful for matching any string (for example
by using $\textit{ALL}^*$). In order to avoid having an explicit constructor
for each case, we can generalise all these cases and introduce a single
constructor $\textit{CFUN}(f)$ where $f$ is a function from characters
to booleans. In Scala code this would look as follows:

\begin{lstlisting}[numbers=none]
abstract class Rexp
...
case class CFUN(f: Char => Boolean) extends Rexp
\end{lstlisting}\smallskip

\noindent
The idea is that the function $f$ determines which character(s)
are matched, namely those where $f$ returns \texttt{true}.
In this question implement \textit{CFUN} and define

\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
  $\textit{nullable}(\textit{CFUN}(f))$  & $\dn$ & $?$\\
  $\textit{der}\,c\,(\textit{CFUN}(f))$  & $\dn$ & $?$
\end{tabular}
\end{center}

\noindent in your matcher and then also give definitions for

\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
  $c$  & $\dn$ & $\textit{CFUN}(?)$\\
  $[c_1,c_2,\ldots,c_n]$  & $\dn$ & $\textit{CFUN}(?)$\\
  $\textit{ALL}$  & $\dn$ & $\textit{CFUN}(?)$
\end{tabular}
\end{center}

\noindent
You can either add the constructor $\textit{CFUN}$ to your implementation in
Question 3, or you can implement this question first
and then use $\textit{CFUN}$ instead of \code{RANGE} and \code{CHAR} in Question 3.
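
\noindent
To illustrate how a function from characters to booleans serves as the
argument of \textit{CFUN}, the following sketch uses a hypothetical
predicate for digits. The name \code{DIGIT} is an assumption made for
this example, and the predicate is not one of the three cases asked for
above. The code is meant to be run as a Scala script or in the REPL.

\begin{lstlisting}[numbers=none]
abstract class Rexp
case class CFUN(f: Char => Boolean) extends Rexp

// hypothetical example: the predicate is true exactly for digits
val DIGIT = CFUN(c => c.isDigit)

println(DIGIT.f('7'))  // prints true
println(DIGIT.f('x'))  // prints false
\end{lstlisting}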


\subsection*{Question 5}

Suppose $[a\mbox{-}z0\mbox{-}9\_\,.\mbox{-}]$ stands for the regular expression

\[[a,b,c,\ldots,z,0,\dots,9,\_,.,\mbox{-}]\;.\]

\noindent
Define in your code the following regular expression for email addresses

\[
([a\mbox{-}z0\mbox{-}9\_\,.\mbox{-}]^+)\cdot @\cdot ([a\mbox{-}z0\mbox{-}9\,.\mbox{-}]^+)\cdot .\cdot ([a\mbox{-}z\,.]^{\{2..6\}})
\]

\noindent and calculate the derivative with respect to your own email
address. When calculating the derivative, simplify all regular
expressions as much as possible by applying the
following 7 simplification rules:

\begin{center}
\begin{tabular}{l@{\hspace{2mm}}c@{\hspace{2mm}}ll}
$r \cdot \ZERO$ & $\mapsto$ & $\ZERO$\\
$\ZERO \cdot r$ & $\mapsto$ & $\ZERO$\\
$r \cdot \ONE$ & $\mapsto$ & $r$\\
$\ONE \cdot r$ & $\mapsto$ & $r$\\
$r + \ZERO$ & $\mapsto$ & $r$\\
$\ZERO + r$ & $\mapsto$ & $r$\\
$r + r$ & $\mapsto$ & $r$\\
\end{tabular}
\end{center}

\noindent Write down your simplified derivative in a readable
notation using parentheses where necessary. That means you
should use the infix notation $+$, $\cdot$, $^*$ and so on,
instead of raw code.\bigskip
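
\noindent
As a small illustration of how the rules combine (the expression is
made up for this example and $r_1$, $r_2$ stand for arbitrary regular
expressions), simplifying from the inside out gives

\[
(\ZERO \cdot r_1) + (r_2 \cdot \ONE)
\;\mapsto\; \ZERO + (r_2 \cdot \ONE)
\;\mapsto\; \ZERO + r_2
\;\mapsto\; r_2
\]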


\subsection*{Question 6}

Implement the simplification rules in your regular expression matcher.
Consider the regular expression $/ \cdot * \cdot
(\sim{}(\textit{ALL}^* \cdot * \cdot / \cdot \textit{ALL}^*)) \cdot *
\cdot /$ and decide whether the following four strings are matched by
this regular expression. Answer yes or no.

\begin{enumerate}
\item \texttt{"/**/"}
\item \texttt{"/*foobar*/"}
\item \texttt{"/*test*/test*/"}
\item \texttt{"/*test/*test*/"}
\end{enumerate}

\subsection*{Question 7}

Let $r_1$ be the regular expression $a\cdot a\cdot a$ and $r_2$ be
$(a^{\{19..19\}}) \cdot (a^?)$.\medskip

\noindent
Decide whether the following three
strings consisting of $a$s only can be matched by $(r_1^+)^+$.
Similarly test them with $(r_2^+)^+$. Again answer in all six cases
with yes or no. \medskip

\noindent
These strings are meant to be entirely made up of $a$s. Be careful
when copy-and-pasting the strings so as not to forget any $a$ and
not to introduce any other character.

\begin{enumerate}
\setcounter{enumi}{4}
\item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\  | 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}  | 
\item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\  | 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}  | 
\item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\  | 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}  | 
\end{enumerate}



\end{document}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: t
%%% End: