% author: Christian Urban <urbanc@in.tum.de>
% date: Fri, 12 Oct 2018 10:16:54 +0100
% changeset 576 (414f1daf5728), parent 567 (a48605bdf467), child 577 (1d6043a87a3e)
\documentclass{article}
\usepackage{../style}
\usepackage{../langs}

\usepackage{array}


\begin{document}
\newcolumntype{C}[1]{>{\centering}m{#1}}

\section*{Coursework 1 (Strand 1)}

This coursework is worth 4\% and is due on 12 October at
18:00. You are asked to implement a regular expression matcher
and submit a document containing the answers for the questions
below. You can do the implementation in any programming
language you like, but you need to submit the source code with
which you answered the questions; otherwise a mark of 0\% will
be awarded. You can submit your answers in a txt-file or pdf.
Code should be submitted as code.



\subsubsection*{Disclaimer}

It should be understood that the work you submit represents
your own effort. You have not copied from anyone else. An
exception is the Scala code I showed during the lectures or
uploaded to KEATS, which you can freely use.\bigskip

\noindent
If you have any questions, please send me an email in \textbf{good}
time.\bigskip

\subsection*{Task}

The task is to implement a regular expression matcher based on
derivatives of regular expressions. The implementation should
be able to deal with the usual (basic) regular expressions

\[
\ZERO,\; \ONE,\; c,\; r_1 + r_2,\; r_1 \cdot r_2,\; r^*
\]

\noindent
but also with the following extended regular expressions:

\begin{center}
\begin{tabular}{ll}
  $[c_1,c_2,\ldots,c_n]$ & a set of characters---for character ranges\\
  $r^+$ & one or more times $r$\\
  $r^?$ & optional $r$\\
  $r^{\{n\}}$ & exactly $n$-times\\
  $r^{\{..m\}}$ & zero or more times $r$ but no more than $m$-times\\
  $r^{\{n..\}}$ & at least $n$-times $r$\\
  $r^{\{n..m\}}$ & at least $n$-times $r$ but no more than $m$-times\\
  $\sim{}r$ & not-regular-expression of $r$\\
\end{tabular}
\end{center}

\noindent You can assume that $n$ and $m$ are greater than or equal to
$0$. In the case of $r^{\{n..m\}}$ you can also assume $0 \le n \le m$.\bigskip

\noindent {\bf Important!} Your implementation should have explicit
case classes for the basic regular expressions, but also explicit case
classes for
the extended regular expressions.\footnote{Please call them
  \code{RANGE}, \code{PLUS}, \code{OPTIONAL}, \code{NTIMES},
  \code{UPTO}, \code{FROM}, \code{BETWEEN}, \code{NOT} or something
like that.} That means do not treat the extended regular expressions
by just translating them into the basic ones. See also Question 3,
where you are asked to explicitly give the rules for \textit{nullable}
and \textit{der} for the extended regular expressions. So something like
$der\,c\,(r^+) \dn der\,c\,(r\cdot r^*)$ is \emph{not} allowed.\medskip
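
\noindent
As a rough sketch of what such explicit case classes might look like
in Scala: the constructor names follow the footnote, but the argument
names and types are assumptions made for this illustration and are not
prescribed by the coursework.

\begin{lstlisting}[numbers=none]
abstract class Rexp

// basic regular expressions
case object ZERO extends Rexp
case object ONE extends Rexp
case class CHAR(c: Char) extends Rexp
case class ALT(r1: Rexp, r2: Rexp) extends Rexp
case class SEQ(r1: Rexp, r2: Rexp) extends Rexp
case class STAR(r: Rexp) extends Rexp

// extended regular expressions (argument shapes are assumptions)
case class RANGE(cs: Set[Char]) extends Rexp            // [c1,...,cn]
case class PLUS(r: Rexp) extends Rexp                   // r^+
case class OPTIONAL(r: Rexp) extends Rexp               // r^?
case class NTIMES(r: Rexp, n: Int) extends Rexp         // r^{n}
case class UPTO(r: Rexp, m: Int) extends Rexp           // r^{..m}
case class FROM(r: Rexp, n: Int) extends Rexp           // r^{n..}
case class BETWEEN(r: Rexp, n: Int, m: Int) extends Rexp // r^{n..m}
case class NOT(r: Rexp) extends Rexp                    // ~r
\end{lstlisting}

\noindent
Each extended constructor then gets its own clause in
\textit{nullable} and \textit{der}, rather than being expanded into
the basic constructors.\medskip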

\noindent
The meanings of the extended regular expressions are

\begin{center}
\begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l}
  $L([c_1,c_2,\ldots,c_n])$ & $\dn$ & $\{[c_1], [c_2], \ldots, [c_n]\}$\\
  $L(r^+)$                  & $\dn$ & $\bigcup_{1\le i}.\;L(r)^i$\\
  $L(r^?)$                  & $\dn$ & $L(r) \cup \{[]\}$\\
  $L(r^{\{n\}})$             & $\dn$ & $L(r)^n$\\
  $L(r^{\{..m\}})$           & $\dn$ & $\bigcup_{0\le i \le m}.\;L(r)^i$\\
  $L(r^{\{n..\}})$           & $\dn$ & $\bigcup_{n\le i}.\;L(r)^i$\\
  $L(r^{\{n..m\}})$          & $\dn$ & $\bigcup_{n\le i \le m}.\;L(r)^i$\\
  $L(\sim{}r)$              & $\dn$ & $\Sigma^* - L(r)$
\end{tabular}
\end{center}

\noindent whereby in the last clause the set $\Sigma^*$ stands
for the set of \emph{all} strings over the alphabet $\Sigma$
(in the implementation the alphabet can be just what is
represented by, say, the type \pcode{Char}). So $\sim{}r$
means in effect ``all the strings that $r$ cannot match''.\medskip
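
\noindent
For instance, taking $r$ to be the single character $a$, writing
strings as lists of characters, and assuming the usual convention
$L(r)^0 = \{[]\}$, the clauses above unfold to

\begin{center}
\begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l}
  $L(a^?)$            & $=$ & $\{[a],\, []\}$\\
  $L(a^{\{2\}})$      & $=$ & $\{[a,a]\}$\\
  $L(a^{\{..2\}})$    & $=$ & $\{[],\, [a],\, [a,a]\}$\\
  $L(a^{\{2..3\}})$   & $=$ & $\{[a,a],\, [a,a,a]\}$
\end{tabular}
\end{center}

\noindent
Similarly $[b] \in L(\sim{}a)$, whereas $[a] \not\in L(\sim{}a)$.\medskip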

\noindent
Be careful that your implementation of \textit{nullable} and
\textit{der} satisfies for every regular expression $r$ the following
two properties (see also Question 3):

\begin{itemize}
\item $\textit{nullable}(r)$ if and only if $[]\in L(r)$
\item $L(der\,c\,r) = Der\,c\,(L(r))$
\end{itemize}
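
\noindent
As a reminder, and assuming the definition used in the lectures, the
semantic derivative $Der$ of a set of strings $A$ with respect to a
character $c$ collects the tails of all strings in $A$ that start with
$c$:

\[
Der\,c\,A \;\dn\; \{\,s \;|\; c\!::\!s \in A\,\}
\]

\noindent
For example $Der\,a\,\{[a,b],\, [b],\, [a]\} = \{[b],\, []\}$: the
string $[b]$ does not start with $a$ and is dropped, while $[a]$
contributes the empty string.\medskip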



\subsection*{Question 1 (Unmarked)}

What is your King's email address (you will need it in
Question 5)?

\subsection*{Question 2 (Unmarked)}

Can you please list all programming languages in which you have
already written programs (that is, where you spent at least a good
working day on the program)? This is just for my curiosity, to estimate
what your background is.

\subsection*{Question 3}

In the
lectures you have seen the definitions for the functions
\textit{nullable} and \textit{der} for the basic regular
expressions. Implement and write down rules for the extended
regular expressions:

\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
  $\textit{nullable}([c_1,c_2,\ldots,c_n])$  & $\dn$ & $?$\\
  $\textit{nullable}(r^+)$                   & $\dn$ & $?$\\
  $\textit{nullable}(r^?)$                   & $\dn$ & $?$\\
  $\textit{nullable}(r^{\{n\}})$              & $\dn$ & $?$\\
  $\textit{nullable}(r^{\{..m\}})$            & $\dn$ & $?$\\
  $\textit{nullable}(r^{\{n..\}})$            & $\dn$ & $?$\\
  $\textit{nullable}(r^{\{n..m\}})$           & $\dn$ & $?$\\
  $\textit{nullable}(\sim{}r)$              & $\dn$ & $?$
\end{tabular}
\end{center}

\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
  $der\, c\, ([c_1,c_2,\ldots,c_n])$  & $\dn$ & $?$\\
  $der\, c\, (r^+)$                   & $\dn$ & $?$\\
  $der\, c\, (r^?)$                   & $\dn$ & $?$\\
  $der\, c\, (r^{\{n\}})$              & $\dn$ & $?$\\
  $der\, c\, (r^{\{..m\}})$           & $\dn$ & $?$\\
  $der\, c\, (r^{\{n..\}})$           & $\dn$ & $?$\\
  $der\, c\, (r^{\{n..m\}})$           & $\dn$ & $?$\\
  $der\, c\, (\sim{}r)$               & $\dn$ & $?$\\
\end{tabular}
\end{center}

\noindent
Remember that your definitions have to satisfy the two properties

\begin{itemize}
\item $\textit{nullable}(r)$ if and only if $[]\in L(r)$
\item $L(der\,c\,r) = Der\,c\,(L(r))$
\end{itemize}

\noindent
Given the definitions of \textit{nullable} and \textit{der}, it is
easy to implement a regular expression matcher. Test your regular
expression matcher with (at least) the examples:


\begin{center}
\def\arraystretch{1.2}
\begin{tabular}{r|m{12mm}|m{12mm}|m{12mm}|m{12mm}|m{12mm}|m{12mm}}
  string & $a^{\{3\}}$ & $(a^?)^{\{3\}}$ & $a^{\{..3\}}$ &
     $(a^?)^{\{..3\}}$ & $a^{\{3..5\}}$ & $(a^?)^{\{3..5\}}$\\\hline
  $[]$           &&&&&& \\\hline
  \texttt{a}     &&&&&& \\\hline
  \texttt{aa}    &&&&&& \\\hline
  \texttt{aaa}   &&&&&& \\\hline
  \texttt{aaaaa} &&&&&& \\\hline
  \texttt{aaaaaa}&&&&&& \\
\end{tabular}
\end{center}

\noindent
Does your matcher produce the expected results?
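
\noindent
For orientation, a minimal sketch of such a matcher for the
\emph{basic} regular expressions only, along the lines of the code
shown in the lectures. The names \code{ders} and \code{matcher} and
the wrapper object are assumptions for this illustration; the clauses
for the extended regular expressions are deliberately left out, since
they are the subject of this question.

\begin{lstlisting}[numbers=none]
abstract class Rexp
case object ZERO extends Rexp
case object ONE extends Rexp
case class CHAR(c: Char) extends Rexp
case class ALT(r1: Rexp, r2: Rexp) extends Rexp
case class SEQ(r1: Rexp, r2: Rexp) extends Rexp
case class STAR(r: Rexp) extends Rexp

object BasicMatcher {
  // can r match the empty string?
  def nullable(r: Rexp): Boolean = r match {
    case ZERO => false
    case ONE => true
    case CHAR(_) => false
    case ALT(r1, r2) => nullable(r1) || nullable(r2)
    case SEQ(r1, r2) => nullable(r1) && nullable(r2)
    case STAR(_) => true
  }

  // derivative of r with respect to the character c
  def der(c: Char, r: Rexp): Rexp = r match {
    case ZERO => ZERO
    case ONE => ZERO
    case CHAR(d) => if (c == d) ONE else ZERO
    case ALT(r1, r2) => ALT(der(c, r1), der(c, r2))
    case SEQ(r1, r2) =>
      if (nullable(r1)) ALT(SEQ(der(c, r1), r2), der(c, r2))
      else SEQ(der(c, r1), r2)
    case STAR(r1) => SEQ(der(c, r1), STAR(r1))
  }

  // derivative with respect to a whole string
  def ders(s: List[Char], r: Rexp): Rexp = s match {
    case Nil => r
    case c :: cs => ders(cs, der(c, r))
  }

  // r matches s iff the derivative w.r.t. s is nullable
  def matcher(r: Rexp, s: String): Boolean =
    nullable(ders(s.toList, r))
}
\end{lstlisting}

\noindent
The matcher never inspects the string directly: it only builds
successive derivatives and asks at the end whether the remaining
regular expression is nullable. Supporting an extended regular
expression therefore amounts to adding one clause to
\textit{nullable} and one to \textit{der}.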

\subsection*{Question 4}

As you can see, there are a number of explicit regular expressions
that deal with single or several characters, for example:

\begin{center}
\begin{tabular}{ll}
  $c$ & matches a single character\\
  $[c_1,c_2,\ldots,c_n]$ & matches a set of characters---for character ranges\\
  $\textit{ALL}$ & matches any character
\end{tabular}
\end{center}

\noindent
The last one is useful for matching any string (for example
by using $\textit{ALL}^*$). In order to avoid having an explicit constructor
for each case, we can generalise all these cases and introduce a single
constructor $\textit{CFUN}(f)$ where $f$ is a function from characters
to booleans. In Scala code this would look as follows:

\begin{lstlisting}[numbers=none]
abstract class Rexp
...
case class CFUN(f: Char => Boolean) extends Rexp
\end{lstlisting}\smallskip

\noindent
The idea is that the function $f$ determines which character(s)
are matched, namely those where $f$ returns \texttt{true}.
In this question implement \textit{CFUN} and define

\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
  $\textit{nullable}(\textit{CFUN}(f))$  & $\dn$ & $?$\\
  $\textit{der}\,c\,(\textit{CFUN}(f))$  & $\dn$ & $?$
\end{tabular}
\end{center}

\noindent in your matcher and then also give definitions for

\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
  $c$  & $\dn$ & $\textit{CFUN}(?)$\\
  $[c_1,c_2,\ldots,c_n]$  & $\dn$ & $\textit{CFUN}(?)$\\
  $\textit{ALL}$  & $\dn$ & $\textit{CFUN}(?)$
\end{tabular}
\end{center}

\noindent
You can either add the constructor $\textit{CFUN}$ to your implementation in
Question 3, or you can implement this question first
and then use $\textit{CFUN}$ instead of \code{RANGE} and \code{CHAR} in Question 3.
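
\noindent
To illustrate how a function stored inside a constructor can be
created and applied again in Scala, here is a small sketch. The
predicate \code{isUpperCase}, the value \code{UPPER} and the helper
\code{accepts} are made up for this illustration; they are not among
the definitions asked for above.

\begin{lstlisting}[numbers=none]
abstract class Rexp
case class CFUN(f: Char => Boolean) extends Rexp

object CfunExample {
  // a made-up predicate: true exactly for upper-case letters
  val isUpperCase: Char => Boolean = c => c.isUpper

  // wrapping the predicate in the constructor
  val UPPER: Rexp = CFUN(isUpperCase)

  // the stored function is recovered by pattern matching
  def accepts(r: Rexp, c: Char): Boolean = r match {
    case CFUN(f) => f(c)
    case _ => false
  }
}
\end{lstlisting}

\noindent
Note that two \code{CFUN} values are only equal in Scala if they
contain the very same function object, which matters if your
simplification relies on comparing regular expressions.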


\subsection*{Question 5}

Suppose $[a\mbox{-}z0\mbox{-}9\_\,.\mbox{-}]$ stands for the regular expression

\[[a,b,c,\ldots,z,0,\dots,9,\_,.,\mbox{-}]\;.\]

\noindent
Define in your code the following regular expression for email addresses

\[
([a\mbox{-}z0\mbox{-}9\_\,.\mbox{-}]^+)\cdot @\cdot ([a\mbox{-}z0\mbox{-}9\,.\mbox{-}]^+)\cdot .\cdot ([a\mbox{-}z\,.]^{\{2..6\}})
\]

\noindent and calculate the derivative with respect to your own email
address. When calculating the derivative, simplify all regular
expressions as much as possible by applying the
following 7 simplification rules:

\begin{center}
\begin{tabular}{l@{\hspace{2mm}}c@{\hspace{2mm}}ll}
$r \cdot \ZERO$ & $\mapsto$ & $\ZERO$\\
$\ZERO \cdot r$ & $\mapsto$ & $\ZERO$\\
$r \cdot \ONE$ & $\mapsto$ & $r$\\
$\ONE \cdot r$ & $\mapsto$ & $r$\\
$r + \ZERO$ & $\mapsto$ & $r$\\
$\ZERO + r$ & $\mapsto$ & $r$\\
$r + r$ & $\mapsto$ & $r$\\
\end{tabular}
\end{center}

\noindent Write down your simplified derivative in a readable
notation using parentheses where necessary. That means you
should use the infix notation $+$, $\cdot$, $^*$ and so on,
instead of raw code.\bigskip
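
\noindent
As a small illustration of how the rules are meant to be applied (the
regular expression below is just an example and unrelated to the email
regular expression), simplification proceeds from the inside out:

\[
(\ZERO \cdot b) + (\ONE \cdot a)
\;\mapsto\; \ZERO + (\ONE \cdot a)
\;\mapsto\; \ZERO + a
\;\mapsto\; a
\]

\noindent
The three steps use the rules $\ZERO \cdot r \mapsto \ZERO$,
$\ONE \cdot r \mapsto r$ and $\ZERO + r \mapsto r$, in this order.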


\subsection*{Question 6}

Implement the simplification rules in your regular expression matcher.
Consider the regular expression $/ \cdot * \cdot
(\sim{}(\textit{ALL}^* \cdot * \cdot / \cdot \textit{ALL}^*)) \cdot *
\cdot /$ and decide whether the following four strings are matched by
this regular expression. Answer yes or no.

\begin{enumerate}
\item \texttt{"/**/"}
\item \texttt{"/*foobar*/"}
\item \texttt{"/*test*/test*/"}
\item \texttt{"/*test/*test*/"}
\end{enumerate}

\subsection*{Question 7}

Let $r_1$ be the regular expression $a\cdot a\cdot a$ and $r_2$ be
$(a^{\{19..19\}}) \cdot (a^?)$.  Decide whether the following three
strings consisting of $a$s only can be matched by $(r_1^+)^+$.
Similarly test them with $(r_2^+)^+$. Again answer in all six cases
with yes or no. \medskip

\noindent
These strings are meant to be entirely made up of $a$s. Be careful
when copy-and-pasting the strings so as not to forget any $a$ and
not to introduce any other character.

\begin{enumerate}
\setcounter{enumi}{4}
\item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\  | 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}  | 
\item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\  | 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}  | 
\item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\  | 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}  | 
\end{enumerate}
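
\noindent
One way to guard against copy-and-paste mistakes is to check the
pasted strings mechanically before matching them. A minimal sketch,
where the object and function names are assumptions for this
illustration:

\begin{lstlisting}[numbers=none]
object PasteCheck {
  // true if s is non-empty and consists of the character 'a' only
  def onlyAs(s: String): Boolean =
    s.nonEmpty && s.forall(_ == 'a')

  // number of characters in s, to compare against the sheet
  def count(s: String): Int = s.length
}
\end{lstlisting}

\noindent
A stray space or line break in the pasted string makes
\code{onlyAs} return false.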



\end{document}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: t
%%% End: