% coursework/cw01.tex
% author: Christian Urban <christian dot urban at kcl dot ac dot uk>
% date: Sun, 21 Sep 2014 17:40:04 +0100
% changeset 253 75c469893514 (added coursework)
\documentclass{article}
\usepackage{../style}
\usepackage{../langs}

\begin{document}

\section*{Coursework 1}

This coursework is worth 3\% and is due on 12 November at 16:00. You are asked to implement
a regular expression matcher and submit a document containing the answers to the questions
below. You can do the implementation in any programming language you like, but you need
to submit the source code with which you answered the questions. However, the coursework
will \emph{only} be judged according to the answers. You can submit your answers
in a txt-file or pdf.\bigskip

\noindent
The task is to implement a regular expression matcher based on derivatives. The implementation
should be able to deal with the usual regular expressions

\[
\varnothing, \epsilon, c, r_1 + r_2, r_1 \cdot r_2, r^*
\]

\noindent
but also with

\begin{center}
\begin{tabular}{ll}
$[c_1 c_2 \ldots c_n]$ & a range of characters\\
$r^+$ & one or more times $r$\\
$r^?$ & optional $r$\\
$r^{\{n,m\}}$ & at least $n$-times $r$ but no more than $m$-times\\
$\sim{}r$ & not-regular expression of $r$\\
\end{tabular}
\end{center}

\noindent
In the case of $r^{\{n,m\}}$ we have the convention that $0 \le n \le m$.
The meaning of these regular expressions is

\begin{center}
\begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l}
$L([c_1 c_2 \ldots c_n])$ & $\dn$ & $\{"c_1", "c_2", \ldots, "c_n"\}$\\
$L(r^+)$            & $\dn$ & $\bigcup_{1\le i} L(r)^i$\\
$L(r^?)$            & $\dn$ & $L(r) \cup \{""\}$\\
$L(r^{\{n,m\}})$ & $\dn$ & $\bigcup_{n\le i \le m} L(r)^i$\\
$L(\sim{}r)$       & $\dn$ & $UNIV - L(r)$
\end{tabular}
\end{center}

\noindent
whereby in the last clause the set $UNIV$ stands for the set of \emph{all} strings.
So $\sim{}r$ means `all the strings that $r$ cannot match'. We assume ranges
like $[a\mbox{-}z0\mbox{-}9]$ are a shorthand for the regular expression

\[
[a b c d\ldots z 0 1\ldots 9]\;.
\]
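
\noindent
As a small illustration of these definitions (the concrete regular expressions are chosen
only as examples and are not part of the coursework), and assuming as usual that $L(r)^i$
stands for the $i$-fold concatenation of $L(r)$ with itself, where $L(r)^0 = \{""\}$, we obtain
\[
\begin{array}{lcl}
L([a\,b]^?) & = & \{"a", "b", ""\}\\
L(a^{\{1,3\}}) & = & \{"a", "aa", "aaa"\}\\
L(a^{\{0,1\}}) & = & \{"", "a"\} \;=\; L(a^?)
\end{array}
\]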

\noindent
Make sure that your implementations of $nullable$ and $der$ satisfy the following two
properties for every $r$:

\begin{itemize}
\item $nullable(r)$ if and only if $""\in L(r)$
\item $L(der\,c\,r) = Der\,c\,(L(r))$
\end{itemize}
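
\noindent
These two properties can be checked by hand on small examples. The following sketch
assumes the usual definition of the semantic derivative, $Der\,c\,A = \{s \mid c\,s \in A\}$,
that is, the strings in $A$ that start with $c$, with this first $c$ removed. The regular
expression $a^? \cdot b$ is chosen only for illustration. A correct implementation has to
return false for $nullable(a^? \cdot b)$ and true for $nullable(der\,b\,(a^? \cdot b))$, and
$der\,a\,(a^? \cdot b)$ has to match exactly the string $"b"$, because
\[
\begin{array}{lcl}
L(a^? \cdot b) & = & \{"ab", "b"\}\\
Der\,a\,(L(a^? \cdot b)) & = & \{"b"\}\\
Der\,b\,(L(a^? \cdot b)) & = & \{""\}
\end{array}
\]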
\newpage

\subsection*{Question 1 (unmarked)}

What is your King's email address (you will need it in the next question)?\bigskip

\subsection*{Question 2 (marked with 1\%)}

Implement the following regular expression for email addresses

\[
([a\mbox{-}z0\mbox{-}9\_\!\_\,.-]^+)\cdot @\cdot ([a\mbox{-}z0\mbox{-}9\,.-]^+)\cdot .\cdot ([a\mbox{-}z\,.]^{\{2,6\}})
\]

\noindent
and calculate the derivative with respect to your email address. When calculating
the derivative, simplify all regular expressions as much as possible, but at least apply the following
six simplification rules:

\begin{center}
\begin{tabular}{l@{\hspace{2mm}}c@{\hspace{2mm}}l}
$r \cdot \varnothing$ & $\mapsto$ & $\varnothing$\\
$\varnothing \cdot r$ & $\mapsto$ & $\varnothing$\\
$r \cdot \epsilon$ & $\mapsto$ & $r$\\
$\epsilon \cdot r$ & $\mapsto$ & $r$\\
$r + \varnothing$ & $\mapsto$ & $r$\\
$\varnothing + r$ & $\mapsto$ & $r$\\
\end{tabular}
\end{center}

\noindent
Write down your simplified derivative in the ``mathematical'' notation using parentheses where necessary.
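
\noindent
To illustrate how the six simplification rules interact, here is a small worked example.
The regular expression is made up for this purpose and is not a derivative of the
email regular expression. Simplifying the subexpressions first
($\varnothing \cdot b \mapsto \varnothing$, $\epsilon \cdot a \mapsto a$ and
$b + \varnothing \mapsto b$) and then the outermost alternative gives
\[
(\varnothing \cdot b) + ((\epsilon \cdot a) \cdot (b + \varnothing))
\;\mapsto\; \varnothing + (a \cdot b)
\;\mapsto\; a \cdot b
\]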

\subsection*{Question 3 (marked with 1\%)}

Consider the regular expression $/ \cdot * \cdot (\sim{}([a\mbox{-}z]^* \cdot * \cdot / \cdot [a\mbox{-}z]^*)) \cdot * \cdot /$ and decide
whether the following four strings are matched by this regular expression. Answer yes or no.

\begin{enumerate}
\item \texttt{"/**/"}
\item \texttt{"/*foobar*/"}
\item \texttt{"/*test*/test*/"}
\item \texttt{"/*test/*test*/"}
\end{enumerate}

\subsection*{Question 4 (marked with 1\%)}

Let $r_1$ be the regular expression $a\cdot a\cdot a$ and $r_2$ be $(a^{\{19,19\}}) \cdot (a^?)$.
Decide whether the following three strings consisting of $a$s only can be matched by $(r_1^+)^+$.
Similarly test them with $(r_2^+)^+$. Again answer in all six cases with yes or no. \medskip

\noindent
These strings are meant to be entirely made up of $a$s. Be careful when
copying and pasting the strings so as not to forget any $a$ and not to introduce any
other character.
\begin{enumerate}
\item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}
\item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}
\item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}
\end{enumerate}
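
\noindent
A possible sanity check for an implementation (a sketch only, not a substitute for the
required answers): since the three strings consist of $a$s only, whether they are matched
depends solely on their length. Writing $a^n$ for the string consisting of $n$ copies of $a$
(a notation introduced here just for this remark), the definitions of $r^+$, $r^?$ and
$r^{\{n,m\}}$ given earlier yield
\[
\begin{array}{lcl}
L((r_1^+)^+) & = & \{a^n \mid n = 3k \mbox{ for some } k \ge 1\}\\
L((r_2^+)^+) & = & \{a^n \mid 19k \le n \le 20k \mbox{ for some } k \ge 1\}
\end{array}
\]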
\end{document}
%%% Local Variables: 
%%% mode: latex
%%% TeX-master: t
%%% End: