% coursework/cw01.tex
% author: Christian Urban <urbanc@in.tum.de>
% date: Wed, 15 Nov 2017 00:17:15 +0000
% changeset 533 (1276d7013c2c), parent 512 (a6aa52ecc1c5), child 545 (76a98ed71a2a)
\documentclass{article}
\usepackage{../style}
\usepackage{../langs}

\usepackage{array}


\begin{document}
\newcolumntype{C}[1]{>{\centering}m{#1}}

\section*{Coursework 1 (Strand 1)}
This coursework is worth 4\% and is due on 19 October at
16:00. You are asked to implement a regular expression matcher
and submit a document containing the answers to the questions
below. You can do the implementation in any programming
language you like, but you need to submit the source code with
which you answered the questions; otherwise a mark of 0\% will
be awarded. You can submit your answers as a txt-file or pdf.
Code should be sent as code.



\subsubsection*{Disclaimer}
It should be understood that the work you submit represents
your own effort. You have not copied from anyone else. An
exception is the Scala code I showed during the lectures or
uploaded to KEATS, which you can freely use.\bigskip

\noindent
If you have any questions, please send me an email in \textbf{good}
time.\bigskip

\subsection*{Task}
The task is to implement a regular expression matcher based on
derivatives of regular expressions. The implementation should
be able to deal with the usual (basic) regular expressions

\[
\ZERO,\; \ONE,\; c,\; r_1 + r_2,\; r_1 \cdot r_2,\; r^*
\]

\noindent
but also with the following extended regular expressions:

\begin{center}
\begin{tabular}{ll}
  $[c_1,c_2,\ldots,c_n]$ & a set of characters---for character ranges\\
  $r^+$ & one or more times $r$\\
  $r^?$ & optional $r$\\
  $r^{\{n\}}$ & exactly $n$-times\\
  $r^{\{..m\}}$ & zero or more times $r$ but no more than $m$-times\\
  $r^{\{n..\}}$ & at least $n$-times $r$\\
  $r^{\{n..m\}}$ & at least $n$-times $r$ but no more than $m$-times\\
  $\sim{}r$ & not-regular-expression of $r$\\
\end{tabular}
\end{center}
\noindent You can assume that $n$ and $m$ are greater than or equal to
$0$. In the case of $r^{\{n..m\}}$ you can also assume $0 \le n \le m$.\bigskip

\noindent {\bf Important!} Your implementation should have explicit
cases for the basic regular expressions, but also explicit cases for
the extended regular expressions. That means do not treat the extended
regular expressions by just translating them into the basic ones. See
also Question 2, where you are asked to explicitly give the rules for
\textit{nullable} and \textit{der} for the extended regular
expressions.\newpage
\noindent
The meanings of the extended regular expressions are

\begin{center}
\begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l}
  $L([c_1,c_2,\ldots,c_n])$ & $\dn$ & $\{[c_1], [c_2], \ldots, [c_n]\}$\\
  $L(r^+)$                  & $\dn$ & $\bigcup_{1\le i}.\;L(r)^i$\\
  $L(r^?)$                  & $\dn$ & $L(r) \cup \{[]\}$\\
  $L(r^{\{n\}})$             & $\dn$ & $L(r)^n$\\
  $L(r^{\{..m\}})$           & $\dn$ & $\bigcup_{0\le i \le m}.\;L(r)^i$\\
  $L(r^{\{n..\}})$           & $\dn$ & $\bigcup_{n\le i}.\;L(r)^i$\\
  $L(r^{\{n..m\}})$          & $\dn$ & $\bigcup_{n\le i \le m}.\;L(r)^i$\\
  $L(\sim{}r)$              & $\dn$ & $\Sigma^* - L(r)$
\end{tabular}
\end{center}
\noindent whereby in the last clause the set $\Sigma^*$ stands
for the set of \emph{all} strings over the alphabet $\Sigma$
(in the implementation the alphabet can be just what is
represented by, say, the type \pcode{Char}). So $\sim{}r$
means in effect ``all the strings that $r$ cannot match''.\medskip

\noindent
Be careful that your implementation of \textit{nullable} and
\textit{der} satisfies for every regular expression $r$ the following
two properties (see also Question 2):

\begin{itemize}
\item $\textit{nullable}(r)$ if and only if $[]\in L(r)$
\item $L(der\,c\,r) = Der\,c\,(L(r))$
\end{itemize}
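\noindent
To spell out the notation used in the meanings and properties above,
here is a short recap. It assumes the usual lecture conventions, namely
that $A\,@\,B$ stands for the concatenation of two sets of strings and
that $c\!::\!s$ stands for the string $s$ with the character $c$ put in
front; if your notes use different symbols, follow those.

\begin{center}
\begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l}
  $A\,@\,B$   & $\dn$ & $\{s_1\,@\,s_2 \mid s_1 \in A \wedge s_2 \in B\}$\\
  $A^0$       & $\dn$ & $\{[]\}$\\
  $A^{n+1}$   & $\dn$ & $A\,@\,A^n$\\
  $Der\,c\,A$ & $\dn$ & $\{s \mid c\!::\!s \in A\}$
\end{tabular}
\end{center}

\noindent
For example $L(a^{\{2..3\}}) = L(a)^2 \cup L(a)^3 = \{\texttt{aa}, \texttt{aaa}\}$, and
$Der\,a\,\{\texttt{ab}, \texttt{a}, \texttt{ba}\} = \{\texttt{b}, []\}$.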
\subsection*{Question 1 (Unmarked)}

What is your King's email address (you will need it in
Question 4)?

\subsection*{Question 2}
In the lectures you have seen the definitions of the functions
\textit{nullable} and \textit{der} for the basic regular
expressions. Implement the rules for the extended regular
expressions:

\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
  $\textit{nullable}([c_1,c_2,\ldots,c_n])$  & $\dn$ & $?$\\
  $\textit{nullable}(r^+)$                   & $\dn$ & $?$\\
  $\textit{nullable}(r^?)$                   & $\dn$ & $?$\\
  $\textit{nullable}(r^{\{n\}})$              & $\dn$ & $?$\\
  $\textit{nullable}(r^{\{..m\}})$            & $\dn$ & $?$\\
  $\textit{nullable}(r^{\{n..\}})$            & $\dn$ & $?$\\
  $\textit{nullable}(r^{\{n..m\}})$           & $\dn$ & $?$\\
  $\textit{nullable}(\sim{}r)$              & $\dn$ & $?$
\end{tabular}
\end{center}

\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
  $der\, c\, ([c_1,c_2,\ldots,c_n])$  & $\dn$ & $?$\\
  $der\, c\, (r^+)$                   & $\dn$ & $?$\\
  $der\, c\, (r^?)$                   & $\dn$ & $?$\\
  $der\, c\, (r^{\{n\}})$              & $\dn$ & $?$\\
  $der\, c\, (r^{\{..m\}})$           & $\dn$ & $?$\\
  $der\, c\, (r^{\{n..\}})$           & $\dn$ & $?$\\
  $der\, c\, (r^{\{n..m\}})$           & $\dn$ & $?$\\
  $der\, c\, (\sim{}r)$               & $\dn$ & $?$\\
\end{tabular}
\end{center}
\noindent
Remember that your definitions have to satisfy the two properties

\begin{itemize}
\item $\textit{nullable}(r)$ if and only if $[]\in L(r)$
\item $L(der\,c\,r) = Der\,c\,(L(r))$
\end{itemize}
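\noindent
As a reminder, the cases for the basic regular expressions are usually
defined along the following lines. This is only a recap sketch, with
$d$ standing for an arbitrary character regular expression; your
lecture notes are the authoritative version, and the extended cases
asked for above are deliberately not covered.

\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
  $\textit{nullable}(\ZERO)$          & $\dn$ & $\textit{false}$\\
  $\textit{nullable}(\ONE)$           & $\dn$ & $\textit{true}$\\
  $\textit{nullable}(d)$              & $\dn$ & $\textit{false}$\\
  $\textit{nullable}(r_1 + r_2)$      & $\dn$ & $\textit{nullable}(r_1) \vee \textit{nullable}(r_2)$\\
  $\textit{nullable}(r_1 \cdot r_2)$  & $\dn$ & $\textit{nullable}(r_1) \wedge \textit{nullable}(r_2)$\\
  $\textit{nullable}(r^*)$            & $\dn$ & $\textit{true}$
\end{tabular}
\end{center}

\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
  $der\, c\, (\ZERO)$          & $\dn$ & $\ZERO$\\
  $der\, c\, (\ONE)$           & $\dn$ & $\ZERO$\\
  $der\, c\, (d)$              & $\dn$ & if $c = d$ then $\ONE$ else $\ZERO$\\
  $der\, c\, (r_1 + r_2)$      & $\dn$ & $der\, c\, r_1 + der\, c\, r_2$\\
  $der\, c\, (r_1 \cdot r_2)$  & $\dn$ & if $\textit{nullable}(r_1)$\\
                               &       & then $(der\, c\, r_1) \cdot r_2 + der\, c\, r_2$\\
                               &       & else $(der\, c\, r_1) \cdot r_2$\\
  $der\, c\, (r^*)$            & $\dn$ & $(der\, c\, r) \cdot (r^*)$
\end{tabular}
\end{center}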
\noindent
Given the definitions of \textit{nullable} and \textit{der}, it is
easy to implement a regular expression matcher.  Test your regular
expression matcher with (at least) the examples:


\begin{center}
\def\arraystretch{1.2}
\begin{tabular}{r|m{12mm}|m{12mm}|m{12mm}|m{12mm}|m{12mm}|m{12mm}}
  string & $a^{\{3\}}$ & $(a^?)^{\{3\}}$ & $a^{\{..3\}}$ &
     $(a^?)^{\{..3\}}$ & $a^{\{3..5\}}$ & $(a^?)^{\{3..5\}}$\\\hline
  $[]$           &&&&&& \\\hline
  \texttt{a}     &&&&&& \\\hline
  \texttt{aa}    &&&&&& \\\hline
  \texttt{aaa}   &&&&&& \\\hline
  \texttt{aaaaa} &&&&&& \\\hline
  \texttt{aaaaaa}&&&&&& \\
\end{tabular}
\end{center}

\noindent
Does your matcher produce the expected results?
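\par\medskip\noindent
The matcher mentioned above is typically obtained by extending
\textit{der} from single characters to strings and then testing the
result with \textit{nullable}. A minimal sketch, where the names
\textit{ders} and \textit{matches} are assumptions and may differ from
the ones used in the lectures:

\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
  $\textit{ders}\,[]\,r$          & $\dn$ & $r$\\
  $\textit{ders}\,(c\!::\!s)\,r$  & $\dn$ & $\textit{ders}\,s\,(der\,c\,r)$\\
  $\textit{matches}\,s\,r$        & $\dn$ & $\textit{nullable}(\textit{ders}\,s\,r)$
\end{tabular}
\end{center}

\noindent
For instance, for the basic regular expression $a \cdot b$ and the
string \texttt{ab}, the rules for the basic regular expressions give
$der\,a\,(a \cdot b) = \ONE \cdot b$ and then
$der\,b\,(\ONE \cdot b) = (\ZERO \cdot b) + \ONE$. The last regular
expression is nullable, so the string is matched.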
\subsection*{Question 3}

As you can see, there are a number of explicit regular expressions
that deal with single or several characters, for example:

\begin{center}
\begin{tabular}{ll}
  $c$ & matches a single character\\
  $[c_1,c_2,\ldots,c_n]$ & matches a set of characters---for character ranges\\
  $\textit{ALL}$ & matches any character
\end{tabular}
\end{center}
\noindent
The last of these is useful for matching any string (for example
by using $\textit{ALL}^*$). In order to avoid having an explicit constructor
for each case, we can generalise all these cases and introduce a single
constructor $\textit{CFUN}(f)$ where $f$ is a function from characters
to booleans. The idea is that the function $f$ determines which character(s)
are matched, namely those where $f$ returns \texttt{true}.
In this question implement \textit{CFUN} and define

\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
  $\textit{nullable}(\textit{CFUN}(f))$  & $\dn$ & $?$\\
  $\textit{der}\,c\,(\textit{CFUN}(f))$  & $\dn$ & $?$
\end{tabular}
\end{center}
\noindent in your matcher and then also give definitions for

\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
  $c$  & $\dn$ & $\textit{CFUN}(?)$\\
  $[c_1,c_2,\ldots,c_n]$  & $\dn$ & $\textit{CFUN}(?)$\\
  $\textit{ALL}$  & $\dn$ & $\textit{CFUN}(?)$
\end{tabular}
\end{center}
\subsection*{Question 4}

Suppose $[a\mbox{-}z0\mbox{-}9\_\,.\mbox{-}]$ stands for the regular expression

\[[a,b,c,\ldots,z,0,\ldots,9,\_,.,\mbox{-}]\;.\]

\noindent
Define in your code the following regular expression for email addresses

\[
([a\mbox{-}z0\mbox{-}9\_\,.\mbox{-}]^+)\cdot @\cdot ([a\mbox{-}z0\mbox{-}9\,.\mbox{-}]^+)\cdot .\cdot ([a\mbox{-}z\,.]^{\{2..6\}})
\]
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   228
395
e57d3d92b856 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 358
diff changeset
   229
\noindent and calculate the derivative according to your email
e57d3d92b856 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 358
diff changeset
   230
address. When calculating the derivative, simplify all regular
418
010c5a03dca2 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 395
diff changeset
   231
expressions as much as possible by applying the
010c5a03dca2 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 395
diff changeset
   232
following 7 simplification rules:

\begin{center}
\begin{tabular}{l@{\hspace{2mm}}c@{\hspace{2mm}}ll}
$r \cdot \ZERO$ & $\mapsto$ & $\ZERO$\\
$\ZERO \cdot r$ & $\mapsto$ & $\ZERO$\\
$r \cdot \ONE$ & $\mapsto$ & $r$\\
$\ONE \cdot r$ & $\mapsto$ & $r$\\
$r + \ZERO$ & $\mapsto$ & $r$\\
$\ZERO + r$ & $\mapsto$ & $r$\\
$r + r$ & $\mapsto$ & $r$\\
\end{tabular}
\end{center}
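
\noindent As a small illustration of how the rules are meant to be
applied, consider the derivative of $(a \cdot b) + (a \cdot b)$ with
respect to the character $a$. This regular expression is chosen only
for the example and is not part of the question; the notation
$\textit{der}$ and the usual definition of the derivative from the
lectures are assumed here.

\[
\begin{array}{lcl}
\textit{der}\;a\;((a \cdot b) + (a \cdot b))
  & = & (\ONE \cdot b) + (\ONE \cdot b)\\
  & \mapsto & b + b\\
  & \mapsto & b
\end{array}
\]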

\noindent Write down your simplified derivative in a readable
notation using parentheses where necessary. That means you
should use the infix notation $+$, $\cdot$, $^*$ and so on,
instead of code.\bigskip
 
\noindent
Implement the simplification rules in your regular expression matcher.
Consider the regular expression $/ \cdot * \cdot
(\sim{}(\textit{ALL}^* \cdot * \cdot / \cdot \textit{ALL}^*)) \cdot *
\cdot /$ and decide whether the following four strings are matched by
this regular expression. Answer yes or no.

\begin{enumerate}
\item \texttt{"/**/"}
\item \texttt{"/*foobar*/"}
\item \texttt{"/*test*/test*/"}
\item \texttt{"/*test/*test*/"}
\end{enumerate}

\subsection*{Question 5}

Let $r_1$ be the regular expression $a\cdot a\cdot a$ and $r_2$ be
$(a^{\{19,19\}}) \cdot (a^?)$.  Decide whether the following three
strings consisting of $a$s only can be matched by $(r_1^+)^+$.
Similarly test them with $(r_2^+)^+$. Again answer in all six cases
with yes or no. \medskip

\noindent
These strings are meant to be entirely made up of $a$s. Be careful
when copying and pasting the strings so as not to drop any $a$ or
introduce any other character.
\begin{enumerate}
\setcounter{enumi}{4}
\item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}
\item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}
\item \texttt{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\ 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}
\end{enumerate}
\end{document}
%%% Local Variables: 
%%% mode: latex
%%% TeX-master: t
%%% End: