--- a/cws/cw03.tex Fri Nov 17 14:11:58 2017 +0000
+++ b/cws/cw03.tex Tue Nov 21 16:31:11 2017 +0000
@@ -8,22 +8,39 @@
This coursework is worth 10\%. It is about regular expressions,
pattern matching and polymorphism. The first part is due on 30
-November at 11pm; the second, more advanced part, is due on 7 December
-at 11pm. You are asked to implement a regular expression matcher. Make
-sure the files you submit can be processed by just calling
-\texttt{scala <<filename.scala>>}.\bigskip
+November at 11pm; the second, more advanced part, is due on 21
+December at 11pm. You are asked to implement a regular expression
+matcher based on derivatives of regular expressions. The reason is
+that regular expression matching in Java can be extremely slow
+sometimes.\bigskip
\noindent
-\textbf{Important:} Do not use any mutable data structures in your
-submission! They are not needed. This menas you cannot use
-\texttt{ListBuffer}s, for example. Do not use \texttt{return} in your
-code! It has a different meaning in Scala, than in Java. Do not use
-\texttt{var}! This declares a mutable variable. Make sure the
-functions you submit are defined on the ``top-level'' of Scala, not
-inside a class or object. Also note that the running time of
-each part will be restricted to a maximum of 360 seconds on my
-laptop.
+\textbf{Important:}
+
+\begin{itemize}
+\item Make sure the files you submit can be processed by just calling\\
+ \mbox{\texttt{scala <<filename.scala>>}} on the commandline. Use the
+ template files provided and do not make any changes to arguments of
+ functions or to any types. You are free to implement any auxiliary
+ function you might need.
+
+\item Do not use any mutable data structures in your
+submissions! They are not needed. This means you cannot create new
+\texttt{Array}s or \texttt{ListBuffer}s, for example.
+\item Do not use \texttt{return} in your code! It has a different
+ meaning in Scala, than in Java.
+
+\item Do not use \texttt{var}! This declares a mutable variable. Only
+ use \texttt{val}!
+
+\item Do not use any parallel collections! No \texttt{.par} therefore!
+ Our testing and marking infrastructure is not set up for it.
+\end{itemize}
+
+\noindent
+Also note that the running time of each part will be restricted to a
+maximum of 360 seconds on my laptop
\subsection*{Disclaimer}
@@ -54,12 +71,13 @@
\end{center}
\noindent
-Why? Knowing how to match regular expressions and strings will
-let you solve a lot of problems that vex other humans. Regular
-expressions are one of the fastest and simplest ways to match patterns
-in text, and are endlessly useful for searching, editing and
-analysing text in all sorts of places. However, you need to be
-fast, otherwise you will stumble over problems such as recently reported at
+Why? Knowing how to match regular expressions and strings will let you
+solve a lot of problems that vex other humans. Regular expressions are
+one of the fastest and simplest ways to match patterns in text, and
+are endlessly useful for searching, editing and analysing data in all
+sorts of places (for example analysing network traffic in order to
+detect security breaches). However, you need to be fast, otherwise you
+will stumble over problems such as recently reported at
{\small
\begin{itemize}
@@ -71,9 +89,10 @@
\subsubsection*{Tasks (file re.scala)}
\begin{itemize}
-\item[(1a)] Implement a function, called \textit{nullable}, by recursion over
- regular expressions. This function tests whether a regular expression can match
- the empty string.
+\item[(1a)] Implement a function, called \textit{nullable}, by
+ recursion over regular expressions. This function tests whether a
+ regular expression can match the empty string, that is given a
+ regular expression it either returns true or false.
\begin{center}
\begin{tabular}{lcl}
@@ -134,7 +153,8 @@
\begin{tabular}{lcll}
$\textit{der}\;a\;r''$ & $=$ & $((\ZERO \cdot b) + \ZERO) \cdot c + \ZERO$ \\
$\textit{der}\;b\;r''$ & $=$ & $((\ZERO \cdot b) + \ZERO) \cdot c + \ZERO$\\
- $\textit{der}\;c\;r''$ & $=$ & $((\ZERO \cdot b) + \ZERO) \cdot c + \ONE$
+ $\textit{der}\;c\;r''$ & $=$ & $((\ZERO \cdot b) + \ZERO) \cdot c + \ONE$ &
+ (is $\textit{nullable}$)
\end{tabular}
\end{center}
@@ -165,12 +185,12 @@
simplifies to just $r_1$. \textbf{Hints:} Regular expressions can be
seen as trees and there are several methods for traversing
trees. One of them corresponds to the inside-out traversal. Also
- remember numerical expressions from school: there you had exprssions
- like $u + \ldots + (1 \cdot x) * \ldots (z + (y \cdot 0)) \ldots$
+ remember numerical expressions from school: there you had expressions
+ like $u + \ldots + (1 \cdot x) - \ldots (z + (y \cdot 0)) \ldots$
and simplification rules that looked very similar to rules
above. You would simplify such numerical expressions by replacing
for example the $y \cdot 0$ by $0$, or $1\cdot x$ by $x$, and then
- look if more rules are applicable. If you organise this
+ look whether more rules are applicable. If you organise the
simplification in an inside-out fashion, it is always clear which
rule should applied next.\\\mbox{}\hfill[1 Mark]
@@ -333,7 +353,7 @@
Although easily implementable in Scala, the idea behind the derivative
function might not so easy to be seen. To understand its purpose
better, assume a regular expression $r$ can match strings of the form
-$c::cs$ (that means strings which start with a character $c$ and have
+$c\!::\!cs$ (that means strings which start with a character $c$ and have
some rest, or tail, $cs$). If you now take the derivative of $r$ with
respect to the character $c$, then you obtain a regular expressions
that can match all the strings $cs$. In other words, the regular