--- a/handouts/ho01.tex Wed Jun 01 13:04:06 2016 +0100
+++ b/handouts/ho01.tex Tue Jun 14 11:41:48 2016 +0100
@@ -34,7 +34,7 @@
This module is about text processing, be it for web-crawlers,
compilers, dictionaries, DNA-data and so on. When looking for
-a particular string in a large text we can use the
+a particular string, like $abc$ in a large text we can use the
Knuth-Morris-Pratt algorithm, which is currently the most
efficient general string search algorithm. But often we do
\emph{not} just look for a particular string, but for string
@@ -67,14 +67,14 @@
\noindent where the first part matches one or more lowercase
letters (\pcode{a-z}), digits (\pcode{0-9}), underscores, dots
-or hyphens. The \pcode{+} ensures the ``one or more''. Then
-comes the \pcode{@}-sign, followed by the domain name which
-must be one or more lowercase letters, digits, underscores,
-dots or hyphens. Note there cannot be an underscore in the
-domain name. Finally there must be a dot followed by the
-toplevel domain. This toplevel domain must be 2 to 6 lowercase
-letters including the dot. Example strings which follow this
-pattern are:
+and hyphens. The \pcode{+} at the end of the brackets ensures
+the ``one or more''. Then comes the \pcode{@}-sign, followed
+by the domain name which must be one or more lowercase
+letters, digits, underscores, dots or hyphens. Note there
+cannot be an underscore in the domain name. Finally there must
+be a dot followed by the toplevel domain. This toplevel domain
+must be 2 to 6 lowercase letters including the dot. Example
+strings which follow this pattern are:
\begin{lstlisting}[language={},numbers=none,keywordstyle=\color{black}]
niceandsimple@example.org
@@ -113,10 +113,11 @@
are \pcode{x}, \pcode{foo}, \pcode{foo_bar_1}, \pcode{A_very_42_long_object_name},
but not \pcode{_i} and also not \pcode{4you}.
-Many programming language offer libraries that can be used to
+Many programming languages offer libraries that can be used to
validate such strings against regular expressions. Also there
are some common, and I am sure very familiar, ways of how to
-construct regular expressions. For example in Scala we have:
+construct regular expressions. For example in Scala we have
+a library implementing the following regular expressions:
\begin{center}
\begin{tabular}{lp{9cm}}
@@ -177,10 +178,11 @@
Regular expressions were introduced by Kleene in the 1950ies
and they have been object of intense study since then. They
-are nowadays pretty much ubiquitous in computer science. I am
-sure you have come across them before. Why on earth then is
-there any interest in studying them again in depth in this
-module? Well, one answer is in the following graph about
+are nowadays pretty much ubiquitous in computer science. There
+are many libraries implementing regular expressions. I am sure
+you have come across them before (remember PRA?). Why on earth
+then is there any interest in studying them again in depth in
+this module? Well, one answer is in the following graph about
regular expression matching in Python and in Ruby.
\begin{center}
@@ -214,7 +216,7 @@
Ruby is even slightly worse.\footnote{In this example Ruby
uses the slightly different regular expression
\texttt{a?a?a?...a?a?aaa...aa}, where the \texttt{a?} and
-\texttt{a} each occur $n$ times. More test results can be
+\texttt{a} each occur $n$ times. More such test cases can be
found at \url{http://www.computerbytesman.com/redos/}.}
Admittedly, this regular expression is carefully chosen to
exhibit this exponential behaviour, but similar ones occur
@@ -285,7 +287,7 @@
\noindent Because we overload our notation, there are some
subtleties you should be aware of. When regular expressions
-are referred to then $\ZERO$ (in bold font) does not stand for
+are referred to, then $\ZERO$ (in bold font) does not stand for
the number zero: rather it is a particular pattern that does
not match any string. Similarly, in the context of regular
expressions, $\ONE$ does not stand for the number one but for
@@ -368,7 +370,7 @@
If you prefer to think in terms of the implementation
of regular expressions in Scala, the constructors and
classes relate as follows\footnote{More about Scala is
-in the handout about A Crash-Course on Scala.}
+in the handout about \emph{A Crash-Course on Scala}.}
\begin{center}
\begin{tabular}{rcl}
@@ -663,7 +665,8 @@
\end{minipage}
\end{center}
-\caption{Nothing that can be said this\ldots\label{monster}}
+\caption{Nothing that can be said about this regular
+expression\ldots\label{monster}}
\end{figure}