updated programs
authorChristian Urban <christian dot urban at kcl dot ac dot uk>
Fri, 26 Sep 2014 14:06:55 +0100
changeset 258 1e4da6d2490c
parent 257 70c307641d05
child 259 e5f4b8ff23b8
updated programs
handouts/ho01.pdf
handouts/ho01.tex
handouts/ho02.pdf
handouts/ho02.tex
hws/hw01.pdf
hws/hw01.tex
hws/hw02.pdf
hws/hw02.tex
hws/hw03.pdf
hws/hw03.tex
progs/re1.scala
progs/re2.scala
progs/re3.scala
progs/scraper.scala
slides/slides02.pdf
slides/slides02.tex
Binary file handouts/ho01.pdf has changed
--- a/handouts/ho01.tex	Mon Sep 22 13:42:14 2014 +0100
+++ b/handouts/ho01.tex	Fri Sep 26 14:06:55 2014 +0100
@@ -160,7 +160,7 @@
      \draw (1pt,\y) -- (-3pt,\y) node[anchor=east] {\y}; 
 	%labels      
 	\node[below=0.6cm] at (x axis mid) {number of \texttt{a}s};
-	\node[rotate=90,above=0.8cm] at (y axis mid) {time in secs};
+	\node[rotate=90,left=0.9cm] at (y axis mid) {time in secs};
 	%plots
 	\draw[color=blue] plot[mark=*] 
 		file {re-python.data};
@@ -222,7 +222,7 @@
      	\draw (1pt,\y) -- (-3pt,\y) node[anchor=east] {\y}; 
 	%labels      
 	\node[below=0.6cm] at (x axis mid) {number of \texttt{a}s};
-	\node[rotate=90, above=0.8cm] at (y axis mid) {time in secs};
+	\node[rotate=90,left=0.9cm] at (y axis mid) {time in secs};
 
 	%plots
    \draw[color=green] plot[mark=square*, mark options={fill=white} ] 
@@ -247,7 +247,7 @@
 \begin{center}
 \begin{tabular}{r@{\hspace{1mm}}r@{\hspace{1mm}}l@{\hspace{13mm}}l}
   $r$ & $::=$ &   $\varnothing$         & null\\
-        & $\mid$ & $\epsilon$           & empty string / "" / []\\
+        & $\mid$ & $\epsilon$           & empty string / \texttt{""} / []\\
         & $\mid$ & $c$                  & single character\\
         & $\mid$ & $r_1 + r_2$          & alternative / choice\\
         & $\mid$ & $r_1 \cdot r_2$      & sequence\\
@@ -444,7 +444,7 @@
 \varnothing^* \equiv \epsilon^*
 \]
 
-\noindent holds. Such equivalences will be important for our
+\noindent holds? Such equivalences will be important for our
 matching algorithm, because we can use them to simplify 
 regular expressions. 
 
Binary file handouts/ho02.pdf has changed
--- a/handouts/ho02.tex	Mon Sep 22 13:42:14 2014 +0100
+++ b/handouts/ho02.tex	Fri Sep 26 14:06:55 2014 +0100
@@ -7,10 +7,9 @@
 
 \section*{Handout 2}
 
-Having specified what problem our matching algorithm,
-\pcode{match}, is supposed to solve, namely for a given
-regular expression $r$ and string $s$ answer \textit{true} if
-and only if
+Having specified what problem our matching algorithm, \pcode{match},
+is supposed to solve, namely for a given regular expression $r$ and
+string $s$ answer \textit{true} if and only if
 
 \[
 s \in L(r)
@@ -21,10 +20,18 @@
 because in general the set of strings $L$ returns is infinite
 (recall what $L(a^*)$ is). In such cases there is no way we
 can implement an exhaustive test for whether a string is
-member of this set or not.
+member of this set or not. Before we come to the matching 
+algorithm, lets have a closer look at what it means when
+two regular expressions are equivalent.
+
+\subsection*{Regular Expression Equivalences}
 
-The algorithm we will define below consists of two parts. One is the function $nullable$ which takes a
-regular expression as argument and decides whether it can match the empty string (this means it returns a 
+
+\subsection*{Matching Algorithm}
+
+The algorithm we will define below consists of two parts. One is the
+function $nullable$ which takes a regular expression as argument and
+decides whether it can match the empty string (this means it returns a
 boolean). This can be easily defined recursively as follows:
 
 \begin{center}
@@ -46,13 +53,17 @@
 \]
 
 \noindent
-Note on the left-hand side we have a function we can implement; on the right we have its specification. 
+Note on the left-hand side we have a function we can implement; on the
+right we have its specification.
 
-The other function of our matching algorithm calculates a \emph{derivative} of a regular expression. This is a function
-which will take a regular expression, say $r$, and a character, say $c$, as argument and return 
-a new regular expression. Be careful that the intuition behind this function is not so easy to grasp on first
-reading. Essentially this function solves the following problem: if $r$ can match a string of the form
-$c\!::\!s$, what does the regular expression look like that can match just $s$. The definition of this
+The other function of our matching algorithm calculates a
+\emph{derivative} of a regular expression. This is a function which
+will take a regular expression, say $r$, and a character, say $c$, as
+argument and return a new regular expression. Be careful that the
+intuition behind this function is not so easy to grasp on first
+reading. Essentially this function solves the following problem: if
+$r$ can match a string of the form $c\!::\!s$, what does the regular
+expression look like that can match just $s$. The definition of this
 function is as follows:
 
 \begin{center}
@@ -69,33 +80,45 @@
 \end{center}
 
 \noindent
-The first two clauses can be rationalised as follows: recall that $der$ should calculate a regular
-expression, if the ``input'' regular expression can match a string of the form $c\!::\!s$. Since neither
-$\varnothing$ nor $\epsilon$ can match such a string we return $\varnothing$. In the third case
-we have to make a case-distinction: In case the regular expression is $c$, then clearly it can recognise
-a string of the form $c\!::\!s$, just that $s$ is the empty string. Therefore we return the $\epsilon$-regular 
-expression. In the other case we again return $\varnothing$ since no string of the $c\!::\!s$ can be matched.
-The $+$-case is relatively straightforward: all strings of the form $c\!::\!s$ are either matched by the
-regular expression $r_1$ or $r_2$. So we just have to recursively call $der$ with these two regular
-expressions and compose the results again with $+$. The $\cdot$-case is more complicated:
-if $r_1\cdot r_2$ matches a string of the form $c\!::\!s$, then the first part must be matched by $r_1$.
-Consequently, it makes sense to construct the regular expression for $s$ by calling $der$ with $r_1$ and
-``appending'' $r_2$. There is however one exception to this simple rule: if $r_1$ can match the empty
-string, then all of $c\!::\!s$ is matched by $r_2$. So in case $r_1$ is nullable (that is can match the
-empty string) we have to allow the choice $der\,c\,r_2$ for calculating the regular expression that can match 
-$s$. The $*$-case is again simple: if $r^*$ matches a string of the form $c\!::\!s$, then the first part must be
-``matched'' by a single copy of $r$. Therefore we call recursively $der\,c\,r$ and ``append'' $r^*$ in order to 
-match the rest of $s$.
+The first two clauses can be rationalised as follows: recall that
+$der$ should calculate a regular expression, if the ``input'' regular
+expression can match a string of the form $c\!::\!s$. Since neither
+$\varnothing$ nor $\epsilon$ can match such a string we return
+$\varnothing$. In the third case we have to make a case-distinction:
+In case the regular expression is $c$, then clearly it can recognise a
+string of the form $c\!::\!s$, just that $s$ is the empty
+string. Therefore we return the $\epsilon$-regular expression. In the
+other case we again return $\varnothing$ since no string of the
+$c\!::\!s$ can be matched.  The $+$-case is relatively
+straightforward: all strings of the form $c\!::\!s$ are either matched
+by the regular expression $r_1$ or $r_2$. So we just have to
+recursively call $der$ with these two regular expressions and compose
+the results again with $+$. The $\cdot$-case is more complicated: if
+$r_1\cdot r_2$ matches a string of the form $c\!::\!s$, then the first
+part must be matched by $r_1$.  Consequently, it makes sense to
+construct the regular expression for $s$ by calling $der$ with $r_1$
+and ``appending'' $r_2$. There is however one exception to this simple
+rule: if $r_1$ can match the empty string, then all of $c\!::\!s$ is
+matched by $r_2$. So in case $r_1$ is nullable (that is can match the
+empty string) we have to allow the choice $der\,c\,r_2$ for
+calculating the regular expression that can match $s$. The $*$-case is
+again simple: if $r^*$ matches a string of the form $c\!::\!s$, then
+the first part must be ``matched'' by a single copy of $r$. Therefore
+we call recursively $der\,c\,r$ and ``append'' $r^*$ in order to match
+the rest of $s$.
 
-Another way to rationalise the definition of $der$ is to consider the following operation on sets:
+Another way to rationalise the definition of $der$ is to consider the
+following operation on sets:
 
 \[
 Der\,c\,A\;\dn\;\{s\,|\,c\!::\!s \in A\}
 \]
 
 \noindent
-which essentially transforms a set of strings $A$ by filtering out all strings that do not start with $c$ and then
-strips off the $c$ from all the remaining strings. For example suppose $A = \{"f\!oo", "bar", "f\!rak"\}$ then
+which essentially transforms a set of strings $A$ by filtering out all
+strings that do not start with $c$ and then strips off the $c$ from
+all the remaining strings. For example suppose $A = \{"f\!oo", "bar",
+"f\!rak"\}$ then
 \[
 Der\,f\,A = \{"oo", "rak"\}\quad,\quad
 Der\,b\,A = \{"ar"\}  \quad \text{and} \quad
@@ -103,20 +126,23 @@
 \]
  
 \noindent
-Note that in the last case $Der$ is empty, because no string in $A$ starts with $a$. With this operation we can
-state the following property about $der$:
+Note that in the last case $Der$ is empty, because no string in $A$
+starts with $a$. With this operation we can state the following
+property about $der$:
 
 \[
 L(der\,c\,r) = Der\,c\,(L(r))
 \]
 
 \noindent
-This property clarifies what regular expression $der$ calculates, namely take the set of strings
-that $r$ can match (that is $L(r)$), filter out all strings not starting with $c$ and strip off the $c$ from the
-remaining strings---this is exactly the language that $der\,c\,r$ can match.
+This property clarifies what regular expression $der$ calculates,
+namely take the set of strings that $r$ can match (that is $L(r)$),
+filter out all strings not starting with $c$ and strip off the $c$
+from the remaining strings---this is exactly the language that
+$der\,c\,r$ can match.
 
-If we want to find out whether the string $"abc"$ is matched by the regular expression $r$
-then we can iteratively apply $Der$ as follows
+If we want to find out whether the string $"abc"$ is matched by the
+regular expression $r$ then we can iteratively apply $Der$ as follows
 
 \begin{enumerate}
 \item $Der\,a\,(L(r))$
@@ -125,10 +151,12 @@
 \end{enumerate}
 
 \noindent
-In the last step we need to test whether the empty string is in the set. Our matching algorithm will work similarly, 
-just using regular expression instead of sets. For this we need to lift the notion of derivatives from characters to strings. This can be
-done using the following function, taking a string and regular expression as input and a regular expression 
-as output.
+In the last step we need to test whether the empty string is in the
+set. Our matching algorithm will work similarly, just using regular
+expression instead of sets. For this we need to lift the notion of
+derivatives from characters to strings. This can be done using the
+following function, taking a string and regular expression as input
+and a regular expression as output.
 
 \begin{center}
 \begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {\hspace{-10mm}}l@ {}}
@@ -152,9 +180,15 @@
 \]
 
 \noindent
-holds, which means our algorithm satisfies the specification. This algorithm was introduced by
-Janus Brzozowski in 1964. Its main attractions are simplicity and being fast, as well as
-being easily extendable for other regular expressions such as $r^{\{n\}}$, $r^?$, $\sim{}r$ and so on.
+holds, which means our algorithm satisfies the specification. This
+algorithm was introduced by Janus Brzozowski in 1964. Its main
+attractions are simplicity and being fast, as well as being easily
+extendable for other regular expressions such as $r^{\{n\}}$, $r^?$,
+$\sim{}r$ and so on.
+
+\subsection*{The Matching Algorithm in Scala}
+
+
 
 \begin{figure}[p]
 {\lstset{language=Scala}\texttt{\lstinputlisting{../progs/app5.scala}}}
Binary file hws/hw01.pdf has changed
--- a/hws/hw01.tex	Mon Sep 22 13:42:14 2014 +0100
+++ b/hws/hw01.tex	Fri Sep 26 14:06:55 2014 +0100
@@ -37,7 +37,8 @@
 
 \item Assume the concatenation operation of two strings is
       written as $s_1 @ s_2$. Define the operation of
-      \emph{concatenating} two sets of strings.
+      \emph{concatenating}, written $\_ \,@\, \_$ two sets of
+      strings.
 
 \item Assume a set $A$ contains 4 strings and a set $B$ 7
       strings, how many strings are in $A @ B$?
@@ -47,14 +48,14 @@
       $\_^{n+1}$.)
 
 \item How many regular expressions are there to match the
-      string $abc$? (How many if they cannot include
+      string $abc$? How many if they cannot include
       $\epsilon$ and $\varnothing$? How many if they are also
       not allowed to contain stars? How many if they are also
-      not allowed to contain $\_ + \_$?)
+      not allowed to contain $\_ + \_$?
 
 \item When are two regular expressions equivalent? Can you
       think of instances where two regular expressions match
-      teh same strings, but it is not so obvious that they do? For
+      the same strings, but it is not so obvious that they do? For
       example $a + b$ and $b + a$ do not count\ldots they
       obviously match the same strings, namely $[a]$ and $[b]$.
 
Binary file hws/hw02.pdf has changed
--- a/hws/hw02.tex	Mon Sep 22 13:42:14 2014 +0100
+++ b/hws/hw02.tex	Fri Sep 26 14:06:55 2014 +0100
@@ -9,40 +9,53 @@
 \section*{Homework 2}
 
 \begin{enumerate}
-\item Review the first handout about sets of strings and read the second handout. 
-Assuming the alphabet is $\{a, b\}$, decide which of the following equations are true
-in general for arbitrary languages $A$, $B$ and $C$:
+\item Review the first handout about sets of strings and read
+      the second handout. Assuming the alphabet is $\{a, b\}$,
+      decide which of the following equations are true in
+      general for arbitrary languages $A$, $B$ and $C$:
+
 \begin{eqnarray}
-(A \cup B) @ C & = & A @ C \cup B @ C\nonumber\\
-A^* \cup B^* & = & (A \cup B)^*\nonumber\\
-A^* @ A^*  & = & A^*\nonumber\\
-(A \cap B)@ C & = & (A@C) \cap (B@C)\nonumber
+(A \cup B) @ C & =^? & A @ C \cup B @ C\nonumber\\
+A^* \cup B^*   & =^? & (A \cup B)^*\nonumber\\
+A^* @ A^*      & =^? & A^*\nonumber\\
+(A \cap B)@ C  & =^? & (A@C) \cap (B@C)\nonumber
 \end{eqnarray}
 
-\noindent
-In case an equation is true, give an explanation; otherwise give a counter-example.
+\noindent In case an equation is true, give an explanation;
+otherwise give a counter-example.
 
-\item What is the meaning of a regular expression? Give an inductive definition.
+\item What is the meaning of a regular expression? Give an
+      inductive definition.
 
-\item Given the regular expressions $r_1 = \epsilon$ and $r_2 = \varnothing$ and $r_3 = a$.
-How many strings can the regular expressions $r_1^*$, $r_2^*$ and $r_3^*$ each match?
+\item Given the regular expressions $r_1 = \epsilon$ and $r_2
+      = \varnothing$ and $r_3 = a$. How many strings can the
+      regular expressions $r_1^*$, $r_2^*$ and $r_3^*$ each
+      match?
 
 
-\item Give regular expressions for (a) decimal numbers and for (b) binary numbers. 
-(Hint: Observe that the empty string is not a number. Also observe that leading 0s 
-are normally not written.)
+\item Give regular expressions for (a) decimal numbers and for
+      (b) binary numbers. (Hint: Observe that the empty string
+      is not a number. Also observe that leading 0s are
+      normally not written.)
 
-\item Decide whether the following two regular expressions are equivalent $(\epsilon + a)^* \equiv^? a^*$ and 
-$(a \cdot b)^* \cdot a \equiv^? a \cdot (b \cdot a)^*$.
+\item Decide whether the following two regular expressions are
+      equivalent $(\epsilon + a)^* \equiv^? a^*$ and $(a \cdot
+      b)^* \cdot a \equiv^? a \cdot (b \cdot a)^*$.
 
-\item Given the regular expression $r = (a \cdot b + b)^*$. Compute what the derivative of $r$ is with respect to
-$a$ and $b$. Is $r$ nullable?
+\item Given the regular expression $r = (a \cdot b + b)^*$.
+      Compute what the derivative of $r$ is with respect to
+      $a$, $b$ and $c$. Is $r$ nullable?
 
 \item Prove that for all regular expressions $r$ we have
-\begin{center}
-$\text{nullable}(r)$ \quad if and only if \quad $\texttt{""} \in L(r)$
+      
+\begin{center} 
+  $\textit{nullable}(r) \quad \text{if and only if} 
+  \quad [] \in L(r)$ 
 \end{center}
 
+  Write down clearly in each case what you need to prove and
+  what are the assumptions. 
+  
 \end{enumerate}
 
 \end{document}
Binary file hws/hw03.pdf has changed
--- a/hws/hw03.tex	Mon Sep 22 13:42:14 2014 +0100
+++ b/hws/hw03.tex	Fri Sep 26 14:06:55 2014 +0100
@@ -11,34 +11,41 @@
 \begin{enumerate}
 \item What is a regular language?
 
-\item Assume you have an alphabet consisting of the letters $a$, $b$ and $c$ only.
-(1) Find a regular expression that recognises the two strings $ab$ and $ac$. (2)
-Find a regular expression that matches all strings \emph{except} these two strings.
-Note, you can only use regular expressions of the form 
+\item Assume you have an alphabet consisting of the letters
+      $a$, $b$ and $c$ only. (1) Find a regular expression
+      that recognises the two strings $ab$ and $ac$. (2) Find
+      a regular expression that matches all strings
+      \emph{except} these two strings. Note, you can only use
+      regular expressions of the form 
+      
+      \begin{center} $r ::=
+      \varnothing \;|\; \epsilon \;|\; c \;|\; r_1 + r_2 \;|\;
+      r_1 \cdot r_2 \;|\; r^*$ 
+      \end{center}
+
+\item Define the function \textit{zeroable} which takes a
+      regular expression as argument and returns a boolean.
+      The function should satisfy the following property:
+
 \begin{center}
-$r ::= \varnothing \;|\; \epsilon \;|\; c  \;|\; r_1 + r_2  \;|\; r_1 \cdot r_2 \;|\; r^*$
-\end{center}
-
-\item Define the function $zeroable$ which takes a regular expression as argument
-and returns a boolean. The 
-function should satisfy the following property:
-\begin{center}
-$zeroable(r)$ \;if and only if\; $L(r) = \varnothing$
+$\textit{zeroable(r)} \;\text{if and only if}\; L(r) = \varnothing$
 \end{center}
 
 \item Define the tokens and regular expressions for a language
-consisting of numbers, left-parenthesis (, right-parenthesis ),
-identifiers and the operations $+$, $-$ and $*$. Can the following strings 
-in this language be lexed?
+      consisting of numbers, left-parenthesis $($,
+      right-parenthesis $)$, identifiers and the operations $+$,
+      $-$ and $*$. Can the following strings in this language
+      be lexed?
 
 \begin{itemize}
-\item \texttt{"}$(a + 3) * b$\texttt{"}
-\item \texttt{"}$)()++ -33$\texttt{"}
-\item \texttt{"}$(a / 3) * 3$\texttt{"}
+  \item $(a + 3) * b$
+  \item $)()++ -33$
+  \item $(a / 3) * 3$
 \end{itemize}
 
+In case they can, can you give the corresponding token
+sequences.
 
-In case they can, can you give the corresponding token sequences.
 \end{enumerate}
 
 
--- a/progs/re1.scala	Mon Sep 22 13:42:14 2014 +0100
+++ b/progs/re1.scala	Fri Sep 26 14:06:55 2014 +0100
@@ -1,7 +1,4 @@
-import scala.language.implicitConversions    
-
 abstract class Rexp
-
 case object NULL extends Rexp
 case object EMPTY extends Rexp
 case class CHAR(c: Char) extends Rexp
@@ -9,15 +6,6 @@
 case class SEQ(r1: Rexp, r2: Rexp) extends Rexp 
 case class STAR(r: Rexp) extends Rexp 
 
-// some convenience for typing in regular expressions
-def charlist2rexp(s : List[Char]) : Rexp = s match {
-  case Nil => EMPTY
-  case c::Nil => CHAR(c)
-  case c::s => SEQ(CHAR(c), charlist2rexp(s))
-}
-implicit def string2rexp(s : String) : Rexp = charlist2rexp(s.toList)
-
-
 // nullable function: tests whether the regular 
 // expression can recognise the empty string
 def nullable (r: Rexp) : Boolean = r match {
@@ -50,12 +38,13 @@
 // main matcher function
 def matcher(r: Rexp, s: String) : Boolean = nullable(ders(s.toList, r))
 
-//example
+//example from the homework
 //val r = STAR(ALT(SEQ(CHAR('a'), CHAR('b')), CHAR('b')))
 //der('a', r)
 //der('b', r)
+//der('c', r)
 
-//one or zero
+//optional: one or zero times
 def OPT(r: Rexp) = ALT(r, EMPTY)
 
 //n-times
@@ -65,8 +54,10 @@
   case n => SEQ(r, NTIMES(r, n - 1))
 }
 
-def EVIL(n: Int) = SEQ(NTIMES(OPT("a"), n), NTIMES("a", n))
+// the evil regular expression  a?{n} a{n}
+def EVIL(n: Int) = SEQ(NTIMES(OPT(CHAR('a')), n), NTIMES(CHAR('a'), n))
 
+//for measuring time
 def time_needed[T](i: Int, code: => T) = {
   val start = System.nanoTime()
   for (j <- 1 to i) code
@@ -77,5 +68,3 @@
 for (i <- 1 to 29) {
   println(i + ": " + "%.5f".format(time_needed(1, matcher(EVIL(i), "a" * i))))
 }
-
-
--- a/progs/re2.scala	Mon Sep 22 13:42:14 2014 +0100
+++ b/progs/re2.scala	Fri Sep 26 14:06:55 2014 +0100
@@ -1,26 +1,12 @@
-import scala.language.implicitConversions    
-    
 abstract class Rexp 
-
 case object NULL extends Rexp
 case object EMPTY extends Rexp
 case class CHAR(c: Char) extends Rexp
 case class ALT(r1: Rexp, r2: Rexp) extends Rexp 
 case class SEQ(r1: Rexp, r2: Rexp) extends Rexp 
 case class STAR(r: Rexp) extends Rexp 
-case class NTIMES(r: Rexp, n: Int) extends Rexp 
+case class NTIMES(r: Rexp, n: Int) extends Rexp //explicit constructor
 
-// some convenience for typing in regular expressions
-def charlist2rexp(s : List[Char]) : Rexp = s match {
-  case Nil => EMPTY
-  case c::Nil => CHAR(c)
-  case c::s => SEQ(CHAR(c), charlist2rexp(s))
-}
-implicit def string2rexp(s : String) : Rexp = charlist2rexp(s.toList)
-
-
-// nullable function: tests whether the regular 
-// expression can recognise the empty string
 def nullable (r: Rexp) : Boolean = r match {
   case NULL => false
   case EMPTY => true
@@ -31,7 +17,6 @@
   case NTIMES(r, i) => if (i == 0) true else nullable(r)
 }
 
-// derivative of a regular expression w.r.t. a character
 def der (c: Char, r: Rexp) : Rexp = r match {
   case NULL => NULL
   case EMPTY => NULL
@@ -45,20 +30,18 @@
     if (i == 0) NULL else SEQ(der(c, r), NTIMES(r, i - 1))
 }
 
-// derivative w.r.t. a string (iterates der)
 def ders (s: List[Char], r: Rexp) : Rexp = s match {
   case Nil => r
   case c::s => ders(s, der(c, r))
 }
 
-// main matcher function
 def matcher(r: Rexp, s: String) : Boolean = nullable(ders(s.toList, r))
 
 
-//one or zero
+//optional: one or zero times
 def OPT(r: Rexp) = ALT(r, EMPTY)
 
-def EVIL(n: Int) = SEQ(NTIMES(OPT("a"), n), NTIMES("a", n))
+def EVIL(n: Int) = SEQ(NTIMES(OPT(CHAR('a')), n), NTIMES(CHAR('a'), n))
 
 def time_needed[T](i: Int, code: => T) = {
   val start = System.nanoTime()
@@ -71,7 +54,7 @@
   println(i + ": " + "%.5f".format(time_needed(1, matcher(EVIL(i), "a" * i))))
 }
 
-
+//a bit bolder test
 for (i <- 1 to 1000 by 50) {
   println(i + " " + "%.5f".format(time_needed(1, matcher(EVIL(i), "a" * i))))
 }
--- a/progs/re3.scala	Mon Sep 22 13:42:14 2014 +0100
+++ b/progs/re3.scala	Fri Sep 26 14:06:55 2014 +0100
@@ -1,57 +1,37 @@
-import scala.language.implicitConversions    
-    
-abstract class Rexp {
-  def simp : Rexp = this
-}
-
+abstract class Rexp 
 case object NULL extends Rexp
 case object EMPTY extends Rexp
 case class CHAR(c: Char) extends Rexp
-case class ALT(r1: Rexp, r2: Rexp) extends Rexp {
-  override def simp = (r1.simp, r2.simp) match {
-    case (NULL, r) => r
-    case (r, NULL) => r
-    case (r, EMPTY) => if (nullable(r)) r else ALT(r, EMPTY)
-    case (EMPTY, r) => if (nullable(r)) r else ALT(r, EMPTY)
-    case (r1, r2) => if (r1 == r2) r1 else ALT(r1, r2)
+case class ALT(r1: Rexp, r2: Rexp) extends Rexp
+case class SEQ(r1: Rexp, r2: Rexp) extends Rexp 
+case class STAR(r: Rexp) extends Rexp 
+case class NTIMES(r: Rexp, n: Int) extends Rexp 
+
+def simp(r: Rexp): Rexp = r match {
+  case ALT(r1, r2) => {
+    val r1s = simp(r1)
+    val r2s = simp(r2)
+    (r1s, r2s) match {
+      case (NULL, _) => r2s
+      case (_, NULL) => r1s
+      case _ => if (r1s == r2s) r1s else ALT(r1s, r2s)
+    }
   }
-}
-case class SEQ(r1: Rexp, r2: Rexp) extends Rexp {
-  override def simp = (r1.simp, r2.simp) match {
-    case (NULL, _) => NULL
-    case (_, NULL) => NULL
-    case (EMPTY, r) => r
-    case (r, EMPTY) => r
-    case (r1, r2) => SEQ(r1, r2)
+  case SEQ(r1, r2) => {
+    val r1s = simp(r1)
+    val r2s = simp(r2)
+    (r1s, r2s) match {
+      case (NULL, _) => NULL
+      case (_, NULL) => NULL
+      case (EMPTY, _) => r2s
+      case (_, EMPTY) => r1s
+      case _ => SEQ(r1s, r2s)
+    }
   }
-}
-case class STAR(r: Rexp) extends Rexp {
-  override def simp = r.simp match {
-    case NULL => EMPTY
-    case EMPTY => EMPTY
-    case r => STAR(r)
-  }
-}
-case class NTIMES(r: Rexp, n: Int) extends Rexp {
-  override def simp = if (n == 0) EMPTY else 
-    r.simp match {
-      case NULL => NULL
-      case EMPTY => EMPTY
-      case r => NTIMES(r, n)
-    }
+  case NTIMES(r, n) => NTIMES(simp(r), n)    
+  case r => r
 }
 
-// some convenience for typing in regular expressions
-def charlist2rexp(s : List[Char]) : Rexp = s match {
-  case Nil => EMPTY
-  case c::Nil => CHAR(c)
-  case c::s => SEQ(CHAR(c), charlist2rexp(s))
-}
-implicit def string2rexp(s : String) : Rexp = charlist2rexp(s.toList)
-
-
-// nullable function: tests whether the regular 
-// expression can recognise the empty string
 def nullable (r: Rexp) : Boolean = r match {
   case NULL => false
   case EMPTY => true
@@ -62,7 +42,6 @@
   case NTIMES(r, i) => if (i == 0) true else nullable(r)
 }
 
-// derivative of a regular expression w.r.t. a character
 def der (c: Char, r: Rexp) : Rexp = r match {
   case NULL => NULL
   case EMPTY => NULL
@@ -76,20 +55,18 @@
     if (i == 0) NULL else SEQ(der(c, r), NTIMES(r, i - 1))
 }
 
-// derivative w.r.t. a string (iterates der)
 def ders (s: List[Char], r: Rexp) : Rexp = s match {
   case Nil => r
-  case c::s => ders(s, der(c, r).simp)
+  case c::s => ders(s, simp(der(c, r)))
 }
 
-// main matcher function
 def matcher(r: Rexp, s: String) : Boolean = nullable(ders(s.toList, r))
 
 
 //one or zero
 def OPT(r: Rexp) = ALT(r, EMPTY)
 
-def EVIL(n: Int) = SEQ(NTIMES(OPT("a"), n), NTIMES("a", n))
+def EVIL(n: Int) = SEQ(NTIMES(OPT(CHAR('a')), n), NTIMES(CHAR('a'), n))
 
 def time_needed[T](i: Int, code: => T) = {
   val start = System.nanoTime()
--- a/progs/scraper.scala	Mon Sep 22 13:42:14 2014 +0100
+++ b/progs/scraper.scala	Fri Sep 26 14:06:55 2014 +0100
@@ -24,7 +24,7 @@
 val wr = new OutputStreamWriter(conn.getOutputStream())
 
 //possible date ranges
-wr.write("Fdate=2012-8-24&Tdate=2012-09-25")
+wr.write("Fdate=2011-6-24&Tdate=2011-09-25")
 //wr.write("Fdate=2011-8-24&Tdate=2011-09-25")
 //wr.write("Fdate=2001-9-18&Tdate=2012-09-25")
 wr.flush
Binary file slides/slides02.pdf has changed
--- a/slides/slides02.tex	Mon Sep 22 13:42:14 2014 +0100
+++ b/slides/slides02.tex	Fri Sep 26 14:06:55 2014 +0100
@@ -1,34 +1,48 @@
 \documentclass[dvipsnames,14pt,t]{beamer}
-\usepackage{beamerthemeplaincu}
-\usepackage[absolute,overlay]{textpos}
-\usepackage{ifthen}
-\usepackage{tikz}
-\usepackage{pgf}
-\usepackage{calc} 
-\usepackage{ulem}
-\usepackage{courier}
-\usepackage{listings}
-\renewcommand{\uline}[1]{#1}
-\usetikzlibrary{arrows}
-\usetikzlibrary{automata}
-\usetikzlibrary{shapes}
-\usetikzlibrary{shadows}
-\usetikzlibrary{positioning}
-\usetikzlibrary{plotmarks}
-\usetikzlibrary{calc}
-\usepackage{graphicx} 
+\usepackage{../slides}
+\usepackage{../graphics}
 \usepackage{../langs}
 \usepackage{../data}
 
-\makeatletter
-\lst@CCPutMacro\lst@ProcessOther {"2D}{\lst@ttfamily{-{}}{-{}}}
-\@empty\z@\@empty
-\makeatother
+%\usepackage{beamerthemeplaincu}
+%\usepackage[absolute,overlay]{textpos}
+%\usepackage{ifthen}
+%\usepackage{tikz}
+%\usepackage{pgf}
+%\usepackage{calc} 
+%\usepackage{ulem}
+%\usepackage{courier}
+%\usepackage{listings}
+%\renewcommand{\uline}[1]{#1}
+%\usetikzlibrary{arrows}
+%\usetikzlibrary{automata}
+%\usetikzlibrary{shapes}
+%\usetikzlibrary{shadows}
+%\usetikzlibrary{positioning}
+%\usetikzlibrary{plotmarks}
+%\usetikzlibrary{calc}
+%\usepackage{graphicx} 
+%\usepackage{../langs}
+%\usepackage{../data}
+
+%\makeatletter
+%\lst@CCPutMacro\lst@ProcessOther {"2D}{\lst@ttfamily{-{}}{-{}}}
+%\@empty\z@\@empty
+%\makeatother
+
+\hfuzz=220pt 
+
+\lstset{language=Scala,
+        style=mystyle,
+        numbersep=0pt,
+        numbers=none,
+        xleftmargin=0mm}
+
+\newcommand{\bl}[1]{\textcolor{blue}{#1}}     
 
 % beamer stuff 
-\renewcommand{\slidecaption}{AFL 02, King's College London, 2.~October 2013}
-\newcommand{\bl}[1]{\textcolor{blue}{#1}}       
-\newcommand{\dn}{\stackrel{\mbox{\scriptsize def}}{=}}% for definitions
+\renewcommand{\slidecaption}{AFL 02, King's College London}
+
 
 \begin{document}