--- a/handouts/ho03.tex	Tue Sep 26 14:07:29 2017 +0100
+++ b/handouts/ho03.tex	Tue Sep 26 14:08:49 2017 +0100
@@ -5,39 +5,49 @@
 
 
 \begin{document}
+\fnote{\copyright{} Christian Urban, King's College London, 2014, 2015, 2016, 2017}
 
-\section*{Handout 3 (Automata)}
+\section*{Handout 3 (Finite Automata)}
+
 
-Every formal language course I know of bombards you first with
-automata and then to a much, much smaller extend with regular
-expressions. As you can see, this course is turned upside
-down: regular expressions come first. The reason is that
-regular expressions are easier to reason about and the notion
-of derivatives, although already quite old, only became more
-widely known rather recently. Still let us in this lecture
-have a closer look at automata and their relation to regular
-expressions. This will help us with understanding why the
-regular expression matchers in Python, Ruby and Java are so slow
-with certain regular expressions. The central definition
-is:\medskip 
+Every formal language and compiler course I know of bombards you first
+with automata and then to a much, much smaller extent with regular
+expressions. As you can see, this course is turned upside down:
+regular expressions come first. The reason is that regular expressions
+are easier to reason about and the notion of derivatives, although
+already quite old, only became more widely known rather
+recently. Still, let us in this lecture have a closer look at automata
+and their relation to regular expressions. This will help us with
+understanding why the regular expression matchers in Python, Ruby and
+Java are so slow with certain regular expressions. On the way we will
+also see the limitations of regular
+expressions. Unfortunately, they cannot be used for \emph{everything}.
+
+
+\subsection*{Deterministic Finite Automata}
+
+Let's start\ldots{} the central definition is:\medskip
 
 \noindent 
 A \emph{deterministic finite automaton} (DFA), say $A$, is
-defined by a four-tuple written $A(Q, q_0, F, \delta)$ where
+given by a five-tuple written ${\cal A}(\varSigma, Qs, Q_0, F, \delta)$ where
 
 \begin{itemize}
-\item $Q$ is a finite set of states,
-\item $q_0 \in Q$ is the start state,
-\item $F \subseteq Q$ are the accepting states, and
+\item $\varSigma$ is an alphabet,  
+\item $Qs$ is a finite set of states,
+\item $Q_0 \in Qs$ is the start state,
+\item $F \subseteq Qs$ are the accepting states, and
 \item $\delta$ is the transition function.
 \end{itemize}
 
-\noindent The transition function determines how to
-``transition'' from one state to the next state with respect
-to a character. We have the assumption that these transition
-functions do not need to be defined everywhere: so it can be
-the case that given a character there is no next state, in
-which case we need to raise a kind of ``failure exception''. A
+\noindent I am sure you have seen this definition before. The
+transition function determines how to ``transition'' from one state to
+the next state with respect to a character. We have the assumption
+that these transition functions do not need to be defined everywhere:
+so it can be the case that given a character there is no next state,
+in which case we need to raise a kind of ``failure exception''.  That
+means we actually have \emph{partial} functions as
+transitions---see the Scala implementation for DFAs later on.  A
 typical example of a DFA is
 
 \begin{center}
@@ -45,114 +55,324 @@
                     every state/.style={minimum size=0pt,
                     inner sep=2pt,draw=blue!50,very thick,
                     fill=blue!20},scale=2]
-\node[state,initial]  (q_0)  {$q_0$};
-\node[state] (q_1) [right=of q_0] {$q_1$};
-\node[state] (q_2) [below right=of q_0] {$q_2$};
-\node[state] (q_3) [right=of q_2] {$q_3$};
-\node[state, accepting] (q_4) [right=of q_1] {$q_4$};
-\path[->] (q_0) edge node [above]  {$a$} (q_1);
-\path[->] (q_1) edge node [above]  {$a$} (q_4);
-\path[->] (q_4) edge [loop right] node  {$a, b$} ();
-\path[->] (q_3) edge node [right]  {$a$} (q_4);
-\path[->] (q_2) edge node [above]  {$a$} (q_3);
-\path[->] (q_1) edge node [right]  {$b$} (q_2);
-\path[->] (q_0) edge node [above]  {$b$} (q_2);
-\path[->] (q_2) edge [loop left] node  {$b$} ();
-\path[->] (q_3) edge [bend left=95, looseness=1.3] node [below]  {$b$} (q_0);
+\node[state,initial]  (Q_0)  {$Q_0$};
+\node[state] (Q_1) [right=of Q_0] {$Q_1$};
+\node[state] (Q_2) [below right=of Q_0] {$Q_2$};
+\node[state] (Q_3) [right=of Q_2] {$Q_3$};
+\node[state, accepting] (Q_4) [right=of Q_1] {$Q_4$};
+\path[->] (Q_0) edge node [above]  {$a$} (Q_1);
+\path[->] (Q_1) edge node [above]  {$a$} (Q_4);
+\path[->] (Q_4) edge [loop right] node  {$a, b$} ();
+\path[->] (Q_3) edge node [right]  {$a$} (Q_4);
+\path[->] (Q_2) edge node [above]  {$a$} (Q_3);
+\path[->] (Q_1) edge node [right]  {$b$} (Q_2);
+\path[->] (Q_0) edge node [above]  {$b$} (Q_2);
+\path[->] (Q_2) edge [loop left] node  {$b$} ();
+\path[->] (Q_3) edge [bend left=95, looseness=1.3] node [below]  {$b$} (Q_0);
 \end{tikzpicture}
 \end{center}
 
-\noindent In this graphical notation, the accepting state
-$q_4$ is indicated with double circles. Note that there can be
-more than one accepting state. It is also possible that a DFA
-has no accepting states at all, or that the starting state is
-also an accepting state. In the case above the transition
-function is defined everywhere and can be given as a table as
-follows:
+\noindent In this graphical notation, the accepting state $Q_4$ is
+indicated with double circles. Note that there can be more than one
+accepting state. It is also possible that a DFA has no accepting
+state at all, or that the starting state is also an accepting
+state. In the case above the transition function is defined everywhere
+and can also be given as a table as follows:
 
 \[
 \begin{array}{lcl}
-(q_0, a) &\rightarrow& q_1\\
-(q_0, b) &\rightarrow& q_2\\
-(q_1, a) &\rightarrow& q_4\\
-(q_1, b) &\rightarrow& q_2\\
-(q_2, a) &\rightarrow& q_3\\
-(q_2, b) &\rightarrow& q_2\\
-(q_3, a) &\rightarrow& q_4\\
-(q_3, b) &\rightarrow& q_0\\
-(q_4, a) &\rightarrow& q_4\\
-(q_4, b) &\rightarrow& q_4\\
+(Q_0, a) &\rightarrow& Q_1\\
+(Q_0, b) &\rightarrow& Q_2\\
+(Q_1, a) &\rightarrow& Q_4\\
+(Q_1, b) &\rightarrow& Q_2\\
+(Q_2, a) &\rightarrow& Q_3\\
+(Q_2, b) &\rightarrow& Q_2\\
+(Q_3, a) &\rightarrow& Q_4\\
+(Q_3, b) &\rightarrow& Q_0\\
+(Q_4, a) &\rightarrow& Q_4\\
+(Q_4, b) &\rightarrow& Q_4\\
 \end{array}
 \]
 
+\noindent
+Please check that this table represents the same transition function
+as the graph above.
+
 We need to define the notion of what language is accepted by
 an automaton. For this we lift the transition function
 $\delta$ from characters to strings as follows:
 
 \[
 \begin{array}{lcl}
-\hat{\delta}(q, [])        & \dn & q\\
-\hat{\delta}(q, c\!::\!s) & \dn & \hat{\delta}(\delta(q, c), s)\\
+\widehat{\delta}(q, [])        & \dn & q\\
+\widehat{\delta}(q, c\!::\!s) & \dn & \widehat{\delta}(\delta(q, c), s)\\
 \end{array}
 \]
 
 \noindent This lifted transition function is often called
-``delta-hat''. Given a string, we start in the starting state
-and take the first character of the string, follow to the next
-sate, then take the second character and so on. Once the
-string is exhausted and we end up in an accepting state, then
-this string is accepted by the automaton. Otherwise it is not
-accepted. So $s$ is in the \emph{language accepted by the
-automaton} $A(Q, q_0, F, \delta)$ iff
+\emph{delta-hat}. Given a string, we start in the starting state and
+take the first character of the string, follow to the next state, then
+take the second character and so on. Once the string is exhausted and
+we end up in an accepting state, then this string is accepted by the
+automaton. Otherwise it is not accepted. This also means that if along
+the way we hit the case where the transition function $\delta$ is not
+defined, we need to raise an error. In our implementation we will deal
+with this case elegantly by using Scala's \texttt{Try}.  Summing up: a
+string $s$ is in the \emph{language accepted by the automaton} ${\cal
+  A}(\varSigma, Qs, Q_0, F, \delta)$ iff
 
 \[
-\hat{\delta}(q_0, s) \in F 
+\widehat{\delta}(Q_0, s) \in F 
 \]
 
-\noindent I let you think about a definition that describes
-the set of strings accepted by an automaton.
-  
+\noindent I let you think about a definition that describes the set of
+all strings accepted by a deterministic finite automaton.
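+
+To see the delta-hat construction in code, here is a small sketch in
+Scala of the lifting and the acceptance test. It is only an
+illustration with its own definitions (including the \texttt{:=>}
+alias for partial functions); the handout's actual code is in
+Figure~\ref{dfa}.
+
+{\small\begin{lstlisting}[language=Scala]
+import scala.util.Try
+
+// alias for partial functions, as used in the handout's code
+type :=>[A, B] = PartialFunction[A, B]
+
+// delta-hat: lift a (partial) transition function to strings
+def deltas[A, C](delta: (A, C) :=> A)(q: A, s: List[C]): A = s match {
+  case Nil => q                                     // empty string: stay in q
+  case c :: cs => deltas(delta)(delta((q, c)), cs)  // consume c, then recurse
+}
+
+// a string is accepted if delta-hat ends in a final state; an
+// undefined transition is caught by Try and counts as rejection
+def accepts[A, C](delta: (A, C) :=> A, start: A, fins: A => Boolean)(s: List[C]): Boolean =
+  Try(fins(deltas(delta)(start, s))).getOrElse(false)
+\end{lstlisting}}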
+
+\begin{figure}[p]
+\small
+\lstinputlisting[numbers=left]{../progs/display/dfa.scala}
+\caption{A Scala implementation of DFAs using partial functions.
+  Note some subtleties: \texttt{deltas} implements the delta-hat
+  construction by lifting the (partial) transition  function to lists
+  of characters. Since \texttt{delta} is given as a partial function,
+  it can obviously go ``wrong'' in which case the \texttt{Try} in
+  \texttt{accepts} catches the error and returns \texttt{false}---that
+  means the string is not accepted.  The example \texttt{delta} in
+  Lines 28--38 implements the DFA example shown earlier in the
+  handout.\label{dfa}}
+\end{figure}
+
+My take on an implementation of DFAs in Scala is given in
+Figure~\ref{dfa}. As you can see, there are many features of the
+mathematical definition that are quite closely reflected in the
+code. In the DFA-class, there is a starting state, called
+\texttt{start}, with the polymorphic type \texttt{A}.  There is a
+partial function \texttt{delta} for specifying the transitions---these
+partial functions take a state (of polymorphic type \texttt{A}) and an
+input (of polymorphic type \texttt{C}) and produce a new state (of
+type \texttt{A}). For the moment it is OK to assume that \texttt{A} is
+some arbitrary type for states and the input is just characters.  (The
+reason for not having concrete types, but polymorphic types for the
+states and the input of DFAs will become clearer later on.)
+
+The DFA-class also has an argument for specifying final states. In the
+implementation it is not a set of states, as in the mathematical
+definition, but a function from states to booleans (this function is
+supposed to return true whenever a state is final; false
+otherwise). While this boolean function is different from a set of
+states, Scala allows us to use sets for such functions (see Line 40 where
+the DFA is initialised). Again it will become clear later on why I use
+functions for final states, rather than sets.
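+
+The reason this works is a small but useful fact about Scala: a
+\texttt{Set[A]} can be used wherever a function \texttt{A => Boolean}
+is expected, because sets implement this function type (the membership
+test). A tiny illustration, with a made-up state name:
+
+{\small\begin{lstlisting}[language=Scala]
+// a set used as a predicate: true exactly for the elements of the set
+val fins: String => Boolean = Set("Q4")
+
+fins("Q4")   // true
+fins("Q0")   // false
+\end{lstlisting}}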
 
-While with DFAs it will always be clear that given a character
-what the next state is (potentially none), it will be useful
-to relax this restriction. That means we have several
-potential successor states. We even allow ``silent
-transitions'', also called epsilon-transitions. They allow us
-to go from one state to the next without having a character
-consumed. We label such silent transition with the letter
-$\epsilon$. The resulting construction is called a
-\emph{non-deterministic finite automaton} (NFA) given also as
-a four-tuple $A(Q, q_0, F, \rho)$ where
+The most important point in the implementation is that I use Scala's
+partial functions for representing the transitions; alternatives would
+have been \texttt{Maps} or even \texttt{Lists}. One of the main
+advantages of using partial functions is that transitions can be quite
+nicely defined by a series of \texttt{case} statements (see Lines 28
+-- 38 for an example). If you need to represent an automaton with a
+sink state (catch-all-state), you can use Scala's pattern matching and
+write something like
+
+{\small\begin{lstlisting}[language=Scala]
+abstract class State
+...
+case object Sink extends State
+
+val delta : (State, Char) :=> State = 
+  { case (S0, 'a') => S1
+    case (S1, 'a') => S2
+    case _ => Sink
+  }
+\end{lstlisting}}  
+
+\noindent I let you think what the corresponding DFA looks like in the
+graph notation. Also, I suggest you tinker with the Scala code in
+order to define the DFA that does not accept any string at all. 
+
+Finally, I let you ponder whether this is a good implementation of
+DFAs or not. In doing so I hope you notice that the $\varSigma$ and
+$Qs$ components (the alphabet and the finite set of states,
+respectively) are missing from the class definition. This means that
+the implementation allows you to do some fishy things you are not
+meant to do with DFAs. Which fishy things could that be?
+
+
+
+\subsection*{Non-Deterministic Finite Automata}
+
+Remember we want to find out what the relation is between regular
+expressions and automata. To do this with DFAs is a bit unwieldy.
+While with DFAs it is always clear, given a state and a character,
+what the next state is (potentially none), it will be convenient to
+relax this restriction. That means we allow states to have several
+potential successor states. We even allow more than one starting
+state. The resulting construction is called a \emph{Non-Deterministic
+  Finite Automaton} (NFA) given also as a five-tuple ${\cal
+  A}(\varSigma, Qs, Q_{0s}, F, \rho)$ where
 
 \begin{itemize}
-\item $Q$ is a finite set of states
-\item $q_0$ is a start state
-\item $F$ are some accepting states with $F \subseteq Q$, and
+\item $\varSigma$ is an alphabet,  
+\item $Qs$ is a finite set of states
+\item $Q_{0s}$ is a set of start states ($Q_{0s} \subseteq Qs$)
+\item $F$ are some accepting states with $F \subseteq Qs$, and
 \item $\rho$ is a transition relation.
 \end{itemize}
 
 \noindent
-Two typical examples of NFAs are
+A typical example of a NFA is
+
+% A NFA for (ab* + b)*a
+\begin{center}
+\begin{tikzpicture}[>=stealth',very thick, auto,
+    every state/.style={minimum size=0pt,inner sep=3pt,
+      draw=blue!50,very thick,fill=blue!20},scale=2]
+\node[state,initial]  (Q_0)  {$Q_0$};
+\node[state] (Q_1) [right=of Q_0] {$Q_1$};
+\node[state, accepting] (Q_2) [right=of Q_1] {$Q_2$};
+\path[->] (Q_0) edge [loop above] node  {$b$} ();
+\path[<-] (Q_0) edge node [below]  {$b$} (Q_1);
+\path[->] (Q_0) edge [bend left] node [above]  {$a$} (Q_1);
+\path[->] (Q_0) edge [bend right] node [below]  {$a$} (Q_2);
+\path[->] (Q_1) edge [loop above] node  {$a,b$} ();
+\path[->] (Q_1) edge node  [above] {$a$} (Q_2);
+\end{tikzpicture}
+\end{center}
+
+\noindent
+This NFA happens to have only one starting state, but in general there
+could be more than one.  Notice that in state $Q_0$ we might go to
+state $Q_1$ \emph{or} to state $Q_2$ when receiving an $a$. Similarly
+in state $Q_1$ and receiving an $a$, we can stay in state $Q_1$
+\emph{or} go to $Q_2$. This kind of choice is not allowed with
+DFAs. The downside of this choice in NFAs is that when it comes to
+deciding whether a string is accepted by a NFA we potentially have to
+explore all possibilities. I let you think which strings the above NFA
+accepts.
+
+
+There are a number of additional points you should note about
+NFAs. Every DFA is a NFA, but not vice versa. The $\rho$ in NFAs is a
+transition \emph{relation} (DFAs have a transition function). The
+difference between a function and a relation is that a function always
+has a single output, while a relation gives, roughly speaking,
+several outputs. Look again at the NFA above: if you are currently in
+the state $Q_1$ and you read a character $b$, then you can transition
+to either $Q_0$ \emph{or} $Q_2$. Which route, or output, you take is
+not determined.  This non-determinism can be represented by a
+relation.
+
+My implementation of NFAs in Scala is shown in Figure~\ref{nfa}.
+Perhaps interestingly, I do not actually use relations for my NFAs,
+but use transition functions that return sets of states.  DFAs have
+partial transition functions that return a single state; my NFAs
+return a set of states. I let you think about this representation for
+NFA-transitions and how it corresponds to the relations used in the
+mathematical definition of NFAs. An example of a transition function
+in Scala for the NFA shown above is
+
+{\small\begin{lstlisting}[language=Scala]
+val nfa_delta : (State, Char) :=> Set[State] = 
+  { case (Q0, 'a') => Set(Q1, Q2)
+    case (Q0, 'b') => Set(Q0)
+    case (Q1, 'a') => Set(Q1, Q2)
+    case (Q1, 'b') => Set(Q0, Q1) }
+\end{lstlisting}}  
+
+Like in the mathematical definition, \texttt{starts} is in
+NFAs a set of states; \texttt{fins} is again a function from states to
+booleans. The \texttt{next} function calculates the set of next states
+reachable from a single state \texttt{q} by a character~\texttt{c}. In
+case there is no such state---the partial transition function is
+undefined---the empty set is returned (see function
+\texttt{applyOrElse} in Lines 9 and 10). The function \texttt{nexts}
+just lifts this function to sets of states.
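+
+The following is a small sketch of what \texttt{next} and
+\texttt{nexts} essentially do (again only an illustration with its own
+definitions; the handout's actual code is in Figure~\ref{nfa}):
+
+{\small\begin{lstlisting}[language=Scala]
+type :=>[A, B] = PartialFunction[A, B]
+
+// successor states of a single state q for character c; the empty set
+// is returned if the partial transition function is undefined at (q, c)
+def next[A, C](delta: (A, C) :=> Set[A])(q: A, c: C): Set[A] =
+  delta.applyOrElse((q, c), (_: (A, C)) => Set[A]())
+
+// lifted to a set of states: union of the successors of all states in qs
+def nexts[A, C](delta: (A, C) :=> Set[A])(qs: Set[A], c: C): Set[A] =
+  qs.flatMap(q => next(delta)(q, c))
+\end{lstlisting}}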
+ 
+\begin{figure}[p]
+\small
+\lstinputlisting[numbers=left]{../progs/display/nfa.scala}
+\caption{A Scala implementation of NFAs using partial functions.
+  Notice that the function \texttt{accepts} implements the
+  acceptance of a string in a breadth-first search fashion. This can be a costly
+  way of deciding whether a string is accepted or not in applications that need to handle
+  large NFAs and large inputs.\label{nfa}}
+\end{figure}
+
+Look very carefully at the \texttt{accepts} and \texttt{deltas}
+functions in NFAs and remember that when accepting a string by a NFA
+we might have to explore all possible transitions (recall which state
+to go to is not unique anymore with NFAs\ldots{}we need to explore
+potentially all next states). The implementation achieves this
+exploration through a \emph{breadth-first search}. This is fine for
+small NFAs, but can lead to real memory problems when the NFAs are
+bigger and larger strings need to be processed. As a result, some
+regular expression matching engines resort to a \emph{depth-first
+  search} with \emph{backtracking} in unsuccessful cases. In our
+implementation we can implement a depth-first version of
+\texttt{accepts} using Scala's \texttt{exists}-function as follows:
+
+
+{\small\begin{lstlisting}[language=Scala]
+def search(q: A, s: List[C]) : Boolean = s match {
+  case Nil => fins(q)
+  case c::cs => next(q, c).exists(search(_, cs)) 
+}
+
+def accepts2(s: List[C]) : Boolean = 
+  starts.exists(search(_, s))
+\end{lstlisting}}
+
+\noindent
+This depth-first way of exploration seems to work quite efficiently in
+many examples and is much less of a strain on memory. The problem is
+that the backtracking can get ``catastrophic'' in some
+examples---remember the catastrophic backtracking from earlier
+lectures. This depth-first search with backtracking is the reason for
+the abysmal performance of some regular expression matchers in Java,
+Ruby and Python. I would like to show you this in the next two sections.
+
+
+\subsection*{Epsilon NFAs}
+
+In order to get an idea of what calculations are performed by Java \&
+friends, we need a method for transforming a regular expression into
+an automaton. This automaton should accept exactly those strings that
+are accepted by the regular expression.  The simplest and most
+well-known method for this is called \emph{Thompson Construction},
+after the Turing Award winner Ken Thompson. This method is by
+recursion over regular expressions and depends on the non-determinism
+in NFAs described in the previous section. You will see shortly why
+this construction works well with NFAs, but is not so straightforward
+with DFAs.
+
+Unfortunately we are still one step away from our intended target
+though---because this construction uses a version of NFAs that allows
+``silent transitions''. The idea behind silent transitions is that
+they allow us to go from one state to the next without having to
+consume a character. We label such silent transitions with the letter
+$\epsilon$ and call the automata $\epsilon$NFAs. Two typical examples
+of $\epsilon$NFAs are:
+
+
 \begin{center}
 \begin{tabular}[t]{c@{\hspace{9mm}}c}
-\begin{tikzpicture}[scale=0.7,>=stealth',very thick,
-                             every state/.style={minimum size=0pt,draw=blue!50,very thick,fill=blue!20},]
-\node[state,initial]  (q_0)  {$q_0$};
-\node[state] (q_1) [above=of q_0] {$q_1$};
-\node[state, accepting] (q_2) [below=of q_0] {$q_2$};
-\path[->] (q_0) edge node [left]  {$\epsilon$} (q_1);
-\path[->] (q_0) edge node [left]  {$\epsilon$} (q_2);
-\path[->] (q_0) edge [loop right] node  {$a$} ();
-\path[->] (q_1) edge [loop above] node  {$a$} ();
-\path[->] (q_2) edge [loop below] node  {$b$} ();
+\begin{tikzpicture}[>=stealth',very thick,
+    every state/.style={minimum size=0pt,draw=blue!50,very thick,fill=blue!20},]
+\node[state,initial]  (Q_0)  {$Q_0$};
+\node[state] (Q_1) [above=of Q_0] {$Q_1$};
+\node[state, accepting] (Q_2) [below=of Q_0] {$Q_2$};
+\path[->] (Q_0) edge node [left]  {$\epsilon$} (Q_1);
+\path[->] (Q_0) edge node [left]  {$\epsilon$} (Q_2);
+\path[->] (Q_0) edge [loop right] node  {$a$} ();
+\path[->] (Q_1) edge [loop right] node  {$a$} ();
+\path[->] (Q_2) edge [loop right] node  {$b$} ();
 \end{tikzpicture} &
 
 \raisebox{20mm}{
-\begin{tikzpicture}[scale=0.7,>=stealth',very thick,
-                             every state/.style={minimum size=0pt,draw=blue!50,very thick,fill=blue!20},]
-\node[state,initial]  (r_1)  {$r_1$};
-\node[state] (r_2) [above=of r_1] {$r_2$};
-\node[state, accepting] (r_3) [right=of r_1] {$r_3$};
+\begin{tikzpicture}[>=stealth',very thick,
+    every state/.style={minimum size=0pt,draw=blue!50,very thick,fill=blue!20},]
+\node[state,initial]  (r_1)  {$R_1$};
+\node[state] (r_2) [above=of r_1] {$R_2$};
+\node[state, accepting] (r_3) [right=of r_1] {$R_3$};
 \path[->] (r_1) edge node [below]  {$b$} (r_3);
 \path[->] (r_2) edge [bend left] node [above]  {$a$} (r_3);
 \path[->] (r_1) edge [bend left] node  [left] {$\epsilon$} (r_2);
@@ -161,121 +381,309 @@
 \end{tabular}
 \end{center}
 
-\noindent There are, however, a number of points you should
-note. Every DFA is a NFA, but not vice versa. The $\rho$ in
-NFAs is a transition \emph{relation} (DFAs have a transition
-function). The difference between a function and a relation is
-that a function has always a single output, while a relation
-gives, roughly speaking, several outputs. Look at the NFA on
-the right-hand side above: if you are currently in the state
-$r_2$ and you read a character $a$, then you can transition to
-either $r_1$ \emph{or} $r_3$. Which route you take is not
-determined. This means if we need to decide whether a string
-is accepted by a NFA, we might have to explore all
-possibilities. Also there is the special silent transition in
-NFAs. As mentioned already this transition means you do not
-have to ``consume'' any part of the input string, but
-``silently'' change to a different state. In the left picture,
-for example, if you are in the starting state, you can 
-silently move either to $q_1$ or $q_2$. This silent
-transition is also often called \emph{$\epsilon$-transition}.
+\noindent
+Consider the $\epsilon$NFA on the left-hand side: the
+$\epsilon$-transitions mean you do not have to ``consume'' any part of
+the input string, but ``silently'' change to a different state. In
+this example, if you are in the starting state $Q_0$, you can silently
+move either to $Q_1$ or $Q_2$. You can see that once you are in $Q_1$,
+respectively $Q_2$, you cannot ``go back'' to the other states. So it
+seems allowing $\epsilon$-transitions is a rather substantial
+extension to NFAs. On first appearances, $\epsilon$-transitions might
+even look rather strange, or even silly. To start with, silent
+transitions make the decision whether a string is accepted by an
+automaton even harder: with $\epsilon$NFAs we have to look whether we
+can first do some $\epsilon$-transitions and then do a
+``proper''-transition; and after any ``proper''-transition we again
+have to check whether we can do some more silent transitions. Even
+worse, if there is a silent transition pointing back to the same
+state, then we have to be careful our decision procedure for strings
+does not loop (remember the depth-first search for exploring all
+states).
+
+The obvious question is: Do we get anything in return for this hassle
+with silent transitions?  Well, we still have to work for it\ldots
+unfortunately.  If we were to follow the many textbooks on the
+subject, we would now start with defining what $\epsilon$NFAs
+are---that would require extending the transition relation of
+NFAs. Next, we would show that the $\epsilon$NFAs are equivalent to
+NFAs and so on. Once we have done all this on paper, we would need to
+implement $\epsilon$NFAs. Let's try to take a shortcut instead. We are
+not really interested in $\epsilon$NFAs; they are only a convenient
+tool for translating regular expressions into automata. So we are not
+going to implement them explicitly, but translate them immediately
+into NFAs (in a sense $\epsilon$NFAs are just a convenient API for
+lazy people ;o).  How does the translation work? Well, we have to find
+all transitions of the form
+
+\[
+q\stackrel{\epsilon}{\longrightarrow}\ldots\stackrel{\epsilon}{\longrightarrow}
+\;\stackrel{a}{\longrightarrow}\;
+\stackrel{\epsilon}{\longrightarrow}\ldots\stackrel{\epsilon}{\longrightarrow} q'
+\]
+
+\noindent where somewhere in the ``middle'' is an $a$-transition. We
+replace them with $q \stackrel{a}{\longrightarrow} q'$. Doing this to
+the $\epsilon$NFA on the right-hand side above gives the NFA
+
+\begin{center}
+\begin{tikzpicture}[>=stealth',very thick,
+    every state/.style={minimum size=0pt,draw=blue!50,very thick,fill=blue!20},]
+\node[state,initial]  (r_1)  {$R_1$};
+\node[state] (r_2) [above=of r_1] {$R_2$};
+\node[state, accepting] (r_3) [right=of r_1] {$R_3$};
+\path[->] (r_1) edge node [above]  {$b$} (r_3);
+\path[->] (r_2) edge [bend left] node [above]  {$a$} (r_3);
+\path[->] (r_1) edge [bend left] node  [left] {$a$} (r_2);
+\path[->] (r_2) edge [bend left] node  [right] {$a$} (r_1);
+\path[->] (r_1) edge [loop below] node  {$a$} ();
+\path[->] (r_1) edge [bend right] node [below]  {$a$} (r_3);
+\end{tikzpicture}
+\end{center}
+
+\noindent where the single $\epsilon$-transition is replaced by
+three additional $a$-transitions. Please do the calculations yourself
+and verify that I did not forget any transition.
+
+So in what follows, whenever we are given an $\epsilon$NFA we will
+replace it by an equivalent NFA. The Scala code for this translation
+is given in Figure~\ref{enfa}. The main workhorse in this code is a
+function that calculates a fixpoint of a function (Lines 5--10). This
+function is used for ``discovering'' which states are reachable by
+$\epsilon$-transitions. Once no new state is discovered, a fixpoint is
+reached. This is used for example when calculating the starting states
+of an equivalent NFA (see Line 36): we start with all starting states
+of the $\epsilon$NFA and then look for all additional states that can
+be reached by $\epsilon$-transitions. We keep on doing this until no
+new state can be reached. This is what the $\epsilon$-closure, named
+in the code \texttt{ecl}, calculates. Similarly, a state of the NFA is
+an accepting state when from it we can reach an accepting state of the
+$\epsilon$NFA by $\epsilon$-transitions.
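+
+To give an idea of this calculation, here is a sketch of a fixpoint
+function and of an $\epsilon$-closure built on top of it. The names
+\texttt{fixpT} and \texttt{ecl} are taken from Figure~\ref{enfa}, but
+the definitions below are simplified illustrations, not the actual
+code (for example, the real \texttt{fixpT} is carefully made
+tail-recursive).
+
+{\small\begin{lstlisting}[language=Scala]
+type :=>[A, B] = PartialFunction[A, B]
+
+// keep applying f until no new states are discovered
+def fixpT[A](f: Set[A] => Set[A], x: Set[A]): Set[A] = {
+  val fx = f(x)
+  if (fx == x) x else fixpT(f, fx)
+}
+
+// one step: add all states reachable by a single silent transition
+// (None encodes an epsilon-transition in the Option[C] input)
+def eclStep[A, C](delta: (A, Option[C]) :=> Set[A])(qs: Set[A]): Set[A] =
+  qs | qs.flatMap(q =>
+    delta.applyOrElse((q, None), (_: (A, Option[C])) => Set[A]()))
+
+// epsilon-closure: all states reachable by zero or more silent transitions
+def ecl[A, C](delta: (A, Option[C]) :=> Set[A])(qs: Set[A]): Set[A] =
+  fixpT((ss: Set[A]) => eclStep(delta)(ss), qs)
+\end{lstlisting}}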
 
 
-\subsubsection*{Thompson Construction}
+\begin{figure}[p]
+\small
+\lstinputlisting[numbers=left]{../progs/display/enfa.scala}
+
+\caption{A Scala function that translates $\epsilon$NFA into NFAs. The
+  transition function of $\epsilon$NFAs takes as input an \texttt{Option[C]}.
+  \texttt{None} stands for an $\epsilon$-transition; \texttt{Some(c)}
+  for a ``proper'' transition consuming a character. The functions in
+  Lines 18--26 calculate
+  all states reachable by one or more $\epsilon$-transitions for a given
+  set of states. The NFA is constructed in Lines 36--38.
+  Note the interesting commands in Lines 5 and 6: their purpose is
+  to ensure that \texttt{fixpT} is the tail-recursive version of
+  the fixpoint construction; otherwise we would quickly get a
+  stack-overflow exception, even on small examples, due to limitations
+  of the JVM.
+  \label{enfa}}
+\end{figure}
+
+Also look carefully at how the transitions of $\epsilon$NFAs are
+implemented. The additional possibility of performing silent
+transitions is encoded by using \texttt{Option[C]} as the type for the
+``input''. The \texttt{Some}s stand for ``proper'' transitions where
+a character is consumed; \texttt{None} stands for
+$\epsilon$-transitions. The transition functions for the two
+$\epsilon$NFAs from the beginning of this section can be defined as
 
-The reason for introducing NFAs is that there is a relatively
-simple (recursive) translation of regular expressions into
-NFAs. Consider the simple regular expressions $\ZERO$,
-$\ONE$ and $c$. They can be translated as follows:
+{\small\begin{lstlisting}[language=Scala]
+val enfa_trans1 : (State, Option[Char]) :=> Set[State] = 
+  { case (Q0, Some('a')) => Set(Q0)
+    case (Q0, None) => Set(Q1, Q2)
+    case (Q1, Some('a')) => Set(Q1)
+    case (Q2, Some('b')) => Set(Q2) }
+
+val enfa_trans2 : (State, Option[Char]) :=> Set[State] = 
+  { case (R1, Some('b')) => Set(R3)
+    case (R1, None) => Set(R2)
+    case (R2, Some('a')) => Set(R1, R3) }
+\end{lstlisting}}
 
-\begin{center}
+\noindent
+I hope you agree now with my earlier statement that the $\epsilon$NFAs
+are just an API for NFAs.
+
+\subsection*{Thompson Construction}
+
+Having the translation of $\epsilon$NFAs to NFAs in place, we can
+finally return to the problem of translating regular expressions into
+equivalent NFAs. Recall that by equivalent we mean that the NFAs
+recognise the same language. Consider the simple regular expressions
+$\ZERO$, $\ONE$ and $c$. They can be translated into equivalent NFAs
+as follows:
+
+\begin{equation}\mbox{
 \begin{tabular}[t]{l@{\hspace{10mm}}l}
 \raisebox{1mm}{$\ZERO$} & 
 \begin{tikzpicture}[scale=0.7,>=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
-\node[state, initial]  (q_0)  {$\mbox{}$};
+\node[state, initial]  (Q_0)  {$\mbox{}$};
 \end{tikzpicture}\\\\
 \raisebox{1mm}{$\ONE$} & 
 \begin{tikzpicture}[scale=0.7,>=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
-\node[state, initial, accepting]  (q_0)  {$\mbox{}$};
+\node[state, initial, accepting]  (Q_0)  {$\mbox{}$};
 \end{tikzpicture}\\\\
-\raisebox{2mm}{$c$} & 
+\raisebox{3mm}{$c$} & 
 \begin{tikzpicture}[scale=0.7,>=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
-\node[state, initial]  (q_0)  {$\mbox{}$};
-\node[state, accepting]  (q_1)  [right=of q_0] {$\mbox{}$};
-\path[->] (q_0) edge node [below]  {$c$} (q_1);
-\end{tikzpicture}\\\\
-\end{tabular}
-\end{center}
+\node[state, initial]  (Q_0)  {$\mbox{}$};
+\node[state, accepting]  (Q_1)  [right=of Q_0] {$\mbox{}$};
+\path[->] (Q_0) edge node [below]  {$c$} (Q_1);
+\end{tikzpicture}\\
+\end{tabular}}\label{simplecases}
+\end{equation}
+
+\noindent
+I let you think whether the NFAs can match exactly those strings the
+regular expressions can match. To do this translation in code we need
+a way to construct states ``programmatically''...and as an additional
+constraint Scala needs to recognise that these states are distinct.
+For this I implemented in Figure~\ref{thompson1} a class
+\texttt{TState} that includes a counter and a companion object that
+increases this counter whenever a new state is created.\footnote{You might
+  have to read up what \emph{companion objects} do in Scala.}
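+
+A sketch of the idea (the actual class is in Figure~\ref{thompson1}
+and may differ in details): every call of \texttt{TState()} consults
+and increments the counter, so each generated state carries a fresh
+number and is distinct from all previously generated states.
+
+{\small\begin{lstlisting}[language=Scala]
+case class TState(i: Int)
+
+object TState {
+  var counter = 0                  // counter shared by all created states
+  def apply(): TState = {
+    counter += 1                   // bump the counter ...
+    new TState(counter - 1)        // ... and return a state with a fresh index
+  }
+}
+\end{lstlisting}}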
 
-\noindent The case for the sequence regular expression $r_1
-\cdot r_2$ is as follows: We are given by recursion two
-automata representing $r_1$ and $r_2$ respectively. 
+\begin{figure}[p]
+\small
+\lstinputlisting[numbers=left]{../progs/display/thompson1.scala}
+\caption{The first part of the Thompson Construction. Lines 7--16
+  implement a way to create new states that are all
+  distinct by virtue of a counter. This counter is
+  increased in the companion object of \texttt{TState}
+  whenever a new state is created. The code in Lines 24--40
+  constructs NFAs for the simple regular expressions $\ZERO$, $\ONE$ and $c$.
+  Compare this code with the pictures given in \eqref{simplecases} on
+  Page~\pageref{simplecases}.
+  \label{thompson1}}
+\end{figure}
+
+\begin{figure}[p]
+\small
+\lstinputlisting[numbers=left]{../progs/display/thompson2.scala}
+\caption{The second part of the Thompson Construction implementing
+  the composition of NFAs according to $\cdot$, $+$ and ${}^*$.
+  The implicit class about rich partial functions
+  implements the infix operation \texttt{+++} which
+  combines an $\epsilon$NFA transition with a NFA transition
+  (both are given as partial functions---but with different types!).\label{thompson2}}
+\end{figure}
+
+The case for the sequence regular expression $r_1 \cdot r_2$ is a bit more
+complicated: Say, we are given by recursion two NFAs representing the regular
+expressions $r_1$ and $r_2$ respectively.
 
 \begin{center}
 \begin{tikzpicture}[node distance=3mm,
-                             >=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
-\node[state, initial]  (q_0)  {$\mbox{}$};
-\node (r_1)  [right=of q_0] {$\ldots$};
-\node[state, accepting]  (t_1)  [right=of r_1] {$\mbox{}$};
-\node[state, accepting]  (t_2)  [above=of t_1] {$\mbox{}$};
-\node[state, accepting]  (t_3)  [below=of t_1] {$\mbox{}$};
-\node[state, initial]  (a_0)  [right=2.5cm of t_1] {$\mbox{}$};
-\node (b_1)  [right=of a_0] {$\ldots$};
+    >=stealth',very thick,
+    every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
+\node[state, initial]  (Q_0)  {$\mbox{}$};
+\node[state, initial]  (Q_01) [below=1mm of Q_0] {$\mbox{}$};  
+\node[state, initial]  (Q_02) [above=1mm of Q_0] {$\mbox{}$};  
+\node (R_1)  [right=of Q_0] {$\ldots$};
+\node[state, accepting]  (T_1)  [right=of R_1] {$\mbox{}$};
+\node[state, accepting]  (T_2)  [above=of T_1] {$\mbox{}$};
+\node[state, accepting]  (T_3)  [below=of T_1] {$\mbox{}$};
+
+\node (A_0)  [right=2.5cm of T_1] {$\mbox{}$};
+\node[state, initial]  (A_01)  [above=1mm of A_0] {$\mbox{}$};
+\node[state, initial]  (A_02)  [below=1mm of A_0] {$\mbox{}$};
+
+\node (b_1)  [right=of A_0] {$\ldots$};
 \node[state, accepting]  (c_1)  [right=of b_1] {$\mbox{}$};
 \node[state, accepting]  (c_2)  [above=of c_1] {$\mbox{}$};
 \node[state, accepting]  (c_3)  [below=of c_1] {$\mbox{}$};
 \begin{pgfonlayer}{background}
-\node (1) [rounded corners, inner sep=1mm, thick, draw=black!60, fill=black!20, fit= (q_0) (r_1) (t_1) (t_2) (t_3)] {};
-\node (2) [rounded corners, inner sep=1mm, thick, draw=black!60, fill=black!20, fit= (a_0) (b_1) (c_1) (c_2) (c_3)] {};
+  \node (1) [rounded corners, inner sep=1mm, thick,
+    draw=black!60, fill=black!20, fit= (Q_0) (R_1) (T_1) (T_2) (T_3)] {};
+  \node (2) [rounded corners, inner sep=1mm, thick,
+    draw=black!60, fill=black!20, fit= (A_0) (b_1) (c_1) (c_2) (c_3)] {};
 \node [yshift=2mm] at (1.north) {$r_1$};
 \node [yshift=2mm] at (2.north) {$r_2$};
 \end{pgfonlayer}
 \end{tikzpicture}
 \end{center}
 
-\noindent The first automaton has some accepting states. We
-obtain an automaton for $r_1\cdot r_2$ by connecting these
-accepting states with $\epsilon$-transitions to the starting
-state of the second automaton. By doing so we make them
-non-accepting like so:
+\noindent The first NFA has some accepting states and the second some
+starting states. We obtain an $\epsilon$NFA for $r_1\cdot r_2$ by
+connecting the accepting states of the first NFA with
+$\epsilon$-transitions to the starting states of the second
+automaton. By doing so we make the accepting states of the first NFA
+non-accepting, like so:
 
 \begin{center}
 \begin{tikzpicture}[node distance=3mm,
-                             >=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
-\node[state, initial]  (q_0)  {$\mbox{}$};
-\node (r_1)  [right=of q_0] {$\ldots$};
+    >=stealth',very thick,
+    every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
+\node[state, initial]  (Q_0)  {$\mbox{}$};
+\node[state, initial]  (Q_01) [below=1mm of Q_0] {$\mbox{}$};  
+\node[state, initial]  (Q_02) [above=1mm of Q_0] {$\mbox{}$};  
+\node (r_1)  [right=of Q_0] {$\ldots$};
 \node[state]  (t_1)  [right=of r_1] {$\mbox{}$};
 \node[state]  (t_2)  [above=of t_1] {$\mbox{}$};
 \node[state]  (t_3)  [below=of t_1] {$\mbox{}$};
-\node[state]  (a_0)  [right=2.5cm of t_1] {$\mbox{}$};
-\node (b_1)  [right=of a_0] {$\ldots$};
+
+\node  (A_0)  [right=2.5cm of t_1] {$\mbox{}$};
+\node[state]  (A_01)  [above=1mm of A_0] {$\mbox{}$};
+\node[state]  (A_02)  [below=1mm of A_0] {$\mbox{}$};
+
+\node (b_1)  [right=of A_0] {$\ldots$};
 \node[state, accepting]  (c_1)  [right=of b_1] {$\mbox{}$};
 \node[state, accepting]  (c_2)  [above=of c_1] {$\mbox{}$};
 \node[state, accepting]  (c_3)  [below=of c_1] {$\mbox{}$};
-\path[->] (t_1) edge node [above, pos=0.3]  {$\epsilon$} (a_0);
-\path[->] (t_2) edge node [above]  {$\epsilon$} (a_0);
-\path[->] (t_3) edge node [below]  {$\epsilon$} (a_0);
-
+\path[->] (t_1) edge (A_01);
+\path[->] (t_2) edge node [above]  {$\epsilon$s} (A_01);
+\path[->] (t_3) edge (A_01);
+\path[->] (t_1) edge (A_02);
+\path[->] (t_2) edge (A_02);
+\path[->] (t_3) edge node [below]  {$\epsilon$s} (A_02);
+ 
 \begin{pgfonlayer}{background}
-\node (3) [rounded corners, inner sep=1mm, thick, draw=black!60, fill=black!20, fit= (q_0) (c_1) (c_2) (c_3)] {};
+  \node (3) [rounded corners, inner sep=1mm, thick,
+    draw=black!60, fill=black!20, fit= (Q_0) (c_1) (c_2) (c_3)] {};
 \node [yshift=2mm] at (3.north) {$r_1\cdot r_2$};
 \end{pgfonlayer}
 \end{tikzpicture}
 \end{center}
 
-\noindent The case for the choice regular expression $r_1 +
-r_2$ is slightly different: We are given by recursion two
-automata representing $r_1$ and $r_2$ respectively. 
+\noindent The idea behind this construction is that the start of any
+string is first recognised by the first NFA, then we silently change
+to the second NFA; the ending of the string is recognised by the
+second NFA...just like matching of a string by the regular expression
+$r_1\cdot r_2$. The Scala code for this construction is given in
+Figure~\ref{thompson2} in Lines 16--23. The starting states of the
+$\epsilon$NFA are the starting states of the first NFA (corresponding
+to $r_1$); the accepting function is the accepting function of the
+second NFA (corresponding to $r_2$). The new transition function is
+all the ``old'' transitions plus the $\epsilon$-transitions connecting
+the accepting states of the first NFA to the starting states of the
+second NFA (Lines 18 and 19). The $\epsilon$NFA is then immediately
+translated into a NFA.
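+
+As a rough illustration of the transition part of this construction,
+here is a toy sketch with its own simplified \texttt{NFA} record (not
+the handout's classes): the combined $\epsilon$NFA-transition consists
+of all ``old'' transitions, now consuming a \texttt{Some}-character,
+plus the silent \texttt{None}-transitions from the accepting states of
+the first NFA to the starting states of the second.
+
+{\small\begin{lstlisting}[language=Scala]
+type :=>[A, B] = PartialFunction[A, B]
+
+// toy record for NFAs: starting states, transitions, accepting predicate
+case class NFA[A, C](starts: Set[A], delta: (A, C) :=> Set[A], fins: A => Boolean)
+
+def seqTrans[A, C](n1: NFA[A, C], n2: NFA[A, C]): (A, Option[C]) :=> Set[A] = {
+  // "proper" transitions of either NFA, lifted to Option[C]
+  case (q, Some(c)) if n1.delta.isDefinedAt((q, c)) || n2.delta.isDefinedAt((q, c)) =>
+    n1.delta.applyOrElse((q, c), (_: (A, C)) => Set[A]()) |
+      n2.delta.applyOrElse((q, c), (_: (A, C)) => Set[A]())
+  // silent transitions: from accepting states of n1 to the starts of n2
+  case (q, None) if n1.fins(q) => n2.starts
+}
+\end{lstlisting}}
+
+\noindent In the handout's code the partiality and the different types
+of the two transition functions are handled by the \texttt{+++}
+operator from Figure~\ref{thompson2}; the guards above are a crude
+stand-in for that in this simplified sketch.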
+
+
+The case for the alternative regular expression $r_1 + r_2$ is
+slightly different: We are given by recursion two NFAs representing
+$r_1$ and $r_2$ respectively. Each NFA has some starting states and
+some accepting states. We obtain a NFA for the regular expression $r_1
++ r_2$ by composing the transition functions (this crucially depends
+on knowing that the states of each component NFA are distinct---recall
+we implemented for this to hold some bespoke code for states). We also
+need to combine the starting states and accepting functions
+appropriately.
 
 \begin{center}
+\begin{tabular}[t]{ccc}
 \begin{tikzpicture}[node distance=3mm,
-                             >=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
+    >=stealth',very thick,
+    every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},
+    baseline=(current bounding box.center)]
 \node at (0,0)  (1)  {$\mbox{}$};
-\node[state, initial]  (2)  [above right=16mm of 1] {$\mbox{}$};
-\node[state, initial]  (3)  [below right=16mm of 1] {$\mbox{}$};
+\node  (2)  [above=10mm of 1] {};
+\node[state, initial]  (4)  [above=1mm of 2] {$\mbox{}$};
+\node[state, initial]  (5)  [below=1mm of 2] {$\mbox{}$};
+\node[state, initial]  (3)  [below=10mm of 1] {$\mbox{}$};
 
-\node (a)  [right=of 2] {$\ldots$};
-\node[state, accepting]  (a1)  [right=of a] {$\mbox{}$};
+\node (a)  [right=of 2] {$\ldots\,$};
+\node  (a1)  [right=of a] {$$};
 \node[state, accepting]  (a2)  [above=of a1] {$\mbox{}$};
 \node[state, accepting]  (a3)  [below=of a1] {$\mbox{}$};
 
@@ -290,23 +698,21 @@
 \node [yshift=3mm] at (2.north) {$r_2$};
 \end{pgfonlayer}
 \end{tikzpicture}
-\end{center}
-
-\noindent Each automaton has a single start state and
-potentially several accepting states. We obtain a NFA for the
-regular expression $r_1 + r_2$ by introducing a new starting
-state and connecting it with an $\epsilon$-transition to the
-two starting states above, like so
+&
+\mbox{}\qquad\tikz{\draw[>=stealth,line width=2mm,->] (0,0) -- (1, 0)}\quad\mbox{}
+&
+\begin{tikzpicture}[node distance=3mm,
+    >=stealth',very thick,
+    every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},
+    baseline=(current bounding box.center)]
+\node at (0,0) (1)  {$\mbox{}$};
+\node (2)  [above=10mm of 1] {$$};
+\node[state, initial]  (4)  [above=1mm of 2] {$\mbox{}$};
+\node[state, initial]  (5)  [below=1mm of 2] {$\mbox{}$};
+\node[state, initial]  (3)  [below=10mm of 1] {$\mbox{}$};
 
-\begin{center}
-\hspace{2cm}\begin{tikzpicture}[node distance=3mm,
-                             >=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
-\node at (0,0) [state, initial]  (1)  {$\mbox{}$};
-\node[state]  (2)  [above right=16mm of 1] {$\mbox{}$};
-\node[state]  (3)  [below right=16mm of 1] {$\mbox{}$};
-
-\node (a)  [right=of 2] {$\ldots$};
-\node[state, accepting]  (a1)  [right=of a] {$\mbox{}$};
+\node (a)  [right=of 2] {$\ldots\,$};
+\node (a1)  [right=of a] {$$};
 \node[state, accepting]  (a2)  [above=of a1] {$\mbox{}$};
 \node[state, accepting]  (a3)  [below=of a1] {$\mbox{}$};
 
@@ -315,24 +721,34 @@
 \node[state, accepting]  (b2)  [above=of b1] {$\mbox{}$};
 \node[state, accepting]  (b3)  [below=of b1] {$\mbox{}$};
 
-\path[->] (1) edge node [above]  {$\epsilon$} (2);
-\path[->] (1) edge node [below]  {$\epsilon$} (3);
+%\path[->] (1) edge node [above]  {$\epsilon$} (2);
+%\path[->] (1) edge node [below]  {$\epsilon$} (3);
 
 \begin{pgfonlayer}{background}
 \node (3) [rounded corners, inner sep=1mm, thick, draw=black!60, fill=black!20, fit= (1) (a2) (a3) (b2) (b3)] {};
 \node [yshift=3mm] at (3.north) {$r_1+ r_2$};
 \end{pgfonlayer}
 \end{tikzpicture}
+\end{tabular}
 \end{center}
 
-\noindent 
-Finally for the $*$-case we have an automaton for $r$
+\noindent The code for this construction is in Figure~\ref{thompson2}
+in Lines 25--33.
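+
+Using the same toy \texttt{NFA} record as in the sketch above
+(repeated here so the snippet is self-contained), the composition for
+the alternative case can be pictured roughly as follows; this is only
+an illustration, the handout's actual code is in the figure:
+
+{\small\begin{lstlisting}[language=Scala]
+type :=>[A, B] = PartialFunction[A, B]
+case class NFA[A, C](starts: Set[A], delta: (A, C) :=> Set[A], fins: A => Boolean)
+
+// assumes the states of n1 and n2 are distinct (guaranteed by TState)
+def nfaAlt[A, C](n1: NFA[A, C], n2: NFA[A, C]): NFA[A, C] =
+  NFA(n1.starts | n2.starts,               // all starting states of both NFAs
+      n1.delta orElse n2.delta,            // transitions composed side-by-side
+      (q: A) => n1.fins(q) || n2.fins(q))  // accepting if accepting in either NFA
+\end{lstlisting}}
+
+\noindent Because the two state sets are distinct, \texttt{orElse}
+really is the union of the two transition functions and no transitions
+are shadowed.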
+
+Finally for the $*$-case we have a NFA for $r$ and connect its
+accepting states to a new starting state via
+$\epsilon$-transitions. This new starting state is also an accepting
+state, because $r^*$ can recognise the empty string.
 
 \begin{center}
+\begin{tabular}[b]{@{}ccc@{}}  
 \begin{tikzpicture}[node distance=3mm,
-                             >=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
-\node at (0,0)  (1)  {$\mbox{}$};
-\node[state, initial]  (2)  [right=16mm of 1] {$\mbox{}$};
+    >=stealth',very thick,
+    every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},
+    baseline=(current bounding box.north)]
+\node (2)  {$\mbox{}$};
+\node[state, initial]  (4)  [above=1mm of 2] {$\mbox{}$};
+\node[state, initial]  (5)  [below=1mm of 2] {$\mbox{}$};  
 \node (a)  [right=of 2] {$\ldots$};
 \node[state, accepting]  (a1)  [right=of a] {$\mbox{}$};
 \node[state, accepting]  (a2)  [above=of a1] {$\mbox{}$};
@@ -342,341 +758,360 @@
 \node [yshift=3mm] at (1.north) {$r$};
 \end{pgfonlayer}
 \end{tikzpicture}
-\end{center}
-
-\noindent and connect its accepting states to a new starting
-state via $\epsilon$-transitions. This new starting state is
-also an accepting state, because $r^*$ can recognise the
-empty string. This gives the following automaton for $r^*$:
-
-\begin{center}
+&
+\raisebox{-16mm}{\;\tikz{\draw[>=stealth,line width=2mm,->] (0,0) -- (1, 0)}}
+&
 \begin{tikzpicture}[node distance=3mm,
-                             >=stealth',very thick, every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},]
+    >=stealth',very thick,
+    every state/.style={minimum size=3pt,draw=blue!50,very thick,fill=blue!20},
+    baseline=(current bounding box.north)]
 \node at (0,0) [state, initial,accepting]  (1)  {$\mbox{}$};
-\node[state]  (2)  [right=16mm of 1] {$\mbox{}$};
+\node (2)  [right=16mm of 1] {$\mbox{}$};
+\node[state]  (4)  [above=1mm of 2] {$\mbox{}$};
+\node[state]  (5)  [below=1mm of 2] {$\mbox{}$};  
 \node (a)  [right=of 2] {$\ldots$};
 \node[state]  (a1)  [right=of a] {$\mbox{}$};
 \node[state]  (a2)  [above=of a1] {$\mbox{}$};
 \node[state]  (a3)  [below=of a1] {$\mbox{}$};
-\path[->] (1) edge node [above]  {$\epsilon$} (2);
-\path[->] (a1) edge [bend left=45] node [above]  {$\epsilon$} (1);
+\path[->] (1) edge node [below]  {$\epsilon$} (4);
+\path[->] (1) edge node [below]  {$\epsilon$} (5);
+\path[->] (a1) edge [bend left=45] node [below]  {$\epsilon$} (1);
 \path[->] (a2) edge [bend right] node [below]  {$\epsilon$} (1);
 \path[->] (a3) edge [bend left=45] node [below]  {$\epsilon$} (1);
 \begin{pgfonlayer}{background}
 \node (2) [rounded corners, inner sep=1mm, thick, draw=black!60, fill=black!20, fit= (1) (a2) (a3)] {};
 \node [yshift=3mm] at (2.north) {$r^*$};
 \end{pgfonlayer}
-\end{tikzpicture}
+\end{tikzpicture}    
+\end{tabular}
 \end{center}
 
-\noindent This construction of a NFA from a regular expression
-was invented by Ken Thompson in 1968.
+\noindent 
+The corresponding code is in Figure~\ref{thompson2} in Lines 35--43.
+
+To sum up, you can see in the sequence and star cases the need for
+having silent $\epsilon$-transitions. Similarly the alternative case
+shows the need for the NFA-nondeterminism. It seems awkward to form the
+`alternative' composition of two DFAs, because DFAs do not allow
+several starting and successor states. All these constructions can now
+be put together in the following recursive function:
+
+
+{\small\begin{lstlisting}[language=Scala]
+def thompson(r: Rexp) : NFAt = r match {
+  case ZERO => NFA_ZERO()
+  case ONE => NFA_ONE()
+  case CHAR(c) => NFA_CHAR(c)  
+  case ALT(r1, r2) => NFA_ALT(thompson(r1), thompson(r2))
+  case SEQ(r1, r2) => NFA_SEQ(thompson(r1), thompson(r2))
+  case STAR(r1) => NFA_STAR(thompson(r1))
+}
+\end{lstlisting}}
+
+\noindent
+It calculates a NFA from a regular expression. At last we can run
+NFAs for our evil regular expression examples.  The graph on the
+left shows that when translating a regular expression such as
+$a^{\{n\}}$ into a NFA, the size can blow up and then even the
+relatively fast (on small examples) breadth-first search can be
+slow. The graph on the right shows that with `evil' regular
+expressions the depth-first search can be abysmally slow. Even if the
+graphs do not completely overlap with the curves of Python, Ruby and
+Java, they are similar enough. OK\ldots now you know why regular
+expression matchers in those languages are so slow. 
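+
+If you want to try this out yourself, the evil expression from the
+right-hand graph below can be built and run along the following lines
+(a usage sketch, assuming the \texttt{Rexp} constructors matched in
+\texttt{thompson} and the \texttt{accepts} function of NFAs from
+Figure~\ref{nfa}):
+
+{\small\begin{lstlisting}[language=Scala]
+// (a*)* . b  -- the "evil" regular expression from the graph below
+val evil = SEQ(STAR(STAR(CHAR('a'))), CHAR('b'))
+
+val nfa = thompson(evil)
+
+nfa.accepts("aaaab".toList)   // true: the string ends in a b
+nfa.accepts("aaaaa".toList)   // false: the string contains no b
+\end{lstlisting}}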
 
 
-\subsubsection*{Subset Construction}
+\begin{center}
+\begin{tabular}{@{\hspace{-1mm}}c@{\hspace{1mm}}c@{}}  
+\begin{tikzpicture}
+\begin{axis}[
+    title={Graph: $a^{?\{n\}} \cdot a^{\{n\}}$ and strings 
+           $\underbrace{\texttt{a}\ldots \texttt{a}}_{n}$},
+    title style={yshift=-2ex},
+    xlabel={$n$},
+    x label style={at={(1.05,0.0)}},
+    ylabel={time in secs},
+    enlargelimits=false,
+    xtick={0,5,...,30},
+    xmax=33,
+    ymax=35,
+    ytick={0,5,...,30},
+    scaled ticks=false,
+    axis lines=left,
+    width=5.5cm,
+    height=4cm, 
+    legend entries={Python,Ruby, breadth-first NFA},
+    legend style={at={(0.5,-0.25)},anchor=north,font=\small},
+    legend cell align=left]
+  \addplot[blue,mark=*, mark options={fill=white}] table {re-python.data};
+  \addplot[brown,mark=triangle*, mark options={fill=white}] table {re-ruby.data};
+  % breadth-first search in NFAs
+  \addplot[red,mark=*, mark options={fill=white}] table {
+    1 0.00586
+    2 0.01209
+    3 0.03076
+    4 0.08269
+    5 0.12881
+    6 0.25146
+    7 0.51377
+    8 0.89079
+    9 1.62802
+    10 3.05326
+    11 5.92437
+    12 11.67863
+    13 24.00568
+  };
+\end{axis}
+\end{tikzpicture}
+&
+\begin{tikzpicture}
+\begin{axis}[
+    title={Graph: $(a^*)^* \cdot b$ and strings 
+           $\underbrace{\texttt{a}\ldots \texttt{a}}_{n}$},
+    title style={yshift=-2ex},
+    xlabel={$n$},
+    x label style={at={(1.05,0.0)}},
+    ylabel={time in secs},
+    enlargelimits=false,
+    xtick={0,5,...,30},
+    xmax=33,
+    ymax=35,
+    ytick={0,5,...,30},
+    scaled ticks=false,
+    axis lines=left,
+    width=5.5cm,
+    height=4cm, 
+    legend entries={Python, Java, depth-first NFA},
+    legend style={at={(0.5,-0.25)},anchor=north,font=\small},
+    legend cell align=left]
+  \addplot[blue,mark=*, mark options={fill=white}] table {re-python2.data};
+  \addplot[cyan,mark=*, mark options={fill=white}] table {re-java.data};  
+  % depth-first search in NFAs
+  \addplot[red,mark=*, mark options={fill=white}] table {
+    1 0.00605
+    2 0.03086
+    3 0.11994
+    4 0.45389
+    5 2.06192
+    6 8.04894
+    7 32.63549
+  };
+\end{axis}
+\end{tikzpicture}
+\end{tabular}
+\end{center}
 
-What is interesting is that for every NFA we can find a DFA
-which recognises the same language. This can, for example, be
-done by the \emph{subset construction}. Consider again the NFA
-below on the left, consisting of nodes labeled $0$, $1$ and $2$. 
+
+
+\subsection*{Subset Construction}
+
+Of course, some developers of regular expression matchers are aware of
+these problems with sluggish NFAs and try to address them. In this
+section I would like to show you one common technique for alleviating
+the problem. This will also explain why we insisted on polymorphic types in
+our DFA code (remember I used \texttt{A} and \texttt{C} for the types
+of states and the input, see Figure~\ref{dfa} on
+Page~\pageref{dfa}).\bigskip
+
+\noindent
+To start, remember that we did not bother with defining and
+implementing $\epsilon$NFAs: we immediately translated them into
+equivalent NFAs. Equivalent in the sense of accepting the same
+language (though we only claimed this and did not prove it
+rigorously). Remember also that NFAs have non-deterministic
+transitions defined as a relation or implemented as a function returning
+sets of states.  This non-determinism is crucial for the Thompson
+Construction to work (recall the cases for $\cdot$, $+$ and
+${}^*$). But this non-determinism makes it harder with NFAs to decide
+when a string is accepted or not; whereas such a decision is rather
+straightforward with DFAs: recall their transition function is a
+\emph{function} that returns a single state. So with DFAs we do not
+have to search at all.  What is perhaps interesting is the fact that
+for every NFA we can find a DFA that also recognises the same
+language. This might sound a bit paradoxical: NFA $\rightarrow$
+decision of acceptance hard; DFA $\rightarrow$ decision easy. But this
+\emph{is} true\ldots{} of course there is always a caveat---nothing
+ever is for free in life.
+
+There are actually a number of methods for transforming a NFA into
+an equivalent DFA, but the most famous one is the \emph{subset
+  construction}. Consider the following NFA where the states are
+labelled with $0$, $1$ and $2$.
 
 \begin{center}
 \begin{tabular}{c@{\hspace{10mm}}c}
 \begin{tikzpicture}[scale=0.7,>=stealth',very thick,
                     every state/.style={minimum size=0pt,
                     draw=blue!50,very thick,fill=blue!20},
-                    baseline=0mm]
-\node[state,initial]  (q_0)  {$0$};
-\node[state] (q_1) [above=of q_0] {$1$};
-\node[state, accepting] (q_2) [below=of q_0] {$2$};
-\path[->] (q_0) edge node [left]  {$\epsilon$} (q_1);
-\path[->] (q_0) edge node [left]  {$\epsilon$} (q_2);
-\path[->] (q_0) edge [loop right] node  {$a$} ();
-\path[->] (q_1) edge [loop above] node  {$a$} ();
-\path[->] (q_2) edge [loop below] node  {$b$} ();
+                    baseline=(current bounding box.center)]
+\node[state,initial]  (Q_0)  {$0$};
+\node[state] (Q_1) [below=of Q_0] {$1$};
+\node[state, accepting] (Q_2) [below=of Q_1] {$2$};
+
+\path[->] (Q_0) edge node [right]  {$b$} (Q_1);
+\path[->] (Q_1) edge node [right]  {$a,b$} (Q_2);
+\path[->] (Q_0) edge [loop above] node  {$a, b$} ();
 \end{tikzpicture}
 &
-\begin{tabular}{r|cl}
-nodes & $a$ & $b$\\
+\begin{tabular}{r|ll}
+states & $a$ & $b$\\
 \hline
 $\{\}\phantom{\star}$ & $\{\}$ & $\{\}$\\
-$\{0\}\phantom{\star}$       & $\{0,1,2\}$   & $\{2\}$\\
-$\{1\}\phantom{\star}$       & $\{1\}$       & $\{\}$\\
-$\{2\}\star$  & $\{\}$ & $\{2\}$\\
-$\{0,1\}\phantom{\star}$     & $\{0,1,2\}$   & $\{2\}$\\
-$\{0,2\}\star$ & $\{0,1,2\}$   & $\{2\}$\\
-$\{1,2\}\star$ & $\{1\}$       & $\{2\}$\\
-s: $\{0,1,2\}\star$ & $\{0,1,2\}$ & $\{2\}$\\
+start: $\{0\}\phantom{\star}$       & $\{0\}$   & $\{0,1\}$\\
+$\{1\}\phantom{\star}$       & $\{2\}$       & $\{2\}$\\
+$\{2\}\star$  & $\{\}$ & $\{\}$\\
+$\{0,1\}\phantom{\star}$     & $\{0,2\}$   & $\{0,1,2\}$\\
+$\{0,2\}\star$ & $\{0\}$   & $\{0,1\}$\\
+$\{1,2\}\star$ & $\{2\}$       & $\{2\}$\\
+$\{0,1,2\}\star$ & $\{0,2\}$ & $\{0,1,2\}$\\
 \end{tabular}
 \end{tabular}
 \end{center}
 
-\noindent The nodes of the DFA are given by calculating all
-subsets of the set of nodes of the NFA (seen in the nodes
-column on the right). The table shows the transition function
-for the DFA. The first row states that $\{\}$ is the
-sink node which has transitions for $a$ and $b$ to itself.
-The next three lines are calculated as follows: 
+\noindent The states of the corresponding DFA are given by generating
+all subsets of the set $\{0,1,2\}$ (seen in the states column
+in the table on the right). The other columns define the transition
+function for the DFA for inputs $a$ and $b$. The first row states that
+$\{\}$ is the sink state which has transitions for $a$ and $b$ to
+itself.  The next three lines are calculated as follows:
 
 \begin{itemize}
-\item suppose you calculate the entry for the transition for
-      $a$ and the node $\{0\}$
-\item start from the node $0$ in the NFA
-\item do as many $\epsilon$-transition as you can obtaining a
-      set of nodes, in this case $\{0,1,2\}$
-\item filter out all notes that do not allow an $a$-transition
-      from this set, this excludes $2$ which does not permit a
-      $a$-transition
-\item from the remaining set, do as many $\epsilon$-transition
-      as you can, this yields again $\{0,1,2\}$     
-\item the resulting set specifies the transition from $\{0\}$
-      when given an $a$ 
+\item Suppose you calculate the entry for the $a$-transition for state
+  $\{0\}$. Look for all states in the NFA that can be reached by such
+  a transition from this state; this is only state $0$; therefore from
+  state $\{0\}$ we can go to state $\{0\}$ via an $a$-transition.
+\item Do the same for the $b$-transition; you can reach states $0$ and
+  $1$ in the NFA; therefore in the DFA we can go from state $\{0\}$ to
+  state $\{0,1\}$ via a $b$-transition.
+\item Continue with the states $\{1\}$ and $\{2\}$.
 \end{itemize}
 
-\noindent So the transition from the state $\{0\}$ reading an
-$a$ goes to the state $\{0,1,2\}$. Similarly for the other
-entries in the rows for $\{0\}$, $\{1\}$ and $\{2\}$. The
-other rows are calculated by just taking the union of the
-single node entries. For example for $a$ and $\{0,1\}$ we need
-to take the union of $\{0,1,2\}$ (for $0$) and $\{1\}$ (for
-$1$). The starting state of the DFA can be calculated from the
-starting state of the NFA, that is $0$, and then do as many
-$\epsilon$-transitions as possible. This gives $\{0,1,2\}$
-which is the starting state of the DFA. The terminal states in
-the DFA are given by all sets that contain a $2$, which is the
-terminal state of the NFA. This completes the subset
-construction. So the corresponding DFA to the NFA from 
-above is
+\noindent
+Once you have filled in the transitions for the `simple' states $\{0\}$
+.. $\{2\}$, you only have to build the union for the compound states
+$\{0,1\}$, $\{0,2\}$ and so on. For example for $\{0,1\}$ you take the
+union of Line $\{0\}$ and Line $\{1\}$, which gives $\{0,2\}$ for $a$,
+and $\{0,1,2\}$ for $b$. And so on.
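+
+This construction is also where the polymorphic state type of our DFAs
+pays off: the states of the constructed DFA are \emph{sets} of NFA
+states. A sketch of the subset construction, using the toy \texttt{NFA}
+record from earlier and an equally simplified \texttt{DFA} record (not
+the handout's classes):
+
+{\small\begin{lstlisting}[language=Scala]
+type :=>[A, B] = PartialFunction[A, B]
+case class NFA[A, C](starts: Set[A], delta: (A, C) :=> Set[A], fins: A => Boolean)
+case class DFA[A, C](start: A, delta: (A, C) :=> A, fins: A => Boolean)
+
+def subset[A, C](nfa: NFA[A, C]): DFA[Set[A], C] = {
+  // transition on sets of NFA states: union of the successors of each state
+  val dfaDelta: (Set[A], C) :=> Set[A] =
+    { case (qs, c) =>
+        qs.flatMap(q => nfa.delta.applyOrElse((q, c), (_: (A, C)) => Set[A]())) }
+  DFA[Set[A], C](nfa.starts,                // start: the set of the NFA's starting states
+                 dfaDelta,
+                 qs => qs.exists(nfa.fins)) // accepting if it contains an accepting NFA state
+}
+\end{lstlisting}}
+
+\noindent Note that this transition function is total: a set of states
+with no successors is simply mapped to the empty set, which plays the
+role of the $\{\}$ sink state in the table above.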
 
-\begin{center}
-\begin{tikzpicture}[scale=0.7,>=stealth',very thick,
-                    every state/.style={minimum size=0pt,
-                    draw=blue!50,very thick,fill=blue!20},
-                    baseline=0mm]
-\node[state,initial,accepting]  (q012)  {$0,1,2$};
-\node[state,accepting] (q02) [right=of q012] {$0,2$};
-\node[state] (q01) [above=of q02] {$0,1$};
-\node[state,accepting] (q12) [below=of q02] {$1,2$};
-\node[state] (q0) [right=2cm of q01] {$0$};
-\node[state] (q1) [right=2.5cm of q02] {$1$};
-\node[state,accepting] (q2) [right=1.5cm of q12] {$2$};
-\node[state] (qn) [right=of q1] {$\{\}$};
+The starting state of the DFA is calculated from the starting
+states of the NFA, which in this case is just $\{0\}$. In general there
+can of course be many starting states in the NFA and you would take
+the corresponding subset as \emph{the} starting state of the DFA.
+
+The accepting states in the DFA are given by all sets that contain a
+$2$, which is the only accepting state in this NFA. In
+general, if the subset contains any accepting state from the NFA, then
+the corresponding state in the DFA is accepting as well.  This
+completes the subset construction. The corresponding DFA for the NFA
+shown above is:
 
-\path[->] (q012) edge [loop below] node  {$a$} ();
-\path[->] (q012) edge node [above]  {$b$} (q2);
-\path[->] (q12) edge [bend left] node [below,pos=0.4]  {$a$} (q1);
-\path[->] (q12) edge node [below]  {$b$} (q2);
-\path[->] (q02) edge node [above]  {$a$} (q012);
-\path[->] (q02) edge [bend left] node [above, pos=0.8]  {$b$} (q2);
-\path[->] (q01) edge node [below]  {$a$} (q012);
-\path[->] (q01) edge [bend left] node [above]  {$b$} (q2);
-\path[->] (q0) edge node [below]  {$a$} (q012);
-\path[->] (q0) edge node [right, pos=0.2]  {$b$} (q2);
-\path[->] (q1) edge [loop above] node  {$a$} ();
-\path[->] (q1) edge node [above]  {$b$} (qn);
-\path[->] (q2) edge [loop right] node  {$b$} ();
-\path[->] (q2) edge node [below]  {$a$} (qn);
-\path[->] (qn) edge [loop above] node  {$a,b$} ();
-\end{tikzpicture}
-\end{center}
-
-
+\begin{equation}
+\begin{tikzpicture}[scale=0.8,>=stealth',very thick,
+                    every state/.style={minimum size=0pt,
+                      draw=blue!50,very thick,fill=blue!20},
+                    baseline=(current bounding box.center)]
+\node[state,initial]  (q0)  {$0$};
+\node[state] (q01) [right=of q0] {$0,1$};
+\node[state,accepting] (q02) [below=of q01] {$0,2$};
+\node[state,accepting] (q012) [right=of q02] {$0,1,2$};
+\node[state] (q1) [below=0.5cm of q0] {$1$};
+\node[state,accepting] (q2) [below=1cm of q1] {$2$};
+\node[state] (qn) [below left=1cm of q2] {$\{\}$};
+\node[state,accepting] (q12) [below right=1cm of q2] {$1,2$};
 
-There are two points to note: One is that very often the
-resulting DFA contains a number of ``dead'' nodes that are
-never reachable from the starting state. For example
-there is no way to reach node $\{0,2\}$ from the starting
-state $\{0,1,2\}$. I let you find the other dead states.
-In effect the DFA in this example is not a minimal DFA. Such
-dead nodes can be safely removed without changing the language
-that is recognised by the DFA. Another point is that in some
-cases, however, the subset construction produces a DFA that
-does \emph{not} contain any dead nodes\ldots{}that means it
-calculates a minimal DFA. Which in turn means that in some
-cases the number of nodes by going from NFAs to DFAs
-exponentially increases, namely by $2^n$ (which is the number
-of subsets you can form for $n$ nodes). 
-
-Removing all the dead states in the automaton above,
-gives a much more legible automaton, namely
+\path[->] (q0) edge node [above] {$b$} (q01);
+\path[->] (q01) edge node [above] {$b$} (q012);
+\path[->] (q0) edge [loop above] node  {$a$} ();
+\path[->] (q012) edge [loop right] node  {$b$} ();
+\path[->] (q012) edge node [below] {$a$} (q02);
+\path[->] (q02) edge node [below] {$a$} (q0);
+\path[->] (q01) edge [bend left] node [left]  {$a$} (q02);
+\path[->] (q02) edge [bend left] node [right]  {$b$} (q01);
+\path[->] (q1) edge node [left] {$a,b$} (q2);
+\path[->] (q12) edge node [right] {$a, b$} (q2);
+\path[->] (q2) edge node [right] {$a, b$} (qn);
+\path[->] (qn) edge [loop left] node  {$a,b$} ();
+\end{tikzpicture}\label{subsetdfa}
+\end{equation}
 
-\begin{center}
-\begin{tikzpicture}[scale=0.7,>=stealth',very thick,
-                    every state/.style={minimum size=0pt,
-                    draw=blue!50,very thick,fill=blue!20},
-                    baseline=0mm]
-\node[state,initial,accepting]  (q012)  {$0,1,2$};
-\node[state,accepting] (q2) [right=of q012] {$2$};
-\node[state] (qn) [right=of q2] {$\{\}$};
-
-\path[->] (q012) edge [loop below] node  {$a$} ();
-\path[->] (q012) edge node [above]  {$b$} (q2);
-\path[->] (q2) edge [loop below] node  {$b$} ();
-\path[->] (q2) edge node [below]  {$a$} (qn);
-\path[->] (qn) edge [loop above] node  {$a,b$} ();
-\end{tikzpicture}
-\end{center}
-
-\noindent Now the big question is whether this DFA
-can recognise the same language as the NFA we started with.
+\noindent
+Please check that this is indeed a DFA. The big question is whether
+this DFA can recognise the same language as the NFA we started with.
+I let you ponder this question.
 
-\subsubsection*{Brzozowski's Method}
-
-As said before, we can also go into the other direction---from
-DFAs to regular expressions. Brzozowski's method calculates
-a regular expression using familiar transformations for
-solving equational systems. Consider the DFA:
-
-\begin{center}
-\begin{tikzpicture}[scale=1.5,>=stealth',very thick,auto,
-                    every state/.style={minimum size=0pt,
-                    inner sep=2pt,draw=blue!50,very thick,
-                    fill=blue!20}]
-  \node[state, initial]        (q0) at ( 0,1) {$q_0$};
-  \node[state]                    (q1) at ( 1,1) {$q_1$};
-  \node[state, accepting] (q2) at ( 2,1) {$q_2$};
-  \path[->] (q0) edge[bend left] node[above] {$a$} (q1)
-            (q1) edge[bend left] node[above] {$b$} (q0)
-            (q2) edge[bend left=50] node[below] {$b$} (q0)
-            (q1) edge node[above] {$a$} (q2)
-            (q2) edge [loop right] node {$a$} ()
-            (q0) edge [loop below] node {$b$} ();
-\end{tikzpicture}
-\end{center}
-
-\noindent for which we can set up the following equational
-system
-
-\begin{eqnarray}
-q_0 & = & \ONE + q_0\,b + q_1\,b +  q_2\,b\\
-q_1 & = & q_0\,a\\
-q_2 & = & q_1\,a + q_2\,a
-\end{eqnarray}
 
-\noindent There is an equation for each node in the DFA. Let
-us have a look how the right-hand sides of the equations are
-constructed. First have a look at the second equation: the
-left-hand side is $q_1$ and the right-hand side $q_0\,a$. The
-right-hand side is essentially all possible ways how to end up
-in node $q_1$. There is only one incoming edge from $q_0$ consuming
-an $a$.  Therefore the right hand side is this
-state followed by character---in this case $q_0\,a$. Now lets
-have a look at the third equation: there are two incoming
-edges for $q_2$. Therefore we have two terms, namely $q_1\,a$ and
-$q_2\,a$. These terms are separated by $+$. The first states
-that if in state $q_1$ consuming an $a$ will bring you to
-$q_2$, and the secont that being in $q_2$ and consuming an $a$
-will make you stay in $q_2$. The right-hand side of the
-first equation is constructed similarly: there are three 
-incoming edges, therefore there are three terms. There is
-one exception in that we also ``add'' $\ONE$ to the
-first equation, because it corresponds to the starting state
-in the DFA.
+There are also two points to note: One is that very often in the
+subset construction the resulting DFA contains a number of ``dead''
+states that are never reachable from the starting state. This is
+obvious in the example, where the states $\{1\}$, $\{2\}$, $\{1,2\}$ and
+$\{\}$ can never be reached from the starting state. But this might
+not always be as obvious as that. In effect the DFA in this example is
+not a \emph{minimal} DFA (more about this in a minute). Such dead
+states can be safely removed without changing the language that is
+recognised by the DFA. Another point is that in some cases, however,
+the subset construction produces a DFA that does \emph{not} contain
+any dead states\ldots{}this means it calculates a minimal DFA. Which
+in turn means that in some cases the number of states can increase
+exponentially by going from NFAs to DFAs, namely by $2^n$ (which is
+the number of subsets you can form for sets of $n$ states).  This blow-up
+in the number of states in the DFA is again bad news for how
+quickly you can decide whether a string is accepted by a DFA or
+not. So the caveat with DFAs is that they might make the task of
+finding the next state trivial, but might require $2^n$ times as many
+states as an NFA.\bigskip
 
-Having constructed the equational system, the question is
-how to solve it? Remarkably the rules are very similar to
-solving usual linear equational systems. For example the
-second equation does not contain the variable $q_1$ on the
-right-hand side of the equation. We can therefore eliminate 
-$q_1$ from the system by just substituting this equation
-into the other two. This gives
+\noindent
+To conclude this section, how conveniently can we
+implement the subset construction with our versions of NFAs and
+DFAs? Very conveniently. The code is just:
 
-\begin{eqnarray}
-q_0 & = & \ONE + q_0\,b + q_0\,a\,b +  q_2\,b\\
-q_2 & = & q_0\,a\,a + q_2\,a
-\end{eqnarray}
-
-\noindent where in Equation (4) we have two occurences
-of $q_0$. Like the laws about $+$ and $\cdot$, we can simplify 
-Equation (4) to obtain the following two equations:
+{\small\begin{lstlisting}[language=Scala]
+def subset[A, C](nfa: NFA[A, C]) : DFA[Set[A], C] = {
+  DFA(nfa.starts, 
+      { case (qs, c) => nfa.nexts(qs, c) }, 
+      _.exists(nfa.fins))
+}
+\end{lstlisting}}  
 
-\begin{eqnarray}
-q_0 & = & \ONE + q_0\,(b + a\,b) +  q_2\,b\\
-q_2 & = & q_0\,a\,a + q_2\,a
-\end{eqnarray}
- 
-\noindent Unfortunately we cannot make any more progress with
-substituting equations, because both (6) and (7) contain the
-variable on the left-hand side also on the right-hand side.
-Here we need to now use a law that is different from the usual
-laws about linear equations. It is called \emph{Arden's rule}.
-It states that if an equation is of the form $q = q\,r + s$
-then it can be transformed to $q = s\, r^*$. Since we can
-assume $+$ is symmetric, Equation (7) is of that form: $s$ is
-$q_0\,a\,a$ and $r$ is $a$. That means we can transform
-(7) to obtain the two new equations
-
-\begin{eqnarray}
-q_0 & = & \ONE + q_0\,(b + a\,b) +  q_2\,b\\
-q_2 & = & q_0\,a\,a\,(a^*)
-\end{eqnarray}
+\noindent
+The interesting point in this code is that the state type of the
+calculated DFA is \texttt{Set[A]}. Think carefully about why this
+works out correctly.
 
-\noindent Now again we can substitute the second equation into
-the first in order to eliminate the variable $q_2$.
-
-\begin{eqnarray}
-q_0 & = & \ONE + q_0\,(b + a\,b) +  q_0\,a\,a\,(a^*)\,b
-\end{eqnarray}
-
-\noindent Pulling $q_0$ out as a single factor gives:
-
-\begin{eqnarray}
-q_0 & = & \ONE + q_0\,(b + a\,b + a\,a\,(a^*)\,b)
-\end{eqnarray}
-
-\noindent This equation is again of the form so that we can
-apply Arden's rule ($r$ is $b + a\,b + a\,a\,(a^*)\,b$ and $s$
-is $\ONE$). This gives as solution for $q_0$ the following
-regular expression:
-
-\begin{eqnarray}
-q_0 & = & \ONE\,(b + a\,b + a\,a\,(a^*)\,b)^*
-\end{eqnarray}
+The DFA is then given by three components: the starting state, the
+transition function and the accepting-states function.  The starting
+states form a set in the given NFA, but this set is a single state in
+the DFA.  The
+transition function, given a state \texttt{qs} and an input \texttt{c},
+needs to produce the next state: this is the set of all NFA states
+that are reachable from the states in \texttt{qs}. The function
+\texttt{nexts} from the NFA class already calculates this for us. The
+accepting-states function for the DFA is true whenever at least one
+state in the subset is accepting in the NFA.\medskip
 
-\noindent Since this is a regular expression, we can simplify
-away the $\ONE$ to obtain the slightly simpler regular
-expression
-
-\begin{eqnarray}
-q_0 & = & (b + a\,b + a\,a\,(a^*)\,b)^*
-\end{eqnarray}
-
-\noindent 
-Now we can unwind this process and obtain the solutions
-for the other equations. This gives:
-
-\begin{eqnarray}
-q_0 & = & (b + a\,b + a\,a\,(a^*)\,b)^*\\
-q_1 & = & (b + a\,b + a\,a\,(a^*)\,b)^*\,a\\
-q_2 & = & (b + a\,b + a\,a\,(a^*)\,b)^*\,a\,a\,(a)^*
-\end{eqnarray}
+\noindent
+You might want to spend some quality time tinkering with this code
+and pondering about it. Then you will probably notice that it is
+actually a bit silly. The whole point of translating the NFA into a
+DFA via the subset construction is to make the decision of whether a
+string is accepted or not faster. Given the code above, the generated
+DFA will be exactly as fast, or as slow, as the NFA we started with
+(actually it will even be a tiny bit slower). The reason is that we
+just re-use the \texttt{nexts} function from the NFA. This function
+implements the non-deterministic breadth-first search.  You might be
+thinking: This is cheating! \ldots{} Well, not quite as you will see
+later, but in terms of speed we still need to work a bit in order to
+get sometimes(!) a faster DFA. Let's do this next.
 
-\noindent Finally, we only need to ``add'' up the equations
-which correspond to a terminal state. In our running example,
-this is just $q_2$. Consequently, a regular expression
-that recognises the same language as the automaton is
-
-\[
-(b + a\,b + a\,a\,(a^*)\,b)^*\,a\,a\,(a)^*
-\]
+\subsection*{DFA Minimisation}
 
-\noindent You can somewhat crosscheck your solution
-by taking a string the regular expression can match and
-and see whether it can be matched by the automaton.
-One string for example is $aaa$ and \emph{voila} this 
-string is also matched by the automaton.
-
-We should prove that Brzozowski's method really produces
-an equivalent  regular expression for the automaton. But
-for the purposes of this module, we omit this.
-
-\subsubsection*{Automata Minimization}
-
-As seen in the subset construction, the translation 
-of an NFA to a DFA can result in a rather ``inefficient'' 
-DFA. Meaning there are states that are not needed. A
-DFA can be \emph{minimised} by the following algorithm:
+As seen in \eqref{subsetdfa}, the subset construction from an NFA to a
+DFA can result in a rather ``inefficient'' DFA. Meaning there are
+states that are not needed. There are two kinds of such unneeded
+states: \emph{unreachable} states and \emph{non-distinguishable}
+states. The first kind of states can just be removed without affecting
+the language that can be recognised (after all they are
+unreachable). The second kind can be detected, and then merged, which
+means a DFA can be \emph{minimised} by the following algorithm:
 
 \begin{enumerate}
 \item Take all pairs $(q, p)$ with $q \not= p$
@@ -693,29 +1128,30 @@
 \item All unmarked pairs can be merged.
 \end{enumerate}
 
-\noindent To illustrate this algorithm, consider the following 
-DFA.
+\noindent Unfortunately, once we throw away all unreachable states in
+\eqref{subsetdfa}, all remaining states are needed.  In order to
+illustrate the minimisation algorithm, consider the following DFA.
 
 \begin{center}
 \begin{tikzpicture}[>=stealth',very thick,auto,
                     every state/.style={minimum size=0pt,
                     inner sep=2pt,draw=blue!50,very thick,
                     fill=blue!20}]
-\node[state,initial]  (q_0)  {$q_0$};
-\node[state] (q_1) [right=of q_0] {$q_1$};
-\node[state] (q_2) [below right=of q_0] {$q_2$};
-\node[state] (q_3) [right=of q_2] {$q_3$};
-\node[state, accepting] (q_4) [right=of q_1] {$q_4$};
-\path[->] (q_0) edge node [above]  {$a$} (q_1);
-\path[->] (q_1) edge node [above]  {$a$} (q_4);
-\path[->] (q_4) edge [loop right] node  {$a, b$} ();
-\path[->] (q_3) edge node [right]  {$a$} (q_4);
-\path[->] (q_2) edge node [above]  {$a$} (q_3);
-\path[->] (q_1) edge node [right]  {$b$} (q_2);
-\path[->] (q_0) edge node [above]  {$b$} (q_2);
-\path[->] (q_2) edge [loop left] node  {$b$} ();
-\path[->] (q_3) edge [bend left=95, looseness=1.3] node 
-  [below]  {$b$} (q_0);
+\node[state,initial]  (Q_0)  {$Q_0$};
+\node[state] (Q_1) [right=of Q_0] {$Q_1$};
+\node[state] (Q_2) [below right=of Q_0] {$Q_2$};
+\node[state] (Q_3) [right=of Q_2] {$Q_3$};
+\node[state, accepting] (Q_4) [right=of Q_1] {$Q_4$};
+\path[->] (Q_0) edge node [above]  {$a$} (Q_1);
+\path[->] (Q_1) edge node [above]  {$a$} (Q_4);
+\path[->] (Q_4) edge [loop right] node  {$a, b$} ();
+\path[->] (Q_3) edge node [right]  {$a$} (Q_4);
+\path[->] (Q_2) edge node [above]  {$a$} (Q_3);
+\path[->] (Q_1) edge node [right]  {$b$} (Q_2);
+\path[->] (Q_0) edge node [above]  {$b$} (Q_2);
+\path[->] (Q_2) edge [loop left] node  {$b$} ();
+\path[->] (Q_3) edge [bend left=95, looseness=1.3] node 
+  [below]  {$b$} (Q_0);
 \end{tikzpicture}
 \end{center}
 
@@ -736,15 +1172,15 @@
 \draw (3,0) -- (3, 2);
 \draw (4,0) -- (4, 1);
 
-\draw (0.5,-0.5) node {$q_0$}; 
-\draw (1.5,-0.5) node {$q_1$}; 
-\draw (2.5,-0.5) node {$q_2$}; 
-\draw (3.5,-0.5) node {$q_3$};
+\draw (0.5,-0.5) node {$Q_0$}; 
+\draw (1.5,-0.5) node {$Q_1$}; 
+\draw (2.5,-0.5) node {$Q_2$}; 
+\draw (3.5,-0.5) node {$Q_3$};
  
-\draw (-0.5, 3.5) node {$q_1$}; 
-\draw (-0.5, 2.5) node {$q_2$}; 
-\draw (-0.5, 1.5) node {$q_3$}; 
-\draw (-0.5, 0.5) node {$q_4$}; 
+\draw (-0.5, 3.5) node {$Q_1$}; 
+\draw (-0.5, 2.5) node {$Q_2$}; 
+\draw (-0.5, 1.5) node {$Q_3$}; 
+\draw (-0.5, 0.5) node {$Q_4$}; 
 
 \draw (0.5,0.5) node {\large$\star$}; 
 \draw (1.5,0.5) node {\large$\star$}; 
@@ -755,10 +1191,10 @@
 
 \noindent where the lower row is filled with stars, because in
 the corresponding pairs there is always one state that is
-accepting ($q_4$) and a state that is non-accepting (the other
+accepting ($Q_4$) and a state that is non-accepting (the other
 states).
 
-Now in Step 3 we need to fill in more stars according whether 
+In Step 3 we need to fill in more stars according to whether 
 one of the next-state pairs are marked. We have to do this 
 for every unmarked field until there is no change anymore.
 This gives the triangle
@@ -777,15 +1213,15 @@
 \draw (3,0) -- (3, 2);
 \draw (4,0) -- (4, 1);
 
-\draw (0.5,-0.5) node {$q_0$}; 
-\draw (1.5,-0.5) node {$q_1$}; 
-\draw (2.5,-0.5) node {$q_2$}; 
-\draw (3.5,-0.5) node {$q_3$};
+\draw (0.5,-0.5) node {$Q_0$}; 
+\draw (1.5,-0.5) node {$Q_1$}; 
+\draw (2.5,-0.5) node {$Q_2$}; 
+\draw (3.5,-0.5) node {$Q_3$};
  
-\draw (-0.5, 3.5) node {$q_1$}; 
-\draw (-0.5, 2.5) node {$q_2$}; 
-\draw (-0.5, 1.5) node {$q_3$}; 
-\draw (-0.5, 0.5) node {$q_4$}; 
+\draw (-0.5, 3.5) node {$Q_1$}; 
+\draw (-0.5, 2.5) node {$Q_2$}; 
+\draw (-0.5, 1.5) node {$Q_3$}; 
+\draw (-0.5, 0.5) node {$Q_4$}; 
 
 \draw (0.5,0.5) node {\large$\star$}; 
 \draw (1.5,0.5) node {\large$\star$}; 
@@ -798,27 +1234,202 @@
 \end{tikzpicture}
 \end{center}
 
-\noindent which means states $q_0$ and $q_2$, as well as $q_1$
-and $q_3$ can be merged. This gives the following minimal DFA
+\noindent which means states $Q_0$ and $Q_2$, as well as $Q_1$
+and $Q_3$ can be merged. This gives the following minimal DFA
 
 \begin{center}
 \begin{tikzpicture}[>=stealth',very thick,auto,
                     every state/.style={minimum size=0pt,
                     inner sep=2pt,draw=blue!50,very thick,
                     fill=blue!20}]
-\node[state,initial]  (q_02)  {$q_{0, 2}$};
-\node[state] (q_13) [right=of q_02] {$q_{1, 3}$};
-\node[state, accepting] (q_4) [right=of q_13] 
-  {$q_{4\phantom{,0}}$};
-\path[->] (q_02) edge [bend left] node [above]  {$a$} (q_13);
-\path[->] (q_13) edge [bend left] node [below]  {$b$} (q_02);
-\path[->] (q_02) edge [loop below] node  {$b$} ();
-\path[->] (q_13) edge node [above]  {$a$} (q_4);
-\path[->] (q_4) edge [loop above] node  {$a, b$} ();
+\node[state,initial]  (Q_02)  {$Q_{0, 2}$};
+\node[state] (Q_13) [right=of Q_02] {$Q_{1, 3}$};
+\node[state, accepting] (Q_4) [right=of Q_13] 
+  {$Q_{4\phantom{,0}}$};
+\path[->] (Q_02) edge [bend left] node [above]  {$a$} (Q_13);
+\path[->] (Q_13) edge [bend left] node [below]  {$b$} (Q_02);
+\path[->] (Q_02) edge [loop below] node  {$b$} ();
+\path[->] (Q_13) edge node [above]  {$a$} (Q_4);
+\path[->] (Q_4) edge [loop above] node  {$a, b$} ();
+\end{tikzpicture}
+\end{center}
+
+
+By the way, we are not bothering with implementing the above
+minimisation algorithm: while up to now all the transformations used
+some clever composition of functions, the minimisation algorithm
+cannot be implemented by just composing some functions. For this we
+would require a more concrete representation of the transition
+function (like maps). If we did this, however, then many advantages of
+the functions would be thrown away. So the compromise is that we are
+not able to minimise (easily) our DFAs.
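+
+\noindent
+To give at least an idea of what such a more concrete implementation
+could look like, here is a rough sketch of the marking algorithm over
+an explicitly Map-based version of the DFA from above (this is only an
+illustration with made-up names; it is not part of our function-based
+NFA/DFA implementation):
+
+{\small\begin{lstlisting}[language=Scala]
+// the DFA from above, with Q0,...,Q4 represented as 0,...,4
+val states = Set(0, 1, 2, 3, 4)
+val fins   = Set(4)
+val delta  = Map(
+  (0, 'a') -> 1, (0, 'b') -> 2,
+  (1, 'a') -> 4, (1, 'b') -> 2,
+  (2, 'a') -> 3, (2, 'b') -> 2,
+  (3, 'a') -> 4, (3, 'b') -> 0,
+  (4, 'a') -> 4, (4, 'b') -> 4)
+val alphabet = Set('a', 'b')
+
+// Step 1: all pairs (q, p) with q != p (each pair listed once)
+val pairs = for (q <- states; p <- states; if q < p) yield (q, p)
+
+// Step 2: mark pairs where exactly one state is accepting
+var marked = pairs.filter { case (q, p) => fins(q) != fins(p) }
+
+// Step 3: keep marking pairs whose successor pair is already marked
+var changed = true
+while (changed) {
+  val more = pairs.filter { case (q, p) =>
+    !marked((q, p)) &&
+    alphabet.exists { c =>
+      val (q1, p1) = (delta((q, c)), delta((p, c)))
+      marked((q1, p1)) || marked((p1, q1))
+    }
+  }
+  changed = more.nonEmpty
+  marked = marked ++ more
+}
+
+// Step 4: all unmarked pairs can be merged
+val mergeable = pairs -- marked   // Set((0,2), (1,3)), i.e. Q0/Q2 and Q1/Q3
+\end{lstlisting}}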
+
+\subsection*{Brzozowski's Method}
+
+I know this handout is already a long, long rant: but after all it is
+a topic that has been researched for more than 60 years. If you
+reflect on what you have read so far, the story is that you can take a
+regular expression, translate it via the Thompson Construction into an
+$\epsilon$NFA, then translate it into an NFA by removing all
+$\epsilon$-transitions, and then via the subset construction obtain a
+DFA. In all steps we made sure the language, or which strings can be
+recognised, stays the same. Of course we should have proved this in
+each step, but let us cut corners here.  After the last section, we
+can even minimise the DFA (maybe not in code). But again we made sure
+the same language is recognised. You might be wondering: Can we go
+in the other direction?  Can we go from a DFA and obtain a regular
+expression that can recognise the same language as the DFA?\medskip
+
+\noindent
+The answer is yes. Again there are several methods for calculating a
+regular expression for a DFA. I will show you Brzozowski's method
+because it calculates a regular expression using quite familiar
+transformations for solving equational systems. Consider the DFA:
+
+\begin{center}
+\begin{tikzpicture}[scale=1.5,>=stealth',very thick,auto,
+                    every state/.style={minimum size=0pt,
+                    inner sep=2pt,draw=blue!50,very thick,
+                    fill=blue!20}]
+  \node[state, initial]        (q0) at ( 0,1) {$Q_0$};
+  \node[state]                    (q1) at ( 1,1) {$Q_1$};
+  \node[state, accepting] (q2) at ( 2,1) {$Q_2$};
+  \path[->] (q0) edge[bend left] node[above] {$a$} (q1)
+            (q1) edge[bend left] node[above] {$b$} (q0)
+            (q2) edge[bend left=50] node[below] {$b$} (q0)
+            (q1) edge node[above] {$a$} (q2)
+            (q2) edge [loop right] node {$a$} ()
+            (q0) edge [loop below] node {$b$} ();
 \end{tikzpicture}
 \end{center}
 
-\subsubsection*{Regular Languages}
+\noindent for which we can set up the following equational
+system
+
+\begin{eqnarray}
+Q_0 & = & \ONE + Q_0\,b + Q_1\,b +  Q_2\,b\\
+Q_1 & = & Q_0\,a\\
+Q_2 & = & Q_1\,a + Q_2\,a
+\end{eqnarray}
+
+\noindent There is an equation for each node in the DFA. Let
+us have a look how the right-hand sides of the equations are
+constructed. First have a look at the second equation: the
+left-hand side is $Q_1$ and the right-hand side $Q_0\,a$. The
+right-hand side is essentially all possible ways how to end up
+in node $Q_1$. There is only one incoming edge from $Q_0$ consuming
+an $a$.  Therefore the right-hand side is this
+state followed by the character---in this case $Q_0\,a$. Now let's
+have a look at the third equation: there are two incoming
+edges for $Q_2$. Therefore we have two terms, namely $Q_1\,a$ and
+$Q_2\,a$. These terms are separated by $+$. The first states
+that being in state $Q_1$ and consuming an $a$ will bring you to
+$Q_2$, and the second that being in $Q_2$ and consuming an $a$
+will make you stay in $Q_2$. The right-hand side of the
+first equation is constructed similarly: there are three 
+incoming edges, therefore there are three terms. There is
+one exception in that we also ``add'' $\ONE$ to the
+first equation, because it corresponds to the starting state
+in the DFA.
+
+Having constructed the equational system, the question is
+how to solve it. Remarkably, the rules are very similar to those for
+solving usual linear equational systems. For example the
+second equation does not contain the variable $Q_1$ on the
+right-hand side of the equation. We can therefore eliminate 
+$Q_1$ from the system by just substituting this equation
+into the other two. This gives
+
+\begin{eqnarray}
+Q_0 & = & \ONE + Q_0\,b + Q_0\,a\,b +  Q_2\,b\\
+Q_2 & = & Q_0\,a\,a + Q_2\,a
+\end{eqnarray}
+
+\noindent where in Equation (4) we have two occurrences
+of $Q_0$. Using the usual laws about $+$ and $\cdot$, we can simplify 
+Equation (4) to obtain the following two equations:
+
+\begin{eqnarray}
+Q_0 & = & \ONE + Q_0\,(b + a\,b) +  Q_2\,b\\
+Q_2 & = & Q_0\,a\,a + Q_2\,a
+\end{eqnarray}
+ 
+\noindent Unfortunately we cannot make any more progress with
+substituting equations, because both (6) and (7) contain the
+variable on the left-hand side also on the right-hand side.
+Here we now need to use a law that is different from the usual
+laws about linear equations. It is called \emph{Arden's rule}.
+It states that if an equation is of the form $q = q\,r + s$
+then it can be transformed to $q = s\, r^*$. Since we can
+assume $+$ is symmetric, Equation (7) is of that form: $s$ is
+$Q_0\,a\,a$ and $r$ is $a$. That means we can transform
+(7) to obtain the two new equations
+
+\begin{eqnarray}
+Q_0 & = & \ONE + Q_0\,(b + a\,b) +  Q_2\,b\\
+Q_2 & = & Q_0\,a\,a\,(a^*)
+\end{eqnarray}
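+
+\noindent
+(As a quick sanity check of Arden's rule, not needed for the
+construction: if $q = s\,r^*$, then using the law $r^* = \ONE + r^*\,r$
+we obtain
+
+\[
+q \;=\; s\,r^* \;=\; s\,(\ONE + r^*\,r) \;=\; s + s\,r^*\,r \;=\; s + q\,r,
+\]
+
+\noindent
+which is exactly the right-hand side of the original equation.)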
+
+\noindent Now again we can substitute the second equation into
+the first in order to eliminate the variable $Q_2$.
+
+\begin{eqnarray}
+Q_0 & = & \ONE + Q_0\,(b + a\,b) +  Q_0\,a\,a\,(a^*)\,b
+\end{eqnarray}
+
+\noindent Pulling $Q_0$ out as a single factor gives:
+
+\begin{eqnarray}
+Q_0 & = & \ONE + Q_0\,(b + a\,b + a\,a\,(a^*)\,b)
+\end{eqnarray}
+
+\noindent This equation is again of a form where we can
+apply Arden's rule ($r$ is $b + a\,b + a\,a\,(a^*)\,b$ and $s$
+is $\ONE$). This gives as solution for $Q_0$ the following
+regular expression:
+
+\begin{eqnarray}
+Q_0 & = & \ONE\,(b + a\,b + a\,a\,(a^*)\,b)^*
+\end{eqnarray}
+
+\noindent Since this is a regular expression, we can simplify
+away the $\ONE$ to obtain the slightly simpler regular
+expression
+
+\begin{eqnarray}
+Q_0 & = & (b + a\,b + a\,a\,(a^*)\,b)^*
+\end{eqnarray}
+
+\noindent 
+Now we can unwind this process and obtain the solutions
+for the other equations. This gives:
+
+\begin{eqnarray}
+Q_0 & = & (b + a\,b + a\,a\,(a^*)\,b)^*\\
+Q_1 & = & (b + a\,b + a\,a\,(a^*)\,b)^*\,a\\
+Q_2 & = & (b + a\,b + a\,a\,(a^*)\,b)^*\,a\,a\,(a)^*
+\end{eqnarray}
+
+\noindent Finally, we only need to ``add'' up the equations
+which correspond to an accepting state. In our running example,
+this is just $Q_2$. Consequently, a regular expression
+that recognises the same language as the DFA is
+
+\[
+(b + a\,b + a\,a\,(a^*)\,b)^*\,a\,a\,(a)^*
+\]
+
+\noindent You can somewhat crosscheck your solution by taking a string
+the regular expression can match and seeing whether it can be matched
+by the DFA.  One string for example is $aaa$: the outer star is matched
+zero times and $a\,a\,(a)^*$ matches $aaa$, and \emph{voila} this
+string is also matched by the automaton.
+
+We should prove that Brzozowski's method really produces an equivalent
+regular expression. But for the purposes of this module, we omit
+this. I guess you are relieved.
+
+
+\subsection*{Regular Languages}
 
 Given the constructions in the previous sections we obtain 
 the following overall picture:
@@ -844,7 +1455,7 @@
 there exists a regular expression that can recognise the same
 language. Again we did not prove this fact. 
 
-The interesting conclusion is that automata and regular 
+The fundamental conclusion we can draw is that automata and regular
 expressions can recognise the same set of languages:
 
 \begin{quote} A language is \emph{regular} iff there exists a
@@ -857,76 +1468,109 @@
 automaton that recognises all its strings.
 \end{quote}
 
-\noindent So for deciding whether a string is recognised by a
-regular expression, we could use our algorithm based on
-derivatives or NFAs or DFAs. But let us quickly look at what
-the differences mean in computational terms. Translating a
-regular expression into a NFA gives us an automaton that has
-$O(n)$ nodes---that means the size of the NFA grows linearly
-with the size of the regular expression. The problem with NFAs
-is that the problem of deciding whether a string is accepted
-or not is computationally not cheap. Remember with NFAs we
-have potentially many next states even for the same input and
-also have the silent $\epsilon$-transitions. If we want to
-find a path from the starting state of an NFA to an accepting
-state, we need to consider all possibilities. In Ruby and
-Python this is done by a depth-first search, which in turn
-means that if a ``wrong'' choice is made, the algorithm has to
-backtrack and thus explore all potential candidates. This is
-exactly the reason why Ruby and Python are so slow for evil
-regular expressions. An alternative to the potentially slow
-depth-first search is to explore the search space in a
-breadth-first fashion, but this might incur a big memory
-penalty.  
+\noindent Note that this is not a statement about a particular language
+(that is a particular set of strings), but about a large class of
+languages, namely the regular ones.
 
-To avoid the problems with NFAs, we can translate them 
-into DFAs. With DFAs the problem of deciding whether a
-string is recognised or not is much simpler, because in
-each state it is completely determined what the next
-state will be for a given input. So no search is needed.
-The problem with this is that the translation to DFAs
-can explode exponentially the number of states. Therefore when 
-this route is taken, we definitely need to minimise the
-resulting DFAs in order to have an acceptable memory 
-and runtime behaviour. But remember the subset construction
-in the worst case explodes the number of states by $2^n$.
-Effectively also the translation to DFAs can incur a big
+As a consequence, for deciding whether a string is recognised by a
+regular expression, we could use our algorithm based on derivatives or
+NFAs or DFAs. But let us quickly look at what the differences mean in
+computational terms. Translating a regular expression into an NFA gives
+us an automaton that has $O(n)$ states---that means the size of the
+NFA grows linearly with the size of the regular expression. The
+problem with NFAs is that deciding whether a string is
+accepted or not is computationally not cheap. Remember with NFAs we
+have potentially many next states even for the same input and also
+have the silent $\epsilon$-transitions. If we want to find a path from
+the starting state of an NFA to an accepting state, we need to consider
+all possibilities. In Ruby, Python and Java this is done by a
+depth-first search, which in turn means that if a ``wrong'' choice is
+made, the algorithm has to backtrack and thus explore all potential
+candidates. This is exactly the reason why Ruby, Python and Java are
+so slow for evil regular expressions. An alternative to the
+potentially slow depth-first search is to explore the search space in
+a breadth-first fashion, but this might incur a big memory penalty.
+
+To avoid the problems with NFAs, we can translate them into DFAs. With
+DFAs the problem of deciding whether a string is recognised or not is
+much simpler, because in each state it is completely determined what
+the next state will be for a given input. So no search is needed.  The
+problem with this is that the translation to DFAs can increase the
+number of states exponentially. Therefore when this route is
+taken, we definitely need to minimise the resulting DFAs in order to
+have an acceptable memory and runtime behaviour. But remember the
+subset construction in the worst case explodes the number of states by
+$2^n$.  Effectively also the translation to DFAs can incur a big
 runtime penalty.
 
-But this does not mean that everything is bad with automata.
-Recall the problem of finding a regular expressions for the
-language that is \emph{not} recognised by a regular
-expression. In our implementation we added explicitly such a
-regular expressions because they are useful for recognising
-comments. But in principle we did not need to. The argument
-for this is as follows: take a regular expression, translate
+But this does not mean that everything is bad with automata.  Recall
+the problem of finding a regular expression for the language that is
+\emph{not} recognised by a regular expression. In our implementation
+we explicitly added such regular expressions because they are useful
+for recognising comments. But in principle we did not need to. The
+argument for this is as follows: take a regular expression, translate
 it into a NFA and then a DFA that both recognise the same
-language. Once you have the DFA it is very easy to construct
-the automaton for the language not recognised by an DFA. If
-the DFA is completed (this is important!), then you just need
-to exchange the accepting and non-accepting states. You can
-then translate this DFA back into a regular expression and
-that will be the regular expression that can match all strings
-the original regular expression could \emph{not} match.
+language. Once you have the DFA it is very easy to construct the
+automaton for the language not recognised by the DFA. If the DFA is
+completed (this is important!), then you just need to exchange the
+accepting and non-accepting states. You can then translate this DFA
+back into a regular expression and that will be the regular expression
+that can match all strings the original regular expression could
+\emph{not} match.
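+
+\noindent
+As a rough sketch of this flipping step (with made-up field names
+\texttt{start}, \texttt{delta} and \texttt{fins} for accessing the
+three components of a DFA; they are only for this illustration):
+
+{\small\begin{lstlisting}[language=Scala]
+// the complement of a *completed* DFA: same states, same transitions,
+// but the accepting-states function is negated
+def complement[A, C](dfa: DFA[A, C]) : DFA[A, C] =
+  DFA(dfa.start, dfa.delta, (q: A) => !dfa.fins(q))
+\end{lstlisting}}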
 
-It is also interesting that not all languages are regular. The
-most well-known example of a language that is not regular
-consists of all the strings of the form
+It is also interesting that not all languages are regular. The most
+well-known example of a language that is not regular consists of all
+the strings of the form
 
 \[a^n\,b^n\]
 
-\noindent meaning strings that have the same number of $a$s
-and $b$s. You can try, but you cannot find a regular
-expression for this language and also not an automaton. One
-can actually prove that there is no regular expression nor
-automaton for this language, but again that would lead us too
-far afield for what we want to do in this module.
+\noindent meaning strings that have the same number of $a$s and
+$b$s. You can try, but you cannot find a regular expression for this
+language, and also no automaton. One can actually prove that there
+is no regular expression nor automaton for this language, but again
+that would lead us too far afield for what we want to do in this
+module.
+
+
+\subsection*{Where Have Derivatives Gone?}
+
+By now you are probably fed up with this text. It is now way too long
+for one lecture, but there is still one aspect of the
+automata-regular-expression connection I would like to describe. Perhaps
+by now you are asking yourself: Where have the derivatives gone? Did we
+just forget them?  Well, they have a place in the picture of
+calculating a DFA from the regular expression.
+
+To be done
 
-\section*{Further Reading}
+\begin{center}
+\begin{tikzpicture}
+  [level distance=25mm,very thick,auto,
+   level 1/.style={sibling distance=30mm},
+   level 2/.style={sibling distance=15mm},
+   every node/.style={minimum size=30pt,
+                    inner sep=0pt,circle,draw=blue!50,very thick,
+                    fill=blue!20}]
+  \node {$r$} [grow=right]
+  child[->] {node (cn) {$d_{c_n}$}
+    child { node {$dd_{c_nc_n}$}}
+    child { node {$dd_{c_nc_1}$}}
+    %edge from parent node[left] {$c_n$}
+  }
+  child[->] {node (c1) {$d_{c_1}$}
+    child { node {$dd_{c_1c_n}$}}
+    child { node {$dd_{c_1c_1}$}}
+    %edge from parent node[left] {$c_1$}
+  };
+  %%\draw (cn) -- (c1) node {\vdots}; 
+\end{tikzpicture}  
+\end{center}
 
-Compare what a ``human expert'' would create as an automaton for the
-regular expression $a (b + c)^*$ and what the Thomson
-algorithm generates.
+%\section*{Further Reading}
+
+%Compare what a ``human expert'' would create as an automaton for the
+%regular expression $a\cdot (b + c)^*$ and what the Thomson
+%algorithm generates.
 
 %http://www.inf.ed.ac.uk/teaching/courses/ct/
 \end{document}
@@ -935,3 +1579,5 @@
 %%% mode: latex
 %%% TeX-master: t
 %%% End: 
+
+